
Showing 1–50 of 81 results for author: Garcia, N

  1. arXiv:2404.03242  [pdf, other]

    cs.CV

    Would Deep Generative Models Amplify Bias in Future Models?

    Authors: Tianwei Chen, Yusuke Hirota, Mayu Otani, Noa Garcia, Yuta Nakashima

    Abstract: We investigate the impact of deep generative models on potential social biases in upcoming computer vision models. As the internet witnesses an increasing influx of AI-generated images, concerns arise regarding inherent biases that may accompany them, potentially leading to the dissemination of harmful content. This paper explores whether a detrimental feedback loop, resulting in bias amplificatio…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted to CVPR 2024

  2. arXiv:2403.17752  [pdf, other]

    cs.CL

    Can multiple-choice questions really be useful in detecting the abilities of LLMs?

    Authors: Wangyue Li, Liangzhi Li, Tong Xiang, Xiao Liu, Wei Deng, Noa Garcia

    Abstract: Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs) due to their simplicity and efficiency. However, there are concerns about whether MCQs can truly measure LLM's capabilities, particularly in knowledge-intensive scenarios where long-form generation (LFG) answers are required. The misalignment between the task and the evaluation method demands a thoug…

    Submitted 23 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  3. arXiv:2403.16277  [pdf, ps, other]

    cs.RO

    Combined Task and Motion Planning Via Sketch Decompositions (Extended Version with Supplementary Material)

    Authors: Magí Dalmau-Moreno, Néstor García, Vicenç Gómez, Héctor Geffner

    Abstract: The challenge in combined task and motion planning (TAMP) is the effective integration of a search over a combinatorial space, usually carried out by a task planner, and a search over a continuous configuration space, carried out by a motion planner. Using motion planners for testing the feasibility of task plans and filling out the details is not effective because it makes the geometrical constra…

    Submitted 24 March, 2024; originally announced March 2024.

  4. arXiv:2403.07091  [pdf, other]

    cs.RO

    Sim-to-Real gap in RL: Use Case with TIAGo and Isaac Sim/Gym

    Authors: Jaume Albardaner, Alberto San Miguel, Néstor García, Magí Dalmau-Moreno

    Abstract: This paper explores policy-learning approaches in the context of sim-to-real transfer for robotic manipulation using a TIAGo mobile manipulator, focusing on two state-of-art simulators, Isaac Gym and Isaac Sim, both developed by Nvidia. Control architectures are discussed, with a particular emphasis on achieving collision-less movement in both simulation and the real environment. Presented results…

    Submitted 27 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted in ERF24 workshop "Towards Efficient and Portable Robot Learning for Real-World Settings". To be published in Springer Proceedings in Advanced Robotics

  5. arXiv:2403.00612  [pdf, other]

    eess.IV cs.CV

    Advancing dermatological diagnosis: Development of a hyperspectral dermatoscope for enhanced skin imaging

    Authors: Martin J. Hetz, Carina Nogueira Garcia, Sarah Haggenmüller, Titus J. Brinker

    Abstract: Clinical dermatology necessitates precision and innovation for efficient diagnosis and treatment of various skin conditions. This paper introduces the development of a cutting-edge hyperspectral dermatoscope (the Hyperscope) tailored for human skin analysis. We detail the requirements to such a device and the design considerations, from optical configurations to sensor selection, necessary to capt…

    Submitted 25 June, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: 12 pages, 11 Figures

  6. arXiv:2402.14114  [pdf, other]

    cs.CV

    Multi-organ Self-supervised Contrastive Learning for Breast Lesion Segmentation

    Authors: Hugo Figueiras, Helena Aidos, Nuno Cruz Garcia

    Abstract: Self-supervised learning has proven to be an effective way to learn representations in domains where annotated labels are scarce, such as medical imaging. A widely adopted framework for this purpose is contrastive learning and it has been applied to different scenarios. This paper seeks to advance our understanding of the contrastive learning framework by exploring a novel perspective: employing m…

    Submitted 21 February, 2024; originally announced February 2024.

  7. arXiv:2312.03027  [pdf, other]

    cs.CV

    Stable Diffusion Exposed: Gender Bias from Prompt to Image

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: Recent studies have highlighted biases in generative models, shedding light on their predisposition towards gender-based stereotypes and imbalances. This paper contributes to this growing body of research by introducing an evaluation protocol designed to automatically analyze the impact of gender indicators on Stable Diffusion images. Leveraging insights from prior work, we explore how gender indi…

    Submitted 5 December, 2023; originally announced December 2023.

  8. arXiv:2311.18662  [pdf, ps, other]

    cs.AI

    Solving the Team Orienteering Problem with Transformers

    Authors: Daniel Fuertes, Carlos R. del-Blanco, Fernando Jaureguizar, Narciso García

    Abstract: Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, or transportation. This problem is usually modeled as a Combinatorial Optimization problem named as Team Orienteering Problem. The most popular Team Orienteering Problem solvers are mainly based on either linear programming, which provides accurate solutions by employing a large comp…

    Submitted 1 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  9. arXiv:2311.18345  [pdf]

    cs.CY

    Situating the social issues of image generation models in the model life cycle: a sociotechnical approach

    Authors: Amelia Katirai, Noa Garcia, Kazuki Ide, Yuta Nakashima, Atsuo Kishimoto

    Abstract: The race to develop image generation models is intensifying, with a rapid increase in the number of text-to-image models available. This is coupled with growing public awareness of these technologies. Though other generative AI models--notably, large language models--have received recent critical attention for the social and other non-technical issues they raise, there has been relatively little c…

    Submitted 30 November, 2023; originally announced November 2023.

  10. arXiv:2307.01458  [pdf, other]

    cs.CL

    CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care

    Authors: Tong Xiang, Liangzhi Li, Wangyue Li, Mingbai Bai, Lu Wei, Bowen Wang, Noa Garcia

    Abstract: The recent advances in natural language processing (NLP), have led to a new trend of applying large language models (LLMs) to real-world scenarios. While the latest LLMs are astonishingly fluent when interacting with humans, they suffer from the misinformation problem by unintentionally generating factually false statements. This can lead to harmful consequences, especially when produced within se…

    Submitted 26 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  11. arXiv:2305.02731  [pdf]

    cs.NE

    A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification

    Authors: Mehran Pourvahab, Seyed Jalaleddin Mousavirad, Virginie Felizardo, Nuno Pombo, Henriques Zacarias, Hamzeh Mohammadigheymasi, Sebastião Pais, Seyed Nooreddin Jafari, Nuno M. Garcia

    Abstract: Electrocardiogram (ECG) signals, which capture the heart's electrical activity, are used to diagnose and monitor cardiac problems. The accurate classification of ECG signals, particularly for distinguishing among various types of arrhythmias and myocardial infarctions, is crucial for the early detection and treatment of heart-related diseases. This paper proposes a novel approach based on an impro…

    Submitted 6 October, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 44 pages, 9 figures

  12. Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis

    Authors: Yankun Wu, Yuta Nakashima, Noa Garcia

    Abstract: The duality of content and style is inherent to the nature of art. For humans, these two elements are clearly different: content refers to the objects and concepts in the piece of art, and style to the way it is expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by the style that may reflect the author's emotions, so…

    Submitted 20 April, 2023; originally announced April 2023.

  13. arXiv:2304.03693  [pdf, other]

    cs.CV

    Model-Agnostic Gender Debiased Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Image captioning models are known to perpetuate and amplify harmful societal bias in the training set. In this work, we aim to mitigate such gender bias in image captioning models. While prior work has addressed this problem by forcing models to focus on people to reduce gender misclassification, it conversely generates gender-stereotypical words at the expense of predicting the correct gender. Fr…

    Submitted 21 December, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  14. arXiv:2304.02828  [pdf, other]

    cs.CV cs.CY

    Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

    Authors: Noa Garcia, Yusuke Hirota, Yankun Wu, Yuta Nakashima

    Abstract: The increasing tendency to collect large and uncurated datasets to train vision-and-language models has raised concerns about fair representations. It is known that even small but manually annotated datasets, such as MSCOCO, are affected by societal bias. This problem, far from being solved, may be getting worse with data crawled from the Internet without much control. In addition, the lack of too…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  15. arXiv:2303.12806  [pdf]

    q-bio.QM cs.CV cs.LG eess.IV

    Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma

    Authors: Tirtha Chanda, Katja Hauser, Sarah Hobelsberger, Tabea-Clara Bucher, Carina Nogueira Garcia, Christoph Wies, Harald Kittler, Philipp Tschandl, Cristian Navarrete-Dechent, Sebastian Podlipnik, Emmanouil Chousakos, Iva Crnaric, Jovana Majstorovic, Linda Alhajwan, Tanya Foreman, Sandra Peternel, Sergei Sarap, İrem Özdemir, Raymond L. Barnhill, Mar Llamas Velasco, Gabriela Poch, Sören Korsing, Wiebke Sondermann, Frank Friedrich Gellrich, Markus V. Heppt, et al. (10 additional authors not shown)

    Abstract: Although artificial intelligence (AI) systems have been shown to improve the accuracy of initial melanoma diagnosis, the lack of transparency in how these systems identify melanoma poses severe obstacles to user acceptance. Explainable artificial intelligence (XAI) methods can help to increase transparency, but most XAI methods are unable to produce precisely located domain-specific explanations,…

    Submitted 17 March, 2023; originally announced March 2023.

  16. arXiv:2303.09227  [pdf, other]

    cs.RO

    MROS: A framework for robot self-adaptation

    Authors: Gustavo Rezende Silva, Darko Bozhinoski, Mario Garzon Oviedo, Mariano Ramírez Montero, Nadia Hammoudeh Garcia, Harshavardhan Deshpande, Andrzej Wasowski, Carlos Hernandez Corbato

    Abstract: Self-adaptation can be used in robotics to increase system robustness and reliability. This work describes the Metacontrol method for self-adaptation in robotics. Particularly, it details how the MROS (Metacontrol for ROS Systems) framework implements and packages Metacontrol, and it demonstrate how MROS can be applied in a navigation scenario where a mobile robot navigates in a factory floor. Vid…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 5 pages, 4 figures, accepted at ICSE 2023 demo track

  17. A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems

    Authors: Nikzad Chizari, Niloufar Shoeibi, María N. Moreno-García

    Abstract: Recommender Systems (RSs) are used to provide users with personalized item recommendations and help them overcome the problem of information overload. Currently, recommendation methods based on deep learning are gaining ground over traditional methods such as matrix factorization due to their ability to represent the complex relationships between users and items and to incorporate additional infor…

    Submitted 18 January, 2023; originally announced January 2023.

    ACM Class: I.2.1

    Journal ref: Chizari, N.; Shoeibi, N.; Moreno-García, M.N. A Comparative Analysis of Bias Amplification in Graph Neural Network Approaches for Recommender Systems. Electronics 2022, 11, 3301

  18. arXiv:2208.10758  [pdf, other]

    cs.CV cs.AI

    Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks

    Authors: Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: Is more data always better to train vision-and-language models? We study knowledge transferability in multi-modal tasks. The current tendency in machine learning is to assume that by joining multiple datasets from different tasks their overall performance will improve. However, we show that not all the knowledge transfers well or has a positive impact on related tasks, even when they share a commo…

    Submitted 23 August, 2022; originally announced August 2022.

  19. arXiv:2205.10233  [pdf, other]

    cs.CL

    RigoBERTa: A State-of-the-Art Language Model For Spanish

    Authors: Alejandro Vaca Serrano, Guillem Garcia Subies, Helena Montoro Zamorano, Nuria Aldama Garcia, Doaa Samy, David Betancur Sanchez, Antonio Moreno Sandoval, Marta Guerrero Nieto, Alvaro Barbero Jimenez

    Abstract: This paper presents RigoBERTa, a State-of-the-Art Language Model for Spanish. RigoBERTa is trained over a well-curated corpus formed up from different subcorpora with key features. It follows the DeBERTa architecture, which has several advantages over other architectures of similar size as BERT or RoBERTa. RigoBERTa performance is assessed over 13 NLU tasks in comparison with other available Spani…

    Submitted 3 June, 2022; v1 submitted 27 April, 2022; originally announced May 2022.

  20. Gender and Racial Bias in Visual Question Answering Datasets

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: Vision-and-language tasks have increasingly drawn more attention as a means to evaluate human-like reasoning in machine learning models. A popular task in the field is visual question answering (VQA), which aims to answer questions about images. However, VQA models have been shown to exploit language bias by learning the statistical correlations between questions and answers without looking into t…

    Submitted 3 June, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: ACM Conference on Fairness, Accountability, and Transparency (FAccT 2022)

  21. Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practises for QoE Assessment

    Authors: Pablo Pérez, Ester Gonzalez-Sosa, Jesús Gutiérrez, Narciso García

    Abstract: Several technological and scientific advances have been achieved recently in the fields of immersive systems, which are offering new possibilities to applications and services in different communication domains, such as entertainment, virtual conferencing, working meetings, social relations, healthcare, and industry. Users of these immersive technologies can explore and experience the stimuli in a…

    Submitted 1 September, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Frontiers in Signal Processing

    Journal ref: Front. Signal Process. (2022)

  22. arXiv:2203.15395  [pdf, other]

    cs.CV cs.MM

    Quantifying Societal Bias Amplification in Image Captioning

    Authors: Yusuke Hirota, Yuta Nakashima, Noa Garcia

    Abstract: We study societal bias amplification in image captioning. Image captioning models have been shown to perpetuate gender and racial biases, however, metrics to measure, quantify, and evaluate the societal bias in captions are not yet standardized. We provide a comprehensive study on the strengths and limitations of each metric, and propose LIC, a metric to study captioning bias amplification. We arg…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  23. arXiv:2202.01747  [pdf, other]

    cs.CV

    The Met Dataset: Instance-level Recognition for Artworks

    Authors: Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne Van Noord, Giorgos Tolias

    Abstract: This work introduces a dataset for large-scale instance-level recognition in the domain of artworks. The proposed benchmark exhibits a number of different challenges such as large inter-class similarity, long tail distribution, and many classes. We rely on the open access collection of The Met museum to form a large training set of about 224k classes, where each class corresponds to a museum exhib…

    Submitted 3 February, 2022; originally announced February 2022.

  24. Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties

    Authors: Mohamed S. Kraiem, Fernando Sánchez-Hernández, María N. Moreno-García

    Abstract: In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Thus, the prediction model is unreliable although the ov…

    Submitted 15 December, 2021; originally announced January 2022.

    Comments: Kraiem, M.S., Sánchez-Hernández, F., Moreno-García, M.N. Selecting the Suitable Resampling Strategy for Imbalanced Data Classification Regarding Dataset Properties. An Approach Based on Association Models. Appl. Sci. 2021, 11(18), 8546, 2021

    ACM Class: I.2.1

    Journal ref: Appl. Sci. 2021, 11(18), 8546, 2021

  25. arXiv:2110.13395  [pdf, other]

    cs.CV cs.AI

    Transferring Domain-Agnostic Knowledge in Video Question Answering

    Authors: Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura

    Abstract: Video question answering (VideoQA) is designed to answer a given question based on a relevant video clip. The current available large-scale datasets have made it possible to formulate VideoQA as the joint understanding of visual and language information. However, this training procedure is costly and still less competent with human performance. In this paper, we investigate a transfer learning met…

    Submitted 25 October, 2021; originally announced October 2021.

  26. arXiv:2110.07430  [pdf, other]

    cs.LG stat.CO stat.ME

    Detecting Renewal States in Chains of Variable Length via Intrinsic Bayes Factors

    Authors: Victor Freguglia, Nancy Garcia

    Abstract: Markov chains with variable length are useful parsimonious stochastic models able to generate most stationary sequence of discrete symbols. The idea is to identify the suffixes of the past, called contexts, that are relevant to predict the future symbol. Sometimes a single state is a context, and looking at the past and finding this specific state makes the further past irrelevant. States with suc…

    Submitted 6 January, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 25 pages, 3 figures

  27. arXiv:2109.11231  [pdf]

    cs.IR

    Dynamic inference of user context through social tag embedding for music recommendation

    Authors: Diego Sánchez-Moreno, Álvaro Lozano Murciego, Vivian F. López Batista, María Dolores Muñoz Vicente, María N. Moreno-García

    Abstract: Music listening preferences at a given time depend on a wide range of contextual factors, such as user emotional state, location and activity at listening time, the day of the week, the time of the day, etc. It is therefore of great importance to take them into account when recommending music. However, it is very difficult to develop context-aware recommender systems that consider these factors, b…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: 15th ACM Conference on Recommender Systems-Workshop on Context-Aware Recommender Systems (RECSYS 2021-CARS)

  28. arXiv:2109.05743  [pdf, other]

    cs.CV cs.AI cs.CL

    Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

    Authors: Zechen Bai, Yuta Nakashima, Noa Garcia

    Abstract: Have you ever looked at a painting and wondered what is the story behind it? This work presents a framework to bring art closer to people by generating comprehensive descriptions of fine-art paintings. Generating informative descriptions for artworks, however, is extremely challenging, as it requires to 1) describe multiple aspects of the image such as its style, content, or composition, and 2) pr…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: ICCV 2021

  29. arXiv:2108.06432  [pdf, other]

    cs.CV

    Soccer line mark segmentation and classification with stochastic watershed transform

    Authors: Daniel Berjón, Carlos Cuevas, Narciso García

    Abstract: Augmented reality applications are beginning to change the way sports are broadcast, providing richer experiences and valuable insights to fans. The first step of augmented reality systems is camera calibration, possibly based on detecting the line markings of the playing field. Most existing proposals for line detection rely on edge detection and Hough transform, but radial distortion and extrane…

    Submitted 3 August, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

    Comments: 18 pages, 11 figures

    ACM Class: I.4.6

  30. arXiv:2106.13445  [pdf, other]

    cs.CV

    A Picture May Be Worth a Hundred Words for Visual Question Answering

    Authors: Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

    Abstract: How far can we go with textual representations for understanding pictures? In image understanding, it is essential to use concise but detailed image representations. Deep visual features extracted by vision models, such as Faster R-CNN, are prevailing used in multiple tasks, and especially in visual question answering (VQA). However, conventional deep visual features may struggle to convey all the…

    Submitted 25 June, 2021; originally announced June 2021.

  31. arXiv:2105.11852  [pdf]

    cs.LG cs.CV

    GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph

    Authors: Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara

    Abstract: The rise of digitization of cultural documents offers large-scale contents, opening the road for development of AI systems in order to preserve, search, and deliver cultural heritage. To organize such cultural content also means to classify them, a task that is very familiar to modern computer science. Contextual information is often the key to structure such real world data, and we propose to use…

    Submitted 25 May, 2021; originally announced May 2021.

  32. Subjective Assessment Experiments That Recruit Few Observers With Repetitions (FOWR)

    Authors: Pablo Perez, Lucjan Janowski, Narciso Garcia, Margaret Pinson

    Abstract: Recent studies have shown that it is possible to characterize subject bias and variance in subjective assessment tests. Apparent differences among subjects can, for the most part, be explained by random factors. Building on that theory, we propose a subjective test design where three to four team members each rate the stimuli multiple times. The results are comparable to a high performing objectiv…

    Submitted 20 July, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: IEEE Transactions on Multimedia

  33. arXiv:2104.01406  [pdf]

    cs.NI

    The Internet Protocol -- Past, some current limitations and a glimpse of a possible future

    Authors: Nuno M. Garcia

    Abstract: The network layer is central to the networking scientific area. It is around the network layer that all the data communications develop, and one of its main tasks is to allow the identification of each single interface/machine between the potentially many interfaces in a network. This seminar addresses some of the issues that are usually presented to young Computer Science Engineering students in…

    Submitted 3 April, 2021; originally announced April 2021.

  34. Methodology to Assess Quality, Presence, Empathy, Attitude, and Attention in 360-degree Videos for Immersive Communications

    Authors: Marta Orduna, Pablo Pérez, Jesús Gutiérrez, Narciso García

    Abstract: This paper analyzes the joint assessment of quality, spatial and social presence, empathy, attitude, and attention in three conditions: (A)visualizing and rating the quality of contents in a Head-Mounted Display (HMD), (B)visualizing the contents in an HMD,and (C)visualizing the contents in an HMD where participants can see their hands and take notes. The experiment simulates an immersive communic…

    Submitted 9 February, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: IEEE Transactions on Affective Computing, Early Access

  35. arXiv:2101.05479  [pdf, other]

    cs.CV cs.LG

    Understanding the Role of Scene Graphs in Visual Question Answering

    Authors: Vinay Damodaran, Sharanya Chakravarthy, Akshay Kumar, Anjana Umapathy, Teruko Mitamura, Yuta Nakashima, Noa Garcia, Chenhui Chu

    Abstract: Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search. In this work, we explore the use of scene graphs for solving the VQA task. We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning ca…

    Submitted 16 January, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  36. arXiv:2010.09145  [pdf, other]

    cs.RO cs.SE

    MROS: Runtime Adaptation For Robot Control Architectures

    Authors: Darko Bozhinoski, Carlos Hernandez Corbato, Mario Garzon Oviedo, Gijs van der Hoorn, Nadia Hammoudeh Garcia, Harshavardhan Deshpande, Jon Tjerngren, Andrzej Wasowski

    Abstract: Known attempts to build autonomous robots rely on complex control architectures, often implemented with the Robot Operating System platform (ROS). Runtime adaptation is needed in these systems, to cope with component failures and with contingencies arising from dynamic environments-otherwise, these affect the reliability and quality of the mission execution. Existing proposals on how to build self…

    Submitted 23 November, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

  37. arXiv:2009.14545  [pdf, other]

    cs.CV cs.SI

    Demographic Influences on Contemporary Art with Unsupervised Style Embeddings

    Authors: Nikolai Huckle, Noa Garcia, Yuta Nakashima

    Abstract: Computational art analysis has, through its reliance on classification tasks, prioritised historical datasets in which the artworks are already well sorted with the necessary annotations. Art produced today, on the other hand, is numerous and easily accessible, through the internet and social networks that are used by professional and amateur artists alike to display their work. Although this art,…

    Submitted 1 December, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: To be published in Proceedings of the European Conference in Computer Vision Workshops 2020

  38. arXiv:2008.12520  [pdf, other]

    cs.CV cs.CL

    A Dataset and Baselines for Visual Question Answering on Art

    Authors: Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura

    Abstract: Answering questions related to art pieces (paintings) is a difficult task, as it implies the understanding of not only the visual information that is shown in the picture, but also the contextual knowledge that is acquired through the study of the history of art. In this work, we introduce our first attempt towards building a new dataset, coined AQUA (Art QUestion Answering). The question-answer (…

    Submitted 28 August, 2020; originally announced August 2020.

  39. Time-Aware Music Recommender Systems: Modeling the Evolution of Implicit User Preferences and User Listening Habits in A Collaborative Filtering Approach

    Authors: Diego Sánchez-Moreno, Yong Zheng, María N. Moreno-García

    Abstract: Online streaming services have become the most popular way of listening to music. The majority of these services are endowed with recommendation mechanisms that help users to discover songs and artists that may interest them from the vast amount of music available. However, many are not reliable as they may not take into account contextual aspects or the ever-evolving user behavior. Therefore, it…

    Submitted 26 August, 2020; originally announced August 2020.

    Journal ref: Applied Sciences, 10(15), 5324, 33 pages, 2020

  40. arXiv:2007.08751  [pdf, other]

    cs.CV cs.CL

    Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions

    Authors: Noa Garcia, Yuta Nakashima

    Abstract: To understand movies, humans constantly reason over the dialogues and actions shown in specific scenes and relate them to the overall storyline already seen. Inspired by this behaviour, we design ROLL, a model for knowledge-based video story question answering that leverages three crucial aspects of movie understanding: dialog comprehension, scene reasoning, and storyline recalling. In ROLL, each…

    Submitted 17 July, 2020; originally announced July 2020.

  41. arXiv:2007.04456  [pdf]

    eess.SP cs.AI cs.CY cs.LG

    An Efficient Data Imputation Technique for Human Activity Recognition

    Authors: Ivan Miguel Pires, Faisal Hussain, Nuno M. Garcia, Eftim Zdravevski

    Abstract: The tremendous applications of human activity recognition are surging its span from health monitoring systems to virtual reality applications. Thus, the automatic recognition of daily life activities has become significant for numerous applications. In recent years, many datasets have been proposed to train the machine learning models for efficient monitoring and recognition of human daily living…

    Submitted 8 July, 2020; originally announced July 2020.

    Comments: 8 Pages, 8 Figures, 1 Table. Accepted in 14th Multi Conference on Computer Science and Information Systems 2020 (MCCSIS 2020)

  42. FVV Live: A real-time free-viewpoint video system with consumer electronics hardware

    Authors: Pablo Carballeira, Carlos Carmona, César Díaz, Daniel Berjón, Daniel Corregidor, Julián Cabrera, Francisco Morán, Carmen Doblado, Sergio Arnaldo, María del Mar Martín, Narciso García

    Abstract: FVV Live is a novel end-to-end free-viewpoint video system, designed for low cost and real-time operation, based on off-the-shelf components. The system has been designed to yield high-quality free-viewpoint video using consumer-grade cameras and hardware, which enables low deployment costs and easy installation for immersive event-broadcasting or videoconferencing. The paper describes the archi…

    Submitted 1 July, 2020; originally announced July 2020.

  43. arXiv:2007.00127  [pdf, other]

    cs.NI

    Identifying Packet Loss and Reordering Packets in Keyed UDP Transmissions

    Authors: Fábio Machado Gil, Nuno M. Garcia, Bárbara Matos, Nuno Pombo, Rossitza Goleva, Ciprian Dobre

    Abstract: The User Datagram Protocol (UDP) and other similar protocols send the application data from the source machine to the destination machine inside segments, without providing or allowing for any type of control on the transmission or success metrics. These protocols are very convenient for, e.g., real-time data transmission. But when the reliability of the transmitted data is critical, other protoco…

    Submitted 30 June, 2020; originally announced July 2020.

    Comments: 5 pages, 4 figures

  44. FVV Live: Real-Time, Low-Cost, Free Viewpoint Video

    Authors: Daniel Berjón, Pablo Carballeira, Julián Cabrera, Carlos Carmona, Daniel Corregidor, César Díaz, Francisco Morán, Narciso García

    Abstract: FVV Live is a novel real-time, low-latency, end-to-end free viewpoint system including capture, transmission, synthesis on an edge server and visualization and control on a mobile terminal. The system has been specially designed for low-cost and real-time operation, only using off-the-shelf components.

    Submitted 30 June, 2020; originally announced June 2020.

  45. arXiv:2006.03541  [pdf]

    cs.CL cs.IR cs.LG

    Sentiment Analysis Based on Deep Learning: A Comparative Study

    Authors: Nhan Cach Dang, María N. Moreno-García, Fernando De la Prieta

    Abstract: The study of public opinion can provide us with valuable information. The analysis of sentiment on social networks, such as Twitter or Facebook, has become a powerful means of learning about the users' opinions and has a wide range of applications. However, the efficiency and accuracy of sentiment analysis are being hindered by the challenges encountered in natural language processing (NLP). In rec…

    Submitted 5 June, 2020; originally announced June 2020.

    Journal ref: Electronics, 9(3), 483, 29 pages, 2020

  46. arXiv:2005.03582  [pdf]

    cs.LG stat.ML

    Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach

    Authors: Fernando Sánchez-Hernández, Juan Carlos Ballesteros-Herráez, Mohamed S. Kraiem, Mercedes Sánchez-Barba, María N. Moreno-García

    Abstract: Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. Th…

    Submitted 7 May, 2020; originally announced May 2020.

    Journal ref: Applied Sciences, 9(24), 5287, 2019

  47. arXiv:2004.13007  [pdf]

    cs.IR cs.HC cs.LG cs.SD eess.AS

    A session-based song recommendation approach involving user characterization along the play power-law distribution

    Authors: Diego Sánchez-Moreno, Vivian F. López Batista, M. Dolores Muñoz Vicente, Ana B. Gil González, María N. Moreno-García

    Abstract: In recent years, streaming music platforms have become very popular mainly due to the huge number of songs these systems make available to users. This enormous availability means that recommendation mechanisms that help users to select the music they like need to be incorporated. However, developing reliable recommender systems in the music field involves dealing with many problems, some of which…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: Accepted in Complexity (ISSN: 1099-0526)

  48. arXiv:2004.08385  [pdf, other]

    cs.CV cs.CL

    Knowledge-Based Visual Question Answering in Videos

    Authors: Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima

    Abstract: We propose a novel video understanding task by fusing knowledge-based and video question answering. First, we introduce KnowIT VQA, a video dataset with 24,282 human-generated question-answer pairs about a popular sitcom. The dataset combines visual, textual and temporal coherence reasoning together with knowledge-based questions, which need the experience obtained from the viewing of the serie…

    Submitted 16 April, 2020; originally announced April 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.10706

  49. arXiv:2003.08748  [pdf]

    eess.IV cs.CV cs.LG

    Reduction of Surgical Risk Through the Evaluation of Medical Imaging Diagnostics

    Authors: Marco A. V. M. Grinet, Nuno M. Garcia, Ana I. R. Gouveia, Jose A. F. Moutinho, Abel J. P. Gomes

    Abstract: Computer-aided diagnosis (CAD) of Breast Cancer (BRCA) images has been an active area of research in recent years. The main goal of this research is to develop reliable automatic methods for detecting and diagnosing different types of BRCA from diagnostic images. In this paper, we present a review of the state-of-the-art CAD methods applied to magnetic resonance (MRI) and mammography images of BR…

    Submitted 8 March, 2020; originally announced March 2020.

    Comments: 25 pages, 7 figures, Scientific grant report

    ACM Class: I.4.6

  50. arXiv:1912.10982  [pdf, other]

    cs.CV

    DMCL: Distillation Multiple Choice Learning for Multimodal Action Recognition

    Authors: Nuno C. Garcia, Sarah Adel Bargal, Vitaly Ablavsky, Pietro Morerio, Vittorio Murino, Stan Sclaroff

    Abstract: In this work, we address the problem of learning an ensemble of specialist networks using multimodal data, while considering the realistic and challenging scenario of possible missing modalities at test time. Our goal is to leverage the complementary information of multiple modalities to the benefit of the ensemble and each individual network. We introduce a novel Distillation Multiple Choice Lear…

    Submitted 23 December, 2019; originally announced December 2019.