Path-based Algebraic Foundations of Graph Query Languages

Authors: Renzo Angles, Angela Bonifati, Roberto García, Domagoj Vrgoč

Abstract: Graph databases are gaining momentum thanks to the flexibility and expressiveness of their data model and query languages. A standardization activity driven by the ISO/IEC standardization body is also ongoing and has already led to the specification of the first versions of two standard graph query languages, namely SQL/PGQ and GQL, in 2023 and 2024 respectively. Apart from the standards, there exists a panoply of concrete graph query languages in commercial and open-source graph databases, each of which exhibits different features and modes. In this paper, we tackle the heterogeneity problem of graph query languages by laying the foundations of a unifying path-oriented algebraic framework. Such a theoretical framework is currently missing in the graph databases landscape, thus impeding a lingua franca in which different graph query language implementations can be expressed and cross-compared. Our framework gives a blueprint for correct implementation of graph queries of different expressiveness. It makes it possible to overcome the boundaries of current versions of standard query languages, thus paving the way to future extensions including query composability. It also makes it possible, when the path-based semantics is stripped off, to express classical Codd's relational algebra enhanced with a recursive operator, thus proving its utility for a wide range of queries in database management systems.

Submitted 5 July, 2024; originally announced July 2024.

Comments: Under review

arXiv:2405.08200 [pdf, other]

Interactive Lab Notebooks for Robotics Researchers

Authors: Rolando Garcia

Abstract: Interactive notebooks, such as Jupyter, have revolutionized the field of data science by providing an integrated environment for data, code, and documentation. However, their adoption by robotics researchers and model developers has been limited. This study investigates the logging and record-keeping practices of robotics researchers, drawing parallels to the pre-interactive notebook era of data science. Through interviews with robotics researchers, we identified the reliance on diverse and often incompatible tools for managing experimental data, leading to challenges in reproducibility and data traceability. Our findings reveal that robotics researchers can benefit from a specialized version of interactive notebooks that supports comprehensive data entry, continuous context capture, and agile data staging. We propose extending interactive notebooks to better serve the needs of robotics researchers by integrating features akin to traditional lab notebooks. This adaptation aims to enhance the organization, analysis, and reproducibility of experimental data in robotics, fostering a more streamlined and efficient research workflow.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2404.18007 [pdf, ps, other]

A Formal Model to Prove Instantiation Termination for E-matching-Based Axiomatisations (Extended Version)

Authors: Rui Ge, Ronald Garcia, Alexander J. Summers

Abstract: SMT-based program analysis and verification often involve reasoning about program features that have been specified using quantifiers; incorporating quantifiers into SMT-based reasoning is, however, known to be challenging. If quantifier instantiation is not carefully controlled, then runtime and outcomes can be brittle and hard to predict. In particular, uncontrolled quantifier instantiation can lead to unexpected incompleteness and even non-termination. E-matching is the most widely-used approach for controlling quantifier instantiation, but when axiomatisations are complex, even experts cannot tell if their use of E-matching guarantees completeness or termination. This paper presents a new formal model that facilitates the proof, once and for all, that giving a complex E-matching-based axiomatisation to an SMT solver, such as Z3 or cvc5, will not cause non-termination. Key to our technique is an operational semantics for solver behaviour that models how the E-matching rules common to most solvers are used to determine when quantifier instantiations are enabled, but abstracts over irrelevant details of individual solvers. We demonstrate the effectiveness of our technique by presenting a termination proof for a set theory axiomatisation adapted from those used in the Dafny and Viper verifiers.

Submitted 27 April, 2024; originally announced April 2024.

Comments: extended version of IJCAR 2024 publication

arXiv:2404.14997 [pdf, other]

Mining higher-order triadic interactions

Authors: Anthony Baptista, Marta Niedostatek, Jun Yamamoto, Ben MacArthur, Jurgen Kurths, Ruben Sanchez Garcia, Ginestra Bianconi

Abstract: Complex systems often present higher-order interactions which require us to go beyond their description in terms of pairwise networks. Triadic interactions are a fundamental type of higher-order interaction that occurs when one node regulates the interaction between two other nodes. Triadic interactions are a fundamental type of higher-order networks, found in a large variety of biological systems, from neuron-glia interactions to gene-regulation and ecosystems. However, triadic interactions have been so far mostly neglected. In this article, we propose a theoretical principle to model and mine triadic interactions from node metadata, and we apply this framework to gene expression data finding new candidates for triadic interactions relevant for Acute Myeloid Leukemia. Our work reveals important aspects of higher-order triadic interactions often ignored, which can transform our understanding of complex systems and be applied to a large variety of systems ranging from biology to the climate.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.01491 [pdf, other]

SUGAR: Pre-training 3D Visual Representations for Robotics

Authors: Shizhe Chen, Ricardo Garcia, Ivan Laptev, Cordelia Schmid

Abstract: Learning generalizable visual representations from Internet data has yielded promising results for robotics. Yet, prevailing approaches focus on pre-training 2D representations, being sub-optimal to deal with occlusions and accurately localize objects in complex 3D scenes. Meanwhile, 3D representation learning has been limited to single-object understanding. To address these limitations, we introduce a novel 3D pre-training framework for robotics named SUGAR that captures semantic, geometric and affordance properties of objects through 3D point clouds. We underscore the importance of cluttered scenes in 3D representation learning, and automatically construct a multi-object dataset benefiting from cost-free supervision in simulation. SUGAR employs a versatile transformer-based model to jointly address five pre-training tasks, namely cross-modal knowledge distillation for semantic learning, masked point modeling to understand geometry structures, grasping pose synthesis for object affordance, 3D instance segmentation and referring expression grounding to analyze cluttered scenes. We evaluate our learned representation on three robotic-related tasks, namely, zero-shot 3D object recognition, referring expression grounding, and language-driven robotic manipulation. Experimental results show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Accepted to CVPR 2024. Project webpage: https://cshizhe.github.io/projects/robot_sugar.html

arXiv:2403.16795 [pdf, other]

doi 10.1145/3653697

"We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning

Authors: Shreya Shankar, Rolando Garcia, Joseph M Hellerstein, Aditya G Parameswaran

Abstract: Organizations rely on machine learning engineers (MLEs) to deploy models and maintain ML pipelines in production. Due to models' extensive reliance on fresh data, the operationalization of machine learning, or MLOps, requires MLEs to have proficiency in data science and engineering. When considered holistically, the job seems staggering -- how do MLEs do MLOps, and what are their unaddressed challenges? To address these questions, we conducted semi-structured ethnographic interviews with 18 MLEs working on various applications, including chatbots, autonomous vehicles, and finance. We find that MLEs engage in a workflow of (i) data preparation, (ii) experimentation, (iii) evaluation throughout a multi-staged deployment, and (iv) continual monitoring and response. Throughout this workflow, MLEs collaborate extensively with data scientists, product stakeholders, and one another, supplementing routine verbal exchanges with communication tools ranging from Slack to organization-wide ticketing and reporting systems. We introduce the 3Vs of MLOps: velocity, visibility, and versioning -- three virtues of successful ML deployments that MLEs learn to balance and grow as they mature. Finally, we discuss design implications and opportunities for future work.

Submitted 25 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:2209.09125

Journal ref: Proc. ACM Hum.-Comput. Interact. 8, CSCW1, Article 206 (April 2024)

arXiv:2403.01068 [pdf, other]

Automated Continuous Force-Torque Sensor Bias Estimation

Authors: Philippe Nadeau, Miguel Rogel Garcia, Emmett Wise, Jonathan Kelly

Abstract: Six axis force-torque sensors are commonly attached to the wrist of serial robots to measure the external forces and torques acting on the robot's end-effector. These measurements are used for load identification, contact detection, and human-robot interaction amongst other applications. Typically, the measurements obtained from the force-torque sensor are more accurate than estimates computed from joint torque readings, as the former is independent of the robot's dynamic and kinematic models. However, the force-torque sensor measurements are affected by a bias that drifts over time, caused by the compounding effects of temperature changes, mechanical stresses, and other factors. In this work, we present a pipeline that continuously estimates the bias and the drift of the bias of a force-torque sensor attached to the wrist of a robot. The first component of the pipeline is a Kalman filter that estimates the kinematic state (position, velocity, and acceleration) of the robot's joints. The second component is a kinematic model that maps the joint-space kinematics to the task-space kinematics of the force-torque sensor. Finally, the third component is a Kalman filter that estimates the bias and the drift of the bias of the force-torque sensor assuming that the inertial parameters of the gripper attached to the distal end of the force-torque sensor are known with certainty.

Submitted 1 March, 2024; originally announced March 2024.

Comments: Technical Report STARS-2024-001, University of Toronto Institute for Aerospace Studies (7 pages, 0 figure)

Report number: STARS-2024-001

arXiv:2401.10897 [pdf]

Transformations in the Time of The Transformer

Authors: Peyman Faratin, Ray Garcia, Jacomo Corbo

Abstract: Foundation models offer a new opportunity to redesign existing systems and workflows with a new AI first perspective. However, operationalizing this opportunity faces several challenges and tradeoffs. The goal of this article is to offer an organizational framework for making rational choices as enterprises start their transformation journey towards an AI first organization. The choices provided are holistic, intentional and informed while avoiding distractions. The field may appear to be moving fast, but there are core fundamental factors that are relatively more slow moving. We focus on these invariant factors to build the logic of the argument.

Submitted 25 January, 2024; v1 submitted 18 December, 2023; originally announced January 2024.

Comments: font issues and file title fixed

arXiv:2312.02401 [pdf, other]

Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation

Authors: Alex J. Chan, José Luis Redondo García, Fabrizio Silvestri, Colm O'Donnel, Konstantina Palla

Abstract: Content moderation at scale faces the challenge of considering local cultural distinctions when assessing content. While global policies aim to maintain decision-making consistency and prevent arbitrary rule enforcement, they often overlook regional variations in interpreting natural language as expressed in content. In this study, we are looking into how moderation systems can tackle this issue by adapting to local comprehension nuances. We train large language models on extensive datasets of media news and articles to create culturally attuned models. The latter aim to capture the nuances of communication across geographies with the goal of recognizing cultural and societal variations in what is considered offensive content. We further explore the capability of these models to generate explanations for instances of content violation, aiming to shed light on how policy guidelines are perceived when cultural and societal contexts change. We find that training on extensive media datasets successfully induced cultural awareness and resulted in improvements in handling content violations on a regional basis. Additionally, these advancements include the ability to provide explanations that align with the specific local norms and nuances as evidenced by the annotators' preference in our conducted study. This multifaceted success reinforces the critical role of an adaptable content moderation approach in keeping pace with the ever-evolving nature of the content it oversees.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 12 pages, 8 Figures. Supplementary material

arXiv:2310.18845 [pdf, other]

doi 10.1145/3626252.3630780

Application of Collaborative Learning Paradigms within Software Engineering Education: A Systematic Mapping Study

Authors: Rita Garcia, Christoph Treude, Andrew Valentine

Abstract: Collaboration is used in Software Engineering (SE) to develop software. Industry seeks SE graduates with collaboration skills to contribute to productive software development. SE educators can use Collaborative Learning (CL) to help students develop collaboration skills. This paper uses a Systematic Mapping Study (SMS) to examine the application of the CL educational theory in SE Education. The SMS identified 14 papers published between 2011 and 2022. We used qualitative analysis to classify the papers into four CL paradigms: Conditions, Effect, Interactions, and Computer-Supported Collaborative Learning (CSCL). We found a high interest in CSCL, with a shift in student interaction research to computer-mediated technologies. We discussed the 14 papers in depth, describing their goals and further analysing the CSCL research. Almost half the papers did not achieve the appropriate level of supporting evidence; however, calibrating the instruments presented could strengthen findings and support multiple CL paradigms, especially opportunities to learn at the social and community levels, where research was lacking. Though our results demonstrate limited CL educational theory applied in SE Education, we discuss future work to layer the theory on existing study designs for more effective teaching strategies.

Submitted 28 October, 2023; originally announced October 2023.

Comments: 7 pages

arXiv:2310.12133 [pdf, ps, other]

A comprehensible analysis of the efficacy of Ensemble Models for Bug Prediction

Authors: Ingrid Marçal, Rogério Eduardo Garcia

Abstract: The correctness of software systems is vital for their effective operation. It makes discovering and fixing software bugs an important development task. The increasing use of Artificial Intelligence (AI) techniques in Software Engineering led to the development of a number of techniques that can assist software developers in identifying potential bugs in code. In this paper, we present a comprehensible comparison and analysis of the efficacy of two AI-based approaches, namely single AI models and ensemble AI models, for predicting the probability of a Java class being buggy. We used two open-source Apache Commons Project's Java components for training and evaluating the models. Our experimental findings indicate that the ensemble of AI models can outperform the results of applying individual AI models. We also offer insight into the factors that contribute to the enhanced performance of the ensemble AI model. The presented results demonstrate the potential of using ensemble AI models to enhance bug prediction results, which could ultimately result in more reliable software systems.

Submitted 18 October, 2023; originally announced October 2023.

arXiv:2310.07898 [pdf, other]

FlorDB: Multiversion Hindsight Logging for Continuous Training

Authors: Rolando Garcia, Anusha Dandamudi, Gabriel Matute, Lehan Wan, Joseph Gonzalez, Joseph M. Hellerstein, Koushik Sen

Abstract: Production Machine Learning involves continuous training: hosting multiple versions of models over time, often with many model versions running at once. When model performance does not meet expectations, Machine Learning Engineers (MLEs) debug issues by exploring and analyzing numerous prior versions of code and training data to identify root causes and mitigate problems. Traditional debugging and logging tools often fall short in managing this experimental, multi-version context. FlorDB introduces Multiversion Hindsight Logging, which allows engineers to use the most recent version's logging statements to query past versions, even when older versions logged different data. Log statement propagation enables consistent injection of logging statements into past code versions, regardless of changes to the codebase. Once log statements are propagated across code versions, the remaining challenge in Multiversion Hindsight Logging is to efficiently replay the new log statements based on checkpoints from previous runs. Finally, a coherent user experience is required to help MLEs debug across all versions of code and data. To this end, FlorDB presents a unified relational model for efficient handling of historical queries, offering a comprehensive view of the log history to simplify the exploration of past code iterations. We present a performance evaluation on diverse benchmarks confirming its scalability and the ability to deliver real-time query responses, leveraging query-based filtering and checkpoint-based parallelism for efficient replay.

Submitted 2 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

arXiv:2309.15596 [pdf, other]

PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation

Authors: Shizhe Chen, Ricardo Garcia, Cordelia Schmid, Ivan Laptev

Abstract: The ability for robots to comprehend and execute manipulation tasks based on natural language instructions is a long-term goal in robotics. The dominant approaches for language-guided manipulation use 2D image representations, which face difficulties in combining multi-view cameras and inferring precise 3D positions and relationships. To address these limitations, we propose a 3D point cloud based policy called PolarNet for language-guided manipulation. It leverages carefully designed point cloud inputs, efficient point cloud encoders, and multimodal transformers to learn 3D point cloud representations and integrate them with language instructions for action prediction. PolarNet is shown to be effective and data efficient in a variety of experiments conducted on the RLBench benchmark. It outperforms state-of-the-art 2D and 3D approaches in both single-task and multi-task learning. It also achieves promising results on a real robot.

Submitted 27 September, 2023; originally announced September 2023.

Comments: Accepted to CoRL 2023. Project website: https://www.di.ens.fr/willow/research/polarnet/

arXiv:2307.16244 [pdf, other]

A Review of Media Copyright Management using Blockchain Technologies from the Academic and Business Perspectives

Authors: Roberto García, Ana Cediel, Mercè Teixidó, Rosa Gil

Abstract: Blockchain technologies open new opportunities for media copyright management. To provide an overview of the main initiatives in this blockchain application area, we have first reviewed the existing academic literature. The review shows literature is still scarce and immature in many aspects, which is more evident when comparing it to initiatives coming from the industry. Blockchain has been receiving significant inflows of venture capital and crowdfunding, which have boosted its progress in many fields, including its application to media management. Consequently, we have complemented the review with a business perspective. Existing reports about blockchain and media have been studied and consolidated into four prominent use cases. Moreover, each one has been illustrated through existing businesses already exploring them. Combining the academic and industry perspectives, we provide a more general and complete overview of current trends in media copyright management using blockchain technologies.

Submitted 30 July, 2023; originally announced July 2023.

Comments: 20 pages, 4 figures, 3 tables

arXiv:2307.15320 [pdf, other]

Robust Visual Sim-to-Real Transfer for Robotic Manipulation

Authors: Ricardo Garcia, Robin Strudel, Shizhe Chen, Etienne Arlaud, Ivan Laptev, Cordelia Schmid

Abstract: Learning visuomotor policies in simulation is much safer and cheaper than in the real world. However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots. One common approach to bridge the visual sim-to-real domain gap is domain randomization (DR). While previous work mainly evaluates DR for disembodied tasks, such as pose estimation and object detection, here we systematically explore visual domain randomization methods and benchmark them on a rich set of challenging robotic manipulation tasks. In particular, we propose an off-line proxy task of cube localization to select DR parameters for texture randomization, lighting randomization, variations of object colors and camera parameters. Notably, we demonstrate that DR parameters have similar impact on our off-line proxy task and on-line policies. We, hence, use off-line optimized DR parameters to train visuomotor policies in simulation and directly apply such policies to a real robot. Our approach achieves 93% success rate on average when tested on a diverse set of challenging manipulation tasks. Moreover, we evaluate the robustness of policies to visual variations in real scenes and show that our simulator-trained policies outperform policies learned using real but limited data. Code, simulation environment, real robot datasets and trained models are available at https://www.di.ens.fr/willow/research/robust_s2r/.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2303.13592 [pdf, other]

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages

Authors: Zheng-Xin Yong, Ruochen Zhang, Jessica Zosa Forde, Skyler Wang, Arjun Subramonian, Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Lintang Sutawika, Jan Christian Blaise Cruz, Yin Lin Tan, Long Phan, Rowena Garcia, Thamar Solorio, Alham Fikri Aji

Abstract: While code-mixing is a common linguistic practice in many parts of the world, collecting high-quality and low-cost code-mixed data remains a challenge for natural language processing (NLP) research. The recent proliferation of Large Language Models (LLMs) compels one to ask: how capable are these systems in generating code-mixed data? In this paper, we explore prompting multilingual LLMs in a zero-shot manner to generate code-mixed data for seven languages in South East Asia (SEA), namely Indonesian, Malay, Chinese, Tagalog, Vietnamese, Tamil, and Singlish. We find that publicly available multilingual instruction-tuned models such as BLOOMZ and Flan-T5-XXL are incapable of producing texts with phrases or clauses from different languages. ChatGPT exhibits inconsistent capabilities in generating code-mixed texts, wherein its performance varies depending on the prompt template and language pairing. For instance, ChatGPT generates fluent and natural Singlish texts (an English-based creole spoken in Singapore), but for English-Tamil language pair, the system mostly produces grammatically incorrect or semantically meaningless utterances. Furthermore, it may erroneously introduce languages not specified in the prompt. Based on our investigation, existing multilingual LLMs exhibit a wide range of proficiency in code-mixed data generation for SEA languages. As such, we advise against using LLMs in this context without extensive human checks.

Submitted 12 September, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Updating Authors

arXiv:2303.09727 [pdf, other]

Towards Understanding the Open Source Interest in Gender-Related GitHub Projects

Authors: Rita Garcia, Christoph Treude, Wendy La

Abstract: The open-source community uses the GitHub platform to exchange and share software applications and services of interest. This paper aims to identify the open-source community's interest in gender-related projects on GitHub. Our findings create research opportunities and identify resources by the open-source community that promote diversity, equity, and inclusion. We use data mining to identify GitHub projects that focus on gender-related topics. We apply quantitative and qualitative methodologies to examine the projects' attributes and to classify them within a gender social structure and a gender bias taxonomy. We aim to understand the open-source community's efforts and interests in gender topics through active projects. In this paper, we report on a preponderance of projects focusing on specific gender topics and identify those with a narrow focus. We examine projects focusing on gender bias and how they address this non-inclusive behaviour. Results show a propensity of GitHub projects focusing on recognising and detecting an individual's gender and a dearth of projects concentrating on the cultural expectations placed on women and men. In the gender bias domain, the projects mainly focus on occupational biases. These findings raise opportunities to address the limited focus of GitHub on gender-related topics through developing projects that mitigate exclusive behaviours.

Submitted 16 March, 2023; originally announced March 2023.

Comments: 16th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE 2023)

arXiv:2302.12416 [pdf, other]

doi 10.1016/j.oceaneng.2023.115647

A Convolutional Vision Transformer for Semantic Segmentation of Side-Scan Sonar Data

Authors: Hayat Rajani, Nuno Gracias, Rafael Garcia

Abstract: Distinguishing among different marine benthic habitat characteristics is of key importance in a wide set of seabed operations ranging from installations of oil rigs to laying networks of cables and monitoring the impact of humans on marine ecosystems. The Side-Scan Sonar (SSS) is a widely used imaging sensor in this regard. It produces high-resolution seafloor maps by logging the intensities of sound waves reflected back from the seafloor. In this work, we leverage these acoustic intensity maps to produce pixel-wise categorization of different seafloor types. We propose a novel architecture adapted from the Vision Transformer (ViT) in an encoder-decoder framework. Further, in doing so, the applicability of ViTs is evaluated on smaller datasets. To overcome the lack of CNN-like inductive biases, thereby making ViTs more conducive to applications in low data regimes, we propose a novel feature extraction module to replace the Multi-layer Perceptron (MLP) block within transformer layers and a novel module to extract multiscale patch embeddings. A lightweight decoder is also proposed to complement this design in order to further boost multiscale feature extraction. With the modified architecture, we achieve state-of-the-art results and also meet real-time computational requirements. We make our code available at https://github.com/hayatrajani/s3seg-vit

Submitted 23 February, 2023; originally announced February 2023.

Comments: Submitted to Ocean Engineering special issue "Autonomous Marine Robotics Operations"

ACM Class: I.2.6; I.4.6; I.5.1; I.5.4

Journal ref: Ocean Engineering Volume 286, Part 2, 15 October 2023, 115647

arXiv:2212.06088 [pdf, other]

MIRA: Mental Imagery for Robotic Affordances

Authors: Lin Yen-Chen, Pete Florence, Andy Zeng, Jonathan T. Barron, Yilun Du, Wei-Chiu Ma, Anthony Simeonov, Alberto Rodriguez Garcia, Phillip Isola

Abstract: Humans form mental images of 3D scenes to support counterfactual imagination, planning, and motor control. Our abilities to predict the appearance and affordance of the scene from previously unobserved viewpoints aid us in performing manipulation tasks (e.g., 6-DoF kitting) with a level of ease that is currently out of reach for existing robot learning frameworks. In this work, we aim to build artificial systems that can analogously plan actions on top of imagined images. To this end, we introduce Mental Imagery for Robotic Affordances (MIRA), an action reasoning framework that optimizes actions with novel-view synthesis and affordance prediction in the loop. Given a set of 2D RGB images, MIRA builds a consistent 3D scene representation, through which we synthesize novel orthographic views amenable to pixel-wise affordances prediction for action optimization. We illustrate how this optimization process enables us to generalize to unseen out-of-plane rotations for 6-DoF robotic manipulation tasks given a limited number of demonstrations, paving the way toward machines that autonomously learn to understand the world around them for planning actions.

Submitted 12 December, 2022; originally announced December 2022.

Comments: CoRL 2022, webpage: https://yenchenlin.me/mira

arXiv:2210.04868 [pdf, other]

Deep object detection for waterbird monitoring using aerial imagery

Authors: Krish Kabra, Alexander Xiong, Wenbin Li, Minxuan Luo, William Lu, Raul Garcia, Dhananjay Vijay, Jiahui Yu, Maojie Tang, Tianjiao Yu, Hank Arnold, Anna Vallery, Richard Gibbons, Arko Barman

Abstract: Monitoring of colonial waterbird nesting islands is essential to tracking waterbird population trends, which are used for evaluating ecosystem health and informing conservation management decisions. Recently, unmanned aerial vehicles, or drones, have emerged as a viable technology to precisely monitor waterbird colonies. However, manually counting waterbirds from hundreds, or potentially thousands, of aerial images is both difficult and time-consuming. In this work, we present a deep learning pipeline that can be used to precisely detect, count, and monitor waterbirds using aerial imagery collected by a commercial drone. By utilizing convolutional neural network-based object detectors, we show that we can detect 16 classes of waterbird species that are commonly found in colonial nesting islands along the Texas coast. Our experiments using Faster R-CNN and RetinaNet object detectors give mean interpolated average precision scores of 67.9% and 63.1% respectively.

Submitted 13 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: Longer version of accepted short paper at 21st IEEE International Conference on Machine Learning and Applications (ICMLA'22). 7 pages, 5 figures

arXiv:2210.02742 [pdf, other]

Towards the Multiple Constant Multiplication at Minimal Hardware Cost

Authors: Rémi Garcia, Anastasia Volkova

Abstract: Multiple Constant Multiplication (MCM) over integers is a frequent operation arising in embedded systems that require highly optimized hardware. An efficient way is to replace costly generic multiplication by bit-shifts and additions, i.e. a multiplierless circuit. In this work, we improve the state-of-the-art optimal approach for MCM, based on Integer Linear Programming (ILP). We introduce a new lower-level hardware cost, based on counting the number of one-bit adders, and demonstrate that it is strongly correlated with the LUT count. This new model for the multiplierless MCM circuits allows us to consider intermediate truncations that significantly save resources when a full output precision is not required. We incorporate the error propagation rules into our ILP model to guarantee a user-given error bound on the MCM results. The proposed ILP models for multiple flavors of MCM are implemented as an open-source tool and, combined with the FloPoCo code generator, provide a complete coefficient-to-VHDL flow. We evaluate our models in extensive experiments, and propose an in-depth analysis of the impact that design metrics have on actually synthesized hardware.

Submitted 10 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

Comments: 10 pages, 3 tables, 6 figures, journal submission

arXiv:2209.15078 [pdf, ps, other]

Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning

Authors: Renata Garcia, Wouter Caarls

Abstract: Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model. However, even state-of-the-art algorithms can be difficult to tune for optimum performance. We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best performing set(s) on-line. In the literature, the ensemble technique is used to improve performance in general, but the current work specifically addresses decreasing the hyperparameter tuning effort. Furthermore, our approach targets on-line learning on a single robotic system, and does not require running multiple simulators in parallel. Although the idea is generic, the Deep Deterministic Policy Gradient was the model chosen, being a representative deep learning actor-critic method with good performance in continuous action settings but known high variance. We compare our online weighted q-ensemble approach to q-average ensemble strategies addressed in the literature using alternate policy training, as well as online training, demonstrating the advantage of the new approach in eliminating hyperparameter tuning. The applicability to real-world systems was validated in common robotic benchmark environments: the bipedal robot half cheetah and the swimmer. Online Weighted Q-Ensemble presented overall lower variance and superior results when compared with q-average ensembles using randomized parameterizations.

Submitted 29 September, 2022; originally announced September 2022.

MSC Class: 68T40; 68T07; 68T05

arXiv:2209.09125 [pdf, other]

Operationalizing Machine Learning: An Interview Study

Authors: Shreya Shankar, Rolando Garcia, Joseph M. Hellerstein, Aditya G. Parameswaran

Abstract: Organizations rely on machine learning engineers (MLEs) to operationalize ML, i.e., deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring of performance drops in production. When considered together, these responsibilities seem staggering -- how does anyone do MLOps, what are the unaddressed challenges, and what are the implications for tool builders? We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern success for a production ML deployment: Velocity, Validation, and Versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design.

Submitted 16 September, 2022; originally announced September 2022.

Comments: 20 pages, 4 figures

arXiv:2209.04899 [pdf, other]

Instruction-driven history-aware policies for robotic manipulations

Authors: Pierre-Louis Guhur, Shizhe Chen, Ricardo Garcia, Makarand Tapaswi, Ivan Laptev, Cordelia Schmid

Abstract: In human environments, robots are expected to accomplish a variety of manipulation tasks given simple natural language instructions. Yet, robotic manipulation is extremely challenging as it requires fine-grained motor control, long-term memory as well as generalization to previously unseen tasks and environments. To address these challenges, we propose a unified transformer-based approach that takes into account multiple inputs. In particular, our transformer architecture integrates (i) natural language instructions and (ii) multi-view scene observations while (iii) keeping track of the full history of observations and actions. Such an approach enables learning dependencies between history and instructions and improves manipulation precision using multiple views. We evaluate our method on the challenging RLBench benchmark and on a real-world robot. Notably, our approach scales to 74 diverse RLBench tasks and outperforms the state of the art. We also address instruction-conditioned tasks and demonstrate excellent generalization to previously unseen variations.

Submitted 17 December, 2022; v1 submitted 11 September, 2022; originally announced September 2022.

Comments: Accepted in CoRL 2022 (oral); project page at https://guhur.github.io/hiveformer/

arXiv:2209.02748 [pdf]

Educating Educators to Integrate Inclusive Design Across a 4-Year CS Degree Program

Authors: Lara Letaw, Rosalinda Garcia, Patricia Morreale, Gail Verdi, Heather Garcia, Geraldine Jimena Noa, Spencer P. Madsen, Maria Jesus Alzugaray-Orellana, Margaret Burnett

Abstract: How can an entire CS faculty, who together have been teaching the ACM standard CS curricula, shift to teaching elements of inclusive design across a 4-year undergraduate CS program? And will they even want to try? To investigate these questions, we developed an educate-the-educators curriculum to support this shift. The overall goal of the educate-the-educators curriculum was to enable CS faculty to creatively engage with embedding inclusive design into their courses in "minimally invasive" ways. GenderMag, an inclusive design evaluation method, was selected for use. The curriculum targeted the following learning outcomes: to enable CS faculty: (1) to analyze the costs and benefits of integrating inclusive design into their own course(s); (2) to evaluate software using the GenderMag method, and recognize its use to identify meaningful issues in software; (3) to integrate inclusive design into existing course materials with provided resources and collaboration; and (4) to prepare to engage and guide students on learning GenderMag concepts. We conducted a field study over a spring/summer followed by end-of-fall interviews, during which we worked with 18 faculty members to integrate inclusive design into 13 courses. Ten of these faculty then taught 7 of these courses that were on the Fall 2021 schedule, across 16 sections. We present the new educate-the-educators curriculum and report on the faculty's experiences acting upon it over the three-month field study and subsequent interviews.
Our results showed that, of the 18 faculty we worked with, 83% chose to modify their courses; by Fall 2021, faculty across all four years of a CS degree program had begun teaching inclusive design concepts. When we followed up with the 10 Fall 2021 faculty, 91% of their reported outcomes indicated that the incorporations of inclusive design concepts in their courses went as well as or better than expected.

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2208.14174 [pdf, other]

doi 10.1145/3585387

Semantics and Non-Fungible Tokens for Copyright Management on the Metaverse and Beyond

Authors: Roberto García, Ana Cediel, Mercè Teixidó, Rosa Gil

Abstract: Recent initiatives related to the Metaverse focus on better visualisation, like augmented or virtual reality, but also persistent digital objects. To guarantee real ownership of these digital objects, open systems based on public blockchains and Non-Fungible Tokens (NFTs) are emerging together with a nascent decentralized and open creator economy. To manage this emerging economy in a more organised way, and fight the all-too-common NFT plagiarism, we propose CopyrightLY, a decentralized application for authorship and copyright management. It provides means to claim content authorship, including supporting evidence. Content and metadata are stored in decentralized storage and registered on the blockchain. A token is used to curate these claims, and potential complaints, by staking it on them. Staking is incentivized by the fact that the token is minted using a bonding curve. The tokenomics include the resolution of complaints and enabling the monetization of curated claims. Monetization is achieved through licensing NFTs with metadata enhanced by semantic technologies. Semantic data makes explicit the reuse conditions transferred with the token while keeping the connection to the underlying copyright claims to improve the trustability of the NFTs. Moreover, the semantic metadata is flexible enough to enable licensing not just in the real world. Licenses can refer to reuses in specific locations in a metaverse, thus facilitating the emergence of creative economies in them.

Submitted 25 August, 2022; originally announced August 2022.

Comments: 16 pages, 6 figures, 2 listings

arXiv:2206.08903 [pdf, other]

doi 10.1016/j.media.2023.102956

Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration

Authors: Taylor L. Bobrow, Mayank Golhar, Rohan Vijayan, Venkata S. Akshintala, Juan R. Garcia, Nicholas J. Durr

Abstract: Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single frame registration. 22 short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage maps, and 3D models.
The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at durr.jhu.edu/C3VD.

Submitted 5 September, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

arXiv:2205.01241 [pdf, ps, other]

Propositional Equality for Gradual Dependently Typed Programming

Authors: Joseph Eremondi, Ronald Garcia, Éric Tanter

Abstract: Gradual dependent types can help with the incremental adoption of dependently typed code by providing a principled semantics for imprecise types and proofs, where some parts have been omitted. Current theories of gradual dependent types, though, lack a central feature of type theory: propositional equality. Lennon-Bertrand et al. show that, when the reflexive proof $\mathit{refl}$ is the only closed value of an equality type, a gradual extension of CIC with propositional equality violates static observational equivalences. Extensionally-equal functions should be indistinguishable at run time, but the combination of equality and type imprecision allows for contexts that distinguish extensionally-equal but syntactically-different functions. This work presents a gradually typed language that supports propositional equality. We avoid the above issues by devising an equality type where $\mathit{refl}$ is not the only closed inhabitant. Instead, each equality proof carries a term that is at least as precise as the equated terms, acting as a witness of their plausible equality. These witnesses track partial type information as a program runs, raising errors when that information shows that two equated terms are undeniably inconsistent. Composition of type information is internalized as a construct of the language, and is deferred for function bodies whose evaluation is blocked by variables. By deferring, we ensure that extensionally equal functions compose without error, thereby preventing contexts from distinguishing them.
We describe the challenges of designing consistency and precision relations for this system, along with solutions to these challenges. Finally, we prove important metatheory: type-safety, conservative embedding of CIC, canonicity, and the gradual guarantees of Siek et al.

Submitted 2 May, 2022; originally announced May 2022.

Comments: Under submission to ICFP 2022

arXiv:2204.12051 [pdf, ps, other]

doi 10.1007/s00220-024-05030-6

Complexity of quantum circuits via sensitivity, magic, and coherence

Authors: Kaifeng Bu, Roy J. Garcia, Arthur Jaffe, Dax Enshan Koh, Lu Li

Abstract: Quantum circuit complexity, a measure of the minimum number of gates needed to implement a given unitary transformation, is a fundamental concept in quantum computation, with widespread applications ranging from determining the running time of quantum algorithms to understanding the physics of black holes. In this work, we study the complexity of quantum circuits using the notions of sensitivity, average sensitivity (also called influence), magic, and coherence. We characterize the set of unitaries with vanishing sensitivity and show that it coincides with the family of matchgates. Since matchgates are tractable quantum circuits, we have proved that sensitivity is necessary for a quantum speedup. As magic is another measure to quantify quantum advantage, it is interesting to understand the relation between magic and sensitivity. We do this by introducing a quantum version of the Fourier entropy-influence relation. Our results are pivotal for understanding the role of sensitivity, magic, and coherence in quantum computation.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 42 pages

Journal ref: Commun. Math. Phys. 405,161 (2024)

arXiv:2203.07548 [pdf, other]

Physical Neural Cellular Automata for 2D Shape Classification

Authors: Kathryn Walker, Rasmus Berg Palm, Rodrigo Moreno Garcia, Andres Faina, Kasper Stoy, Sebastian Risi

Abstract: Materials with the ability to self-classify their own shape have the potential to advance a wide range of engineering applications and industries. Biological systems possess the ability not only to self-reconfigure but also to self-classify themselves to determine a general shape and function. Previous work into modular robotics systems has only enabled self-recognition and self-reconfiguration into a specific target shape, missing the inherent robustness present in nature to self-classify. In this paper we therefore take advantage of recent advances in deep learning and neural cellular automata, and present a simple modular 2D robotic system that can infer its own class of shape through the local communication of its components. Furthermore, we show that our system can be successfully transferred to hardware which thus opens opportunities for future self-classifying machines. Code available at https://github.com/kattwalker/projectcube. Video available at https://youtu.be/0TCOkE4keyc.

Submitted 31 July, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

arXiv:2202.02551 [pdf, other]

Exploring the Dynamics of the Circumcenter Map

Authors: Nicholas McDonald, Ronaldo Garcia, Dan Reznik

Abstract: Using experimental techniques, we study properties of the "circumcenter map", which, upon $n$ iterations sends an $n$-gon to a scaled and rotated copy of itself. We also explore the topology of area-expanding and area-contracting regions induced by this map.

Submitted 14 May, 2022; v1 submitted 5 February, 2022; originally announced February 2022.

Comments: 14 pages, 14 figures

MSC Class: 51N20; 37B20

arXiv:2201.09300 [pdf, other]

Poncelet Spatio-Temporal Surfaces and Tangles

Authors: Claudio Esperança, Ronaldo Garcia, Dan Reznik

Abstract: We explore geometric and topological properties of 3d surfaces swept by Poncelet triangles, as well as tangles formed by associated points.

Submitted 23 January, 2022; originally announced January 2022.

Comments: 8 pages, 9 figures, 4 live apps

MSC Class: 57M25; 97R60; 51H25; 37A10

arXiv:2112.15232 [pdf, other]

Triads of Conics Associated with a Triangle

Authors: Ronaldo Garcia, Liliana Gheorghe, Peter Moses, Dan Reznik

Abstract: We revisit constructions based on triads of conics with foci at pairs of vertices of a reference triangle. We find that their 6 vertices lie on well-known conics, whose type we analyze. We give conditions for these to be circles and/or degenerate. In the latter case, we study the locus of their center.

Submitted 20 July, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Comments: 24 pages, 24 figures, 17 references

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2112.13956 [pdf, other]

A Blockchain-based Data Governance Framework with Privacy Protection and Provenance for e-Prescription

Authors: Rodrigo Dutra Garcia, Gowri Sankar Ramachandran, Raja Jurdak, Jo Ueyama

Abstract: Real-world applications in healthcare and supply chain domains produce, exchange, and share data in a multi-stakeholder environment. Data owners want to control their data and privacy in such settings. On the other hand, data consumers demand methods to understand when, how, and by whom the data was produced. These requirements necessitate data governance frameworks that guarantee data provenance, privacy protection, and consent management. We introduce a decentralized data governance framework based on blockchain technology and proxy re-encryption to let data owners control and track their data through privacy-enhancing and consent management mechanisms. Besides, our framework allows the data consumers to understand data lineage through a blockchain-based provenance mechanism. We have used Digital e-prescription as the use case since it has multiple stakeholders and sensitive data while enabling the medical fraternity to manage patients' prescription data, involving patients as data owners, doctors and pharmacists as data consumers. Our proof-of-concept implementation and evaluation results based on CosmWasm, Ethereum, and pyUmbral PRE show that the proposed decentralized system guarantees transparency, privacy, and trust with minimal overhead.

Submitted 27 December, 2021; originally announced December 2021.

arXiv:2112.02545 [pdf, other]

New Properties and Invariants of Harmonic Polygons

Authors: Ronaldo Garcia, Dan Reznik, Pedro Roitman

Abstract: Via simulation, we discover and prove curious new Euclidean properties and invariants of the Poncelet family of harmonic polygons.

Submitted 28 September, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

Comments: 18 pages, 9 figures, 3 tables, 8 videos

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2111.00979 [pdf, other]

Parabola-Inscribed Poncelet Polygons Derived from the Bicentric Family

Authors: Filipe Bellio, Ronaldo Garcia, Dan Reznik

Abstract: We study loci and properties of a Parabola-inscribed family of Poncelet polygons whose caustic is a focus-centered circle. This family is the polar image of a special case of the bicentric family with respect to its circumcircle. We describe closure conditions, curious loci, and new conserved quantities.

Submitted 25 January, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: 20 pages, 17 figures, 2 tables

MSC Class: 00A72; 00A08; 37-04; 37M05; 51M04; 51N20

arXiv:2110.06356 [pdf, other]

Poncelet Parabola Pirouettes

Authors: Dan Reznik, Ronaldo Garcia

Abstract: We describe some three-dozen curious phenomena manifested by parabolas inscribed or circumscribed about certain Poncelet triangle families. Despite their pirouetting motion, parabolas' focus, vertex, directrix, etc., will often sweep or envelop rather elementary loci such as lines, circles, or points. Most phenomena are unproven though supported by solid numerical evidence (proofs are welcome). Some yet unrealized experiments are posed as "challenges" (results are welcome!).

Submitted 10 October, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

Comments: 24 pages, 23 figures

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2109.08192 [pdf, ps, other]

BuDDI: A Declarative Bloom Language for CALM Programming

Authors: Rolando Garcia, Giulia Guidi

Abstract: Coordination protocols help programmers of distributed systems reason about the effects of transactions on the state of the system, but they're not cheap. Coordination protocols may involve multiple rounds of communication, which can hurt system responsiveness. There exist many efforts in distributed computing for managing the coordination-performance trade-off. More recent is a line of work that characterizes the class of workloads for which coordination is not necessary for consistency: namely, logically monotonic programs. In this paper, we present a case study of logical monotonicity in workloads typical to computational biology. We leverage the Bloom language to write efficient distributed programs, and compare their performance to equivalent programs written in UPC++, a popular language for writing distributed programs. Additionally, we leverage Bloom's analysis tools to identify points-of-coordination, and use our own experience using Bloom to recommend some higher-level abstractions for users without strong distributed computing backgrounds.

Submitted 20 September, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

arXiv:2108.05430 [pdf, other]

Loci of Poncelet Triangles in the General Closure Case

Authors: Ronaldo Garcia, Boris Odehnal, Dan Reznik

Abstract: We analyze loci of triangle centers over variants of two well-known triangle porisms: the bicentric and confocal families. Specifically, we evoke the general version of Poncelet's closure theorem whereby individual sides can be made tangent to separate in-pencil caustics. We show that despite the more complicated dynamic geometry, the loci of certain triangle centers and associated points remain conics and/or circles.

Submitted 24 January, 2022; v1 submitted 11 August, 2021; originally announced August 2021.

Comments: 24 pages, 18 figures, 4 tables, 18 video links

MSC Class: 51M04 and 51N20 and 51N35 and 68T20

arXiv:2108.01565 [pdf, other]

doi 10.1109/TSP.2022.3161158

Hardware-aware Design of Multiplierless Second-Order IIR Filters with Minimum Adders

Authors: Rémi Garcia, Anastasia Volkova, Martin Kumm, Alexandre Goldsztejn, Jonas Kühle

Abstract: In this work, we optimally solve the problem of multiplierless design of second-order Infinite Impulse Response filters with minimum number of adders. Given a frequency specification, we design a stable direct form filter with hardware-aware fixed-point coefficients that yield a minimal number of adders when replacing all the multiplications by bit shifts and additions. The coefficient design, quantization and implementation, typically conducted independently, are now gathered into one global optimization problem, modeled through integer linear programming and efficiently solved using generic solvers. We guarantee the frequency-domain specifications and stability, which together with optimal number of adders will significantly simplify design-space exploration for filter designers. The optimal filters are implemented within the FloPoCo IP core generator and synthesized for Field Programmable Gate Arrays. With respect to state-of-the-art three-step filter design methods, our one-step design approach achieves, on average, 42% reduction in the number of lookup tables and 21% improvement in delay.

Submitted 3 August, 2021; originally announced August 2021.

arXiv:2107.04859 [pdf, ps, other]

Approximate Normalization and Eager Equality Checking for Gradual Inductive Families

Authors: Joseph Eremondi, Ronald Garcia, Éric Tanter

Abstract: Harnessing the power of dependently typed languages can be difficult. Programmers must manually construct proofs to produce well-typed programs, which is not an easy task. In particular, migrating code to these languages is challenging. Gradual typing can make dependently-typed languages easier to use by mixing static and dynamic checking in a principled way. With gradual types, programmers can incrementally migrate code to a dependently typed language. However, adding gradual types to dependent types creates a new challenge: mixing decidable type-checking and incremental migration in a full-featured language is a precarious balance. Programmers expect type-checking to terminate, but dependent type-checkers evaluate terms at compile time, which is problematic because gradual types can introduce non-termination into an otherwise terminating language. Steps taken to mitigate this non-termination must not jeopardize the smooth transitions between dynamic and static. We present a gradual dependently-typed language that supports inductive type families, has decidable type-checking, and provably supports smooth migration between static and dynamic, as codified by the refined criteria for gradual typing proposed by Siek et al. (2015). Like Eremondi et al. (2019), we use approximate normalization for terminating compile-time evaluation. Unlike Eremondi et al., our normalization does not require comparison of variables, allowing us to show termination with a syntactic model that accommodates inductive types.
Moreover, we design a novel technique for tracking constraints on type indices, so that dynamic constraint violations signal run-time errors eagerly. To facilitate these checks, we define an algebraic notion of gradual precision, axiomatizing certain semantic properties of gradual terms.

Submitted 10 July, 2021; originally announced July 2021.

arXiv:2106.06505 [pdf, other]

Efficient Deep Learning Architectures for Fast Identification of Bacterial Strains in Resource-Constrained Devices

Authors: R. Gallardo García, S. Jarquín Rodríguez, B. Beltrán Martínez, C. Hernández Gracidas, R. Martínez Torres

Abstract: This work presents twelve fine-tuned deep learning architectures to solve the bacterial classification problem over the Digital Image of Bacterial Species Dataset. The base architectures were mainly published as mobile or efficient solutions to the ImageNet challenge, and all experiments presented in this work consisted of making several modifications to the original designs, in order to make them able to solve the bacterial classification problem by using fine-tuning and transfer learning techniques. This work also proposes a novel data augmentation technique for this dataset, which is based on the idea of artificial zooming, strongly increasing the performance of every tested architecture, even doubling it in some cases. In order to get robust and complete evaluations, all experiments were performed with 10-fold cross-validation and evaluated with five different metrics: top-1 and top-5 accuracy, precision, recall, and F1 score. This paper presents a complete comparison of the twelve different architectures, cross-validated with the original and the augmented version of the dataset; the results are also compared with several literature methods. Overall, eight of the eleven architectures surpassed the 0.95 scores in top-1 accuracy with our data augmentation method, with 0.9738 being the highest top-1 accuracy. The impact of the data augmentation technique is reported with relative improvement scores.

Submitted 11 June, 2021; originally announced June 2021.

Comments: 22 pages, 2 figures, 5 tables. Submitted to Multimedia Tools and Applications, issue 1218 - Engineering Tools and Applications in Medical Imaging (currently in reviewing process)

MSC Class: 68T07 (Primary); 68U10 (Secondary) ACM Class: I.4; J.3

arXiv:2106.04565 [pdf, other]

Translate, then Parse! A strong baseline for Cross-Lingual AMR Parsing

Authors: Sarah Uhrig, Yoalli Rezepka Garcia, Juri Opitz, Anette Frank

Abstract: In cross-lingual Abstract Meaning Representation (AMR) parsing, researchers develop models that project sentences from various languages onto their AMRs to capture their essential semantic structures: given a sentence in any language, we aim to capture its core semantic content through concepts connected by manifold types of semantic relations. Methods typically leverage large silver training data to learn a single model that is able to project non-English sentences to AMRs. However, we find that a simple baseline tends to be overlooked: translating the sentences to English and projecting their AMR with a monolingual AMR parser (translate+parse, T+P). In this paper, we revisit this simple two-step baseline, and enhance it with a strong NMT system and a strong AMR parser. Our experiments show that T+P outperforms a recent state-of-the-art system across all tested languages: German, Italian, Spanish and Mandarin with +14.6, +12.6, +14.3 and +16.0 Smatch points.

Submitted 8 June, 2021; originally announced June 2021.

Comments: IWPT 2021

arXiv:2106.00715 [pdf, other]

Poncelet Triangles: a Theory for Locus Ellipticity

Authors: Mark Helman, Dominique Laurain, Dan Reznik, Ronaldo Garcia

Abstract: We present a theory which predicts if the locus of a triangle center over certain Poncelet triangle families is a conic or not. We consider families interscribed in (i) the confocal pair and (ii) an outer ellipse and an inner concentric circular caustic. Previously, determining if a locus was a conic was done on a case-by-case basis. In the confocal case, we also derive conditions under which a locus degenerates to a segment or a circle. We show the locus' turning number is +/- 3, while predicting its monotonicity with respect to the motion of a vertex of the triangle family.

Submitted 12 December, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: 10 pages, 5 figures, 2 tables, and 6 video links

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2105.05633 [pdf, other]

Segmenter: Transformer for Semantic Segmentation

Authors: Robin Strudel, Ricardo Garcia, Ivan Laptev, Cordelia Schmid

Abstract: Image segmentation is often ambiguous at the level of individual image patches and requires contextual information to reach label consensus. In this paper we introduce Segmenter, a transformer model for semantic segmentation. In contrast to convolution-based methods, our approach allows to model global context already at the first layer and throughout the network. We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation. To do so, we rely on the output embeddings corresponding to image patches and obtain class labels from these embeddings with a point-wise linear decoder or a mask transformer decoder. We leverage models pre-trained for image classification and show that we can fine-tune them on moderate sized datasets available for semantic segmentation. The linear decoder allows to obtain excellent results already, but the performance can be further improved by a mask transformer generating class masks. We conduct an extensive ablation study to show the impact of the different parameters, in particular the performance is better for large models and small patch sizes. Segmenter attains excellent results for semantic segmentation. It outperforms the state of the art on both ADE20K and Pascal Context datasets and is competitive on Cityscapes.

Submitted 2 September, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

Comments: ICCV 2021. Code available at https://github.com/rstrudel/segmenter

arXiv:2104.13174 [pdf, other]

doi 10.1007/s13366-021-00596-x

Poncelet Plectra: Harmonious Curves in Cosine Space

Authors: Daniel Jaud, Dan Reznik, Ronaldo Garcia

Abstract: It has been shown that the family of Poncelet N-gons in the confocal pair (elliptic billiard) conserves the sum of cosines of its internal angles. Curiously, this quantity is equal to the sum of cosines conserved by its affine image where the caustic is a circle. We show that furthermore, (i) when N=3, the cosine triples of both families sweep the same planar curve: an equilateral cubic resembling a plectrum (guitar pick). We also show that (ii) the family of triangles excentral to the confocal family conserves the same product of cosines as the one conserved by its affine image inscribed in a circle; and that (iii) cosine triples of both families sweep the same spherical curve. When the triple of log-cosines is considered, this curve becomes a planar, plectrum-shaped curve, rounder than the one swept by its parent confocal family.

Submitted 29 August, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

Comments: 15 pages, 13 figures, 7 video links

MSC Class: 51N20; 51M04; 65-05

Journal ref: Beitraege zur Algebra und Geometrie 2022

arXiv:2103.11260 [pdf, other]

doi 10.1007/s40598-021-00188-6

New Invariants of Poncelet-Jacobi Bicentric Polygons

Authors: Pedro Roitman, Ronaldo Garcia, Dan Reznik

Abstract: The 1d family of Poncelet polygons interscribed between two circles is known as the Bicentric family. Using elliptic functions and Liouville's theorem, we show (i) that this family has invariant sum of internal angle cosines and (ii) that the pedal polygons with respect to the family's limiting points have invariant perimeter. Interestingly, both (i) and (ii) are also properties of elliptic billiard N-periodics. Furthermore, since the pedal polygons in (ii) are identical to inversions of elliptic billiard N-periodics with respect to a focus-centered circle, an important corollary is that (iii) elliptic billiard focus-inversive N-gons have constant perimeter. Interestingly, these also conserve their sum of cosines (except for the N=4 case).

Submitted 13 July, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

Comments: 17 pages, 6 figures, 1 table with 18 video links

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2103.04691 [pdf, other]

Meta-Learning with MAML on Trees

Authors: Jezabel R. Garcia, Federica Freddi, Feng-Ting Liao, Jamie McGowan, Tim Nieradzik, Da-shan Shiu, Ye Tian, Alberto Bernacchia

Abstract: In meta-learning, the knowledge learned from previous tasks is transferred to new ones, but this transfer only works if tasks are related. Sharing information between unrelated tasks might hurt performance, and it is unclear how to transfer knowledge across tasks with a hierarchical structure. Our research extends a model agnostic meta-learning model, MAML, by exploiting hierarchical task relationships. Our algorithm, TreeMAML, adapts the model to each task with a few gradient steps, but the adaptation follows the hierarchical tree structure: in each step, gradients are pooled across task clusters, and subsequent steps follow down the tree. We also implement a clustering algorithm that generates the task tree without previous knowledge of the task structure, allowing us to make use of implicit relationships between the tasks. We show that the new algorithm, which we term TreeMAML, performs better than MAML when the task structure is hierarchical for synthetic experiments. To study the performance of the method in real-world data, we apply this method to Natural Language Understanding: we use our algorithm to finetune Language Models taking advantage of the language phylogenetic tree. We show that TreeMAML improves the state of the art results for cross-lingual Natural Language Inference. This result is useful, since most languages in the world are under-resourced and the improvement on cross-lingual transfer allows the internationalization of NLP models. These results open the window to use this algorithm in other real-world hierarchical datasets.

Submitted 8 March, 2021; originally announced March 2021.

Comments: Updated version of paper in EACL workshop: Adapt-NLP 2021

arXiv:2102.09438 [pdf, ps, other]

Invariant Center Power and Elliptic Loci of Poncelet Triangles

Authors: Mark Helman, Dominique Laurain, Ronaldo Garcia, Dan Reznik

Abstract: We study center power with respect to circles derived from Poncelet 3-periodics (triangles) in a generic pair of ellipses as well as loci of their triangle centers. We show that (i) for any concentric pair, the power of the center with respect to either circumcircle or Euler's circle is invariant, and (ii) if a triangle center of a 3-periodic in a generic nested pair is a fixed affine combination of barycenter and circumcenter, its locus over the family is an ellipse.

Submitted 16 April, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 25 pages, 16 figures, 6 tables, 8 video links, 7 live app links

MSC Class: 51M04; 51N20; 51N35; 68T20

arXiv:2102.04566 [pdf, other]

doi 10.1016/j.jag.2022.102690

Semantic Segmentation with Labeling Uncertainty and Class Imbalance

Authors: Patrik Olã Bressan, José Marcato Junior, José Augusto Correa Martins, Diogo Nunes Gonçalves, Daniel Matte Freitas, Lucas Prado Osco, Jonathan de Andrade Silva, Zhipeng Luo, Jonathan Li, Raymundo Cordero Garcia, Wesley Nunes Gonçalves

Abstract: Recently, methods based on Convolutional Neural Networks (CNN) achieved impressive success in semantic segmentation tasks. However, challenges such as the class imbalance and the uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It was also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 15 pages, 9 figures, 3 tables

MSC Class: 68T07 ACM Class: I.2.1

Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2022

Showing 1–50 of 97 results for author: Garcia, R