Visual Action Planning with Multiple Heterogeneous Agents

Authors: Martina Lippi, Michael C. Welle, Marco Moletta, Alessandro Marino, Andrea Gasparri, Danica Kragic

Abstract: Visual planning methods are promising to handle complex settings where extracting the system state is challenging. However, none of the existing works tackles the case of multiple heterogeneous agents which are characterized by different capabilities and/or embodiment. In this work, we propose a method to realize visual action planning in multi-agent settings by exploiting a roadmap built in a low-dimensional structured latent space and used for planning. To enable multi-agent settings, we infer possible parallel actions from a dataset composed of tuples associated with individual actions. Next, we evaluate feasibility and cost of them based on the capabilities of the multi-agent system and endow the roadmap with this information, building a capability latent space roadmap (C-LSR). Additionally, a capability suggestion strategy is designed to inform the human operator about possible missing capabilities when no paths are found. The approach is validated in a simulated burger cooking task and a real-world box packing task.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16764 [pdf, other]

Low-Cost Teleoperation with Haptic Feedback through Vision-based Tactile Sensors for Rigid and Soft Object Manipulation

Authors: Martina Lippi, Michael C. Welle, Maciej K. Wozniak, Andrea Gasparri, Danica Kragic

Abstract: Haptic feedback is essential for humans to successfully perform complex and delicate manipulation tasks. A recent rise in tactile sensors has enabled robots to leverage the sense of touch and expand their capability drastically. However, many tasks still need human intervention/guidance. For this reason, we present a teleoperation framework designed to provide haptic feedback to human operators based on the data from camera-based tactile sensors mounted on the robot gripper. Partial autonomy is introduced to prevent slippage of grasped objects during task execution. Notably, we rely exclusively on low-cost off-the-shelf hardware to realize an affordable solution. We demonstrate the versatility of the framework on nine different objects ranging from rigid to soft and fragile ones, using three different operators on real hardware.

Submitted 25 March, 2024; originally announced March 2024.

Comments: https://vision-tactile-manip.github.io/teleop/

arXiv:2402.17431 [pdf, other]

The KANDY Benchmark: Incremental Neuro-Symbolic Learning and Reasoning with Kandinsky Patterns

Authors: Luca Salvatore Lorello, Marco Lippi, Stefano Melacci

Abstract: Artificial intelligence is continuously seeking novel challenges and benchmarks to effectively measure performance and to advance the state-of-the-art. In this paper we introduce KANDY, a benchmarking framework that can be used to generate a variety of learning and reasoning tasks inspired by Kandinsky patterns. By creating curricula of binary classification tasks with increasing complexity and with sparse supervisions, KANDY can be used to implement benchmarks for continual and semi-supervised learning, with a specific focus on symbol compositionality. Classification rules are also provided in the ground truth to enable analysis of interpretable solutions. Together with the benchmark generation pipeline, we release two curricula, an easier and a harder one, that we propose as new challenges for the research community. With a thorough experimental evaluation, we show how both state-of-the-art neural models and purely symbolic approaches struggle with solving most of the tasks, thus calling for the application of advanced neuro-symbolic methods trained over time.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.00013 [pdf, other]

No More Trade-Offs. GPT and Fully Informative Privacy Policies

Authors: Przemysław Pałka, Marco Lippi, Francesca Lagioia, Rūta Liepiņa, Giovanni Sartor

Abstract: The paper reports the results of an experiment aimed at testing to what extent ChatGPT 3.5 and 4 is able to answer questions regarding privacy policies designed in the new format that we propose. In a world of human-only interpreters, there was a trade-off between comprehensiveness and comprehensibility of privacy policies, leading to the actual policies not containing enough information for users to learn anything meaningful. Having shown that GPT performs relatively well with the new format, we provide experimental evidence supporting our policy suggestion, namely that the law should require fully comprehensive privacy policies, even if this means they become less concise.

Submitted 27 December, 2023; originally announced February 2024.

arXiv:2306.05791 [pdf, other]

Enabling Robot Manipulation of Soft and Rigid Objects with Vision-based Tactile Sensors

Authors: Michael C. Welle, Martina Lippi, Haofei Lu, Jens Lundell, Andrea Gasparri, Danica Kragic

Abstract: Endowing robots with tactile capabilities opens up new possibilities for their interaction with the environment, including the ability to handle fragile and/or soft objects. In this work, we equip the robot gripper with low-cost vision-based tactile sensors and propose a manipulation algorithm that adapts to both rigid and soft objects without requiring any knowledge of their properties. The algorithm relies on a touch and slip detection method, which considers the variation in the tactile images with respect to reference ones. We validate the approach on seven different objects, with different properties in terms of rigidity and fragility, to perform unplugging and lifting tasks. Furthermore, to enhance applicability, we combine the manipulation algorithm with a grasp sampler for the task of finding and picking a grape from a bunch without damaging it.

Submitted 9 June, 2023; originally announced June 2023.

Comments: Published in IEEE International Conference on Automation Science and Engineering (CASE2023)

arXiv:2303.15115 [pdf, other]

Ensemble Latent Space Roadmap for Improved Robustness in Visual Action Planning

Authors: Martina Lippi, Michael C. Welle, Andrea Gasparri, Danica Kragic

Abstract: Planning in learned latent spaces helps to decrease the dimensionality of raw observations. In this work, we propose to leverage the ensemble paradigm to enhance the robustness of latent planning systems. We rely on our Latent Space Roadmap (LSR) framework, which builds a graph in a learned structured latent space to perform planning. Given multiple LSR framework instances, that differ either on their latent spaces or on the parameters for constructing the graph, we use the action information as well as the embedded nodes of the produced plans to define similarity measures. These are then utilized to select the most promising plans. We validate the performance of our Ensemble LSR (ENS-LSR) on simulated box stacking and grape harvesting tasks as well as on a real-world robotic T-shirt folding experiment.

Submitted 27 March, 2023; originally announced March 2023.

arXiv:2210.14036 [pdf, ps, other]

A Task Allocation Framework for Human Multi-Robot Collaborative Settings

Authors: Martina Lippi, Paolo Di Lillo, Alessandro Marino

Abstract: The requirements of modern production systems together with more advanced robotic technologies have fostered the integration of teams comprising humans and autonomous robots. However, along with the potential benefits also comes the question of how to effectively handle these teams considering the different characteristics of the involved agents. For this reason, this paper presents a framework for task allocation in a human multi-robot collaborative scenario. The proposed solution combines an optimal offline allocation with an online reallocation strategy which accounts for inaccuracies of the offline plan and/or unforeseen events, human subjective preferences and cost of switching from one task to another so as to increase human satisfaction and team efficiency. Experiments are presented for the case of two manipulators cooperating with a human operator for performing a box filling task.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2203.13034 [pdf, other]

Augment-Connect-Explore: a Paradigm for Visual Action Planning with Data Scarcity

Authors: Martina Lippi, Michael C. Welle, Petra Poklukar, Alessandro Marino, Danica Kragic

Abstract: Visual action planning particularly excels in applications where the state of the system cannot be computed explicitly, such as manipulation of deformable objects, as it enables planning directly from raw images. Even though the field has been significantly accelerated by deep learning techniques, a crucial requirement for their success is the availability of a large amount of data. In this work, we propose the Augment-Connect-Explore (ACE) paradigm to enable visual action planning in cases of data scarcity. We build upon the Latent Space Roadmap (LSR) framework which performs planning with a graph built in a low dimensional latent space. In particular, ACE is used to i) Augment the available training dataset by autonomously creating new pairs of datapoints, ii) create new unobserved Connections among representations of states in the latent graph, and iii) Explore new regions of the latent space in a targeted manner. We validate the proposed approach on both simulated box stacking and real-world folding task showing the applicability for rigid and deformable object manipulation tasks, respectively.

Submitted 1 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

arXiv:2110.00125 [pdf, other]

Combining Transformers with Natural Language Explanations

Authors: Federico Ruggeri, Marco Lippi, Paolo Torroni

Abstract: Many NLP applications require models to be interpretable. However, many successful neural architectures, including transformers, still lack effective interpretation methods. A possible solution could rely on building explanations from domain knowledge, which is often available as plain, natural language text. We thus propose an extension to transformer models that makes use of external memories to store natural language explanations and use them to explain classification outputs. We conduct an experimental evaluation on two domains, legal text analysis and argument mining, to show that our approach can produce relevant explanations while retaining or even improving classification performance.

Submitted 3 April, 2024; v1 submitted 2 September, 2021; originally announced October 2021.

arXiv:2110.00124 [pdf, other]

Tree-Constrained Graph Neural Networks For Argument Mining

Authors: Federico Ruggeri, Marco Lippi, Paolo Torroni

Abstract: We propose a novel architecture for Graph Neural Networks that is inspired by the idea behind Tree Kernels of measuring similarity between trees by taking into account their common substructures, named fragments. By imposing a series of regularization constraints to the learning problem, we exploit a pooling mechanism that incorporates such notion of fragments within the node soft assignment function that produces the embeddings. We present an extensive experimental evaluation on a collection of sentence classification tasks conducted on several argument mining corpora, showing that the proposed approach performs well with respect to state-of-the-art techniques.

Submitted 2 September, 2021; originally announced October 2021.

arXiv:2109.11223 [pdf, other]

Individual and Collective Autonomous Development

Authors: Marco Lippi, Stefano Mariani, Matteo Martinelli, Franco Zambonelli

Abstract: The increasing complexity and unpredictability of many ICT scenarios let us envision that future systems will have to dynamically learn how to act and adapt to face evolving situations with little or no a priori knowledge, both at the level of individual components and at the collective level. In other words, such systems should become able to autonomously develop models of themselves and of their environment. Autonomous development includes: learning models of own capabilities; learning how to act purposefully towards the achievement of specific goals; and learning how to act collectively, i.e., accounting for the presence of others. In this paper, we introduce the vision of autonomous development in ICT systems, by framing its key concepts and by illustrating suitable application domains. Then, we overview the many research areas that are contributing or can potentially contribute to the realization of the vision, and identify some key research challenges.

Submitted 3 October, 2021; v1 submitted 23 September, 2021; originally announced September 2021.

Comments: 8 pages, 2 figures

ACM Class: I.2.11

arXiv:2109.06737 [pdf, other]

Comparing Reconstruction- and Contrastive-based Models for Visual Task Planning

Authors: Constantinos Chamzas, Martina Lippi, Michael C. Welle, Anastasia Varava, Lydia E. Kavraki, Danica Kragic

Abstract: Learning state representations enables robotic planning directly from raw observations such as images. Most methods learn state representations by utilizing losses based on the reconstruction of the raw observations from a lower-dimensional latent space. The similarity between observations in the space of images is often assumed and used as a proxy for estimating similarity between the underlying states of the system. However, observations commonly contain task-irrelevant factors of variation which are nonetheless important for reconstruction, such as varying lighting and different camera viewpoints. In this work, we define relevant evaluation metrics and perform a thorough study of different loss functions for state representation learning. We show that models exploiting task priors, such as Siamese networks with a simple contrastive loss, outperform reconstruction-based representations in visual task planning.

Submitted 14 September, 2021; originally announced September 2021.

Comments: for the associated project web page, see https://state-representation.github.io/web/

arXiv:2107.07921 [pdf, ps, other]

doi 10.1016/j.ifacol.2018.11.540

Safety in human-multi robot collaborative scenarios: a trajectory scaling approach

Authors: Martina Lippi, Alessandro Marino

Abstract: In this paper, a strategy to handle the human safety in a multi-robot scenario is devised. In the presented framework, it is foreseen that robots are in charge of performing any cooperative manipulation task which is parameterized by a proper task function. The devised architecture answers to the increasing demand of strict cooperation between humans and robots, since it equips a general multi-robot cell with the feature of making robots and human working together. The human safety is properly handled by defining a safety index which depends both on the relative position and velocity of the human operator and robots. Then, the multi-robot task trajectory is properly scaled in order to ensure that the human safety never falls below a given threshold which can be set in worst conditions according to a minimum allowed distance. Simulations results are presented in order to prove the effectiveness of the approach.

Submitted 16 July, 2021; originally announced July 2021.

Comments: Link to the paper: https://www.sciencedirect.com/science/article/pii/S2405896318332464

Journal ref: IFAC-PapersOnLine, Volume 51, Issue 22, Pages 190-196, 2018

arXiv:2106.06781 [pdf, ps, other]

A Data-Driven Approach for Contact Detection, Classification and Reaction in Physical Human-Robot Collaboration

Authors: Martina Lippi, Giuseppe Gillini, Alessandro Marino, Filippo Arrichiello

Abstract: This paper considers a scenario where a robot and a human operator share the same workspace, and the robot is able to both carry out autonomous tasks and physically interact with the human in order to achieve common goals. In this context, both intentional and accidental contacts between human and robot might occur due to the complexity of tasks and environment, to the uncertainty of human behavior, and to the typical lack of awareness of each other actions. Here, a two stage strategy based on Recurrent Neural Networks (RNNs) is designed to detect intentional and accidental contacts: the occurrence of a contact with the human is detected at the first stage, while the classification between intentional and accidental is performed at the second stage. An admittance control strategy or an evasive action is then performed by the robot, respectively. The approach also works in the case the robot simultaneously interacts with the human and the environment, where the interaction wrench of the latter is modeled via Gaussian Mixture Models (GMMs). Control Barrier Functions (CBFs) are included, at the control level, to guarantee the satisfaction of robot and task constraints while performing the proper interaction strategy. The approach has been validated on a real setup composed of a Kinova Jaco2 robot.

Submitted 12 June, 2021; originally announced June 2021.

Comments: Accepted to 2021 IEEE International Conference on Robotics and Automation

arXiv:2106.06772 [pdf, ps, other]

doi 10.1109/RO-MAN50785.2021.9515362

A Mixed-Integer Linear Programming Formulation for Human Multi-Robot Task Allocation

Authors: Martina Lippi, Alessandro Marino

Abstract: In this work, we address a task allocation problem for human multi-robot settings. Given a set of tasks to perform, we formulate a general Mixed-Integer Linear Programming (MILP) problem aiming at minimizing the overall execution time while optimizing the quality of the executed tasks as well as human and robotic workload. Different skills of the agents, both human and robotic, are taken into account and human operators are enabled to either directly execute tasks or play supervisory roles; moreover, multiple manipulators can tightly collaborate if required to carry out a task. Finally, as realistic in human contexts, human parameters are assumed to vary over time, e.g., due to increasing human level of fatigue. Therefore, online monitoring is required and re-allocation is performed if needed. Simulations in a realistic scenario with two manipulators and a human operator performing an assembly task validate the effectiveness of the approach.

Submitted 12 June, 2021; originally announced June 2021.

Comments: Accepted to 2021 IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)

arXiv:2103.02554 [pdf, other]

Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap

Authors: Martina Lippi, Petra Poklukar, Michael C. Welle, Anastasia Varava, Hang Yin, Alessandro Marino, Danica Kragic

Abstract: We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, focusing on manipulation of deformable objects. We propose a Latent Space Roadmap (LSR) for task planning which is a graph-based structure globally capturing the system dynamics in a low-dimensional latent space. Our framework consists of three parts: (1) a Mapping Module (MM) that maps observations given in the form of images into a structured latent space extracting the respective states as well as generates observations from the latent states, (2) the LSR which builds and connects clusters containing similar states in order to find the latent plans between start and goal states extracted by MM, and (3) the Action Proposal Module that complements the latent plan found by the LSR with the corresponding actions. We present a thorough investigation of our framework on simulated box stacking and rope/box manipulation tasks, and a folding task executed on a real robot.

Submitted 30 June, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

arXiv:2102.12227 [pdf]

doi 10.1109/TASLP.2023.3275040

Multi-Task Attentive Residual Networks for Argument Mining

Authors: Andrea Galassi, Marco Lippi, Paolo Torroni

Abstract: We explore the use of residual networks and neural attention for multiple argument mining tasks. We propose a residual architecture that exploits attention, multi-task learning, and makes use of ensemble, without any assumption on document or argument structure. We present an extensive experimental evaluation on five different corpora of user-generated comments, scientific publications, and persuasive essays. Our results show that our approach is a strong competitor against state-of-the-art architectures with a higher computational footprint or corpus-specific design, representing an interesting compromise between generality, performance accuracy and reduced model size.

Submitted 25 May, 2023; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: 16 pages, 3 figures

Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol 31, pp 1877-1892, 2023

arXiv:2009.12416 [pdf]

Process mining classification with a weightless neural network

Authors: Rafael Garcia Barbastefano, Maria Clara Lippi, Diego Carvalho

Abstract: Using a weightless neural network architecture WiSARD we propose a straightforward graph to retina codification to represent business process graph flows avoiding kernels, and we present how WiSARD outperforms the classification performance with small training sets in the process mining context.

Submitted 25 September, 2020; originally announced September 2020.

arXiv:2008.07346 [pdf, other]

Memory networks for consumer protection:unfairness exposed

Authors: Federico Ruggeri, Francesca Lagioia, Marco Lippi, Paolo Torroni

Abstract: Recent work has demonstrated how data-driven AI methods can leverage consumer protection by supporting the automated analysis of legal documents. However, a shortcoming of data-driven approaches is poor explainability. We posit that in this domain useful explanations of classifier outcomes can be provided by resorting to legal rationales. We thus consider several configurations of memory-augmented neural networks where rationales are given a special role in the modeling of context knowledge. Our results show that rationales not only contribute to improve the classification accuracy, but are also able to offer meaningful, natural language explanations of otherwise opaque classifier outcomes.

Submitted 24 July, 2020; originally announced August 2020.

arXiv:2005.14080 [pdf, other]

doi 10.1016/j.future.2019.11.042

Parallelizing Machine Learning as a Service for the End-User

Authors: Daniela Loreti, Marco Lippi, Paolo Torroni

Abstract: As ML applications are becoming ever more pervasive, fully-trained systems are made increasingly available to a wide public, allowing end-users to submit queries with their own data, and to efficiently retrieve results. With increasingly sophisticated such services, a new challenge is how to scale up to evergrowing user bases. In this paper, we present a distributed architecture that could be exploited to parallelize a typical ML system pipeline. We propose a case study consisting of a text mining service and discuss how the method can be generalized to many similar applications. We demonstrate the significance of the computational gain boosted by the distributed architecture by way of an extensive experimental evaluation.

Submitted 29 May, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

Journal ref: Future Generation Computer Systems 105 (2020) 275-286

arXiv:2003.08974 [pdf, other]

Latent Space Roadmap for Visual Action Planning of Deformable and Rigid Object Manipulation

Authors: Martina Lippi, Petra Poklukar, Michael C. Welle, Anastasiia Varava, Hang Yin, Alessandro Marino, Danica Kragic

Abstract: We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces such as manipulation of deformable objects. Planning is performed in a low-dimensional latent state space that embeds images. We define and implement a Latent Space Roadmap (LSR) which is a graph-based structure that globally captures the latent system dynamics. Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them. We show the effectiveness of the method on a simulated box stacking task as well as a T-shirt folding task performed with a real robot.

Submitted 19 March, 2020; originally announced March 2020.

Comments: Project website: https://visual-action-planning.github.io/lsr/

arXiv:1905.09103 [pdf]

doi 10.3389/fdata.2019.00052

Neural-Symbolic Argumentation Mining: an Argument in Favor of Deep Learning and Reasoning

Authors: Andrea Galassi, Kristian Kersting, Marco Lippi, Xiaoting Shao, Paolo Torroni

Abstract: Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap toward performing advanced reasoning tasks. In this position paper, we posit that neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.

Submitted 28 January, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

Journal ref: Frontiers in Big Data 2 (2020) 52

arXiv:1902.02181 [pdf]

doi 10.1109/TNNLS.2020.3019893

Attention in Natural Language Processing

Authors: Andrea Galassi, Marco Lippi, Paolo Torroni

Abstract: Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.

Submitted 11 October, 2021; v1 submitted 4 February, 2019; originally announced February 2019.

Comments: 18 pages, 8 figures

MSC Class: 68T50; 68T05; 68T07 ACM Class: I.2; I.7

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, vol 32, n 10, pp 4291-4308, 2021

arXiv:1809.08145 [pdf, other]

Predicting the Usefulness of Amazon Reviews Using Off-The-Shelf Argumentation Mining

Authors: Marco Passon, Marco Lippi, Giuseppe Serra, Carlo Tasso

Abstract: Internet users generate content at unprecedented rates. Building intelligent systems capable of discriminating useful content within this ocean of information is thus becoming an urgent need. In this paper, we aim to predict the usefulness of Amazon reviews, and to do this we exploit features coming from an off-the-shelf argumentation mining system. We argue that the usefulness of a review, in fact, is strictly related to its argumentative content, whereas the use of an already trained system avoids the costly need of relabeling a novel dataset. Results obtained on a large publicly available corpus support this hypothesis.

Submitted 21 September, 2018; originally announced September 2018.

Comments: 5 pages, 1 figure

arXiv:1805.01217 [pdf, other]

doi 10.1007/s10506-019-09243-2

CLAUDETTE: an Automated Detector of Potentially Unfair Clauses in Online Terms of Service

Authors: Marco Lippi, Przemyslaw Palka, Giuseppe Contissa, Francesca Lagioia, Hans-Wolfgang Micklitz, Giovanni Sartor, Paolo Torroni

Abstract: Terms of service of on-line platforms too often contain clauses that are potentially unfair to the consumer. We present an experimental study where machine learning is employed to automatically detect such potentially unfair clauses. Results show that the proposed system could provide a valuable tool for lawyers and consumers alike.

Submitted 18 February, 2019; v1 submitted 3 May, 2018; originally announced May 2018.

Journal ref: Artif. Intell. Law 27 (2019) 117-139

arXiv:1804.04087 [pdf, other]

doi 10.1109/TNNLS.2019.2890970

Natural Language Statistical Features of LSTM-generated Texts

Authors: Marco Lippi, Marcelo A Montemurro, Mirko Degli Esposti, Giampaolo Cristadoro

Abstract: Long Short-Term Memory (LSTM) networks have recently shown remarkable performance in several tasks dealing with natural language generation, such as image captioning or poetry composition. Yet, only a few works have analyzed text generated by LSTMs in order to quantitatively evaluate to what extent such artificial texts resemble those generated by humans. We compared the statistical structure of LSTM-generated language to that of written natural language, and to those produced by Markov models of various orders. In particular, we characterized the statistical structure of language by assessing word-frequency statistics, long-range correlations, and entropy measures. Our main finding is that while both LSTM and Markov-generated texts can exhibit features similar to real ones in their word-frequency statistics and entropy measures, LSTM-texts are shown to reproduce long-range correlations at scales comparable to those found in natural language. Moreover, for LSTM networks a temperature-like parameter controlling the generation process shows an optimal value, for which the produced texts are closest to real language, consistent across all the different statistical features investigated.

Submitted 15 April, 2019; v1 submitted 10 April, 2018; originally announced April 2018.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2019

arXiv:1408.2478 [pdf, other]

Learning to see like children: proof of concept

Authors: Marco Gori, Marco Lippi, Marco Maggini, Stefano Melacci

Abstract: In the last few years we have seen a growing interest in machine learning approaches to computer vision and, especially, to semantic labeling. Nowadays state-of-the-art systems use deep learning on millions of labeled images with very successful results on benchmarks, though it is unrealistic to expect similar results in unrestricted visual environments. Most learning schemes essentially ignore the inherent sequential structure of videos: this might be a critical issue, since any visual recognition process is remarkably more complex when shuffling video frames. Based on this remark, we propose a re-foundation of the communication protocol between visual agents and the environment, which is referred to as learning to see like children. As in human interaction, visual concepts are acquired by the agents solely by processing their own visual stream along with human supervisions on selected pixels. We give a proof of concept that remarkable semantic labeling can emerge within this protocol by using only a few supervised examples. This is made possible by exploiting a constraint of motion coherent labeling that virtually offers tons of supervisions. Additional visual constraints, including those associated with object supervisions, are used within the context of learning from constraints. The framework is extended in the direction of lifelong learning, so that our visual agents live in their own visual environment without distinguishing learning and test set. Learning takes place in deep architectures under a progressive developmental scheme.
In order to evaluate our Developmental Visual Agents (DVAs), in addition to classic benchmarks, we open the doors of our lab, allowing people to evaluate DVAs by crowd-sourcing. Such an assessment mechanism might result in a paradigm shift in methodologies and algorithms for computer vision, encouraging truly novel solutions within the proposed framework.

Submitted 11 August, 2014; originally announced August 2014.

Showing 1–27 of 27 results for author: Lippi, M