
Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models

Authors: Daniela Occhipinti, Michele Marchi, Irene Mondella, Huiyuan Lai, Felice Dell'Orletta, Malvina Nissim, Marco Guerini

Abstract: Automatic methods for generating and gathering linguistic data have proven effective for fine-tuning Language Models (LMs) in languages less resourced than English. Still, while there has been emphasis on data quantity, less attention has been given to its quality. In this work, we investigate the impact of human intervention on machine-generated data when fine-tuning dialogical models. In particular, we study (1) whether post-edited dialogues exhibit higher perceived quality compared to the originals that were automatically generated; (2) whether fine-tuning with post-edited dialogues results in noticeable differences in the generated outputs; and (3) whether post-edited dialogues influence the outcomes when considering the parameter size of the LMs. To this end we created HED-IT, a large-scale dataset where machine-generated dialogues are paired with the version post-edited by humans. Using both the edited and unedited portions of HED-IT, we fine-tuned three different sizes of an LM. Results from both human and automatic evaluation show that the different quality of training data is clearly perceived and it has an impact also on the models trained on such data. Additionally, our findings indicate that larger models are less sensitive to data quality, whereas this has a crucial impact on smaller models. These results enhance our comprehension of the impact of human intervention on training data in the development of high-quality LMs.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2405.14061 [pdf, other]

Meanings and Feelings of Large Language Models: Observability of Latent States in Generative AI

Authors: Tian Yu Liu, Stefano Soatto, Matteo Marchi, Pratik Chaudhari, Paulo Tabuada

Abstract: We tackle the question of whether Large Language Models (LLMs), viewed as dynamical systems with state evolving in the embedding space of symbolic tokens, are observable. That is, whether there exist multiple 'mental' state trajectories that yield the same sequence of generated tokens, or sequences that belong to the same Nerode equivalence class ('meaning'). If not observable, mental state trajectories ('experiences') evoked by an input ('perception') or by feedback from the model's own state ('thoughts') could remain self-contained and evolve unbeknown to the user while being potentially accessible to the model provider. Such "self-contained experiences evoked by perception or thought" are akin to what the American Psychological Association (APA) defines as 'feelings'. Beyond the lexical curiosity, we show that current LLMs implemented by autoregressive Transformers cannot have 'feelings' according to this definition: The set of state trajectories indistinguishable from the tokenized output is a singleton. But if there are 'system prompts' not visible to the user, then the set of indistinguishable trajectories becomes non-trivial, and there can be multiple state trajectories that yield the same verbalized output. We prove these claims analytically, and show examples of modifications to standard LLMs that engender such 'feelings.'
Our analysis sheds light on possible designs that would enable a model to perform non-trivial computation that is not visible to the user, as well as on controls that the provider of services using the model could take to prevent unintended behavior.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2404.02325 [pdf, ps, other]

Heat Death of Generative Models in Closed-Loop Learning

Authors: Matteo Marchi, Stefano Soatto, Pratik Chaudhari, Paulo Tabuada

Abstract: Improvement and adoption of generative machine learning models is rapidly accelerating, as exemplified by the popularity of LLMs (Large Language Models) for text, and diffusion models for image generation. As generative models become widespread, data they generate is incorporated into shared content through the public web. This opens the question of what happens when data generated by a model is fed back to the model in subsequent training campaigns. This is a question about the stability of the training process, whether the distribution of publicly accessible content, which we refer to as "knowledge", remains stable or collapses. Small scale empirical experiments reported in the literature show that this closed-loop training process is prone to degenerating. Models may start producing gibberish data, or sample from only a small subset of the desired data distribution (a phenomenon referred to as mode collapse). So far there has been only limited theoretical understanding of this process, in part due to the complexity of the deep networks underlying these generative models. The aim of this paper is to provide insights into this process (that we refer to as "generative closed-loop learning") by studying the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset. The sampling of many of these models can be controlled via a "temperature" parameter.
Using dynamical systems tools, we show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to asymptotically degenerate. In fact, either the generative distribution collapses to a small set of outputs, or becomes uniform over a large set of outputs.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2309.01612 [pdf, other]

On the Query Strategies for Efficient Online Active Distillation

Authors: Michele Boldo, Enrico Martini, Mirco De Marchi, Stefano Aldegheri, Nicola Bombieri

Abstract: Deep Learning (DL) requires lots of time and data, resulting in high computational demands. Recently, researchers employ Active Learning (AL) and online distillation to enhance training efficiency and real-time model adaptation. This paper evaluates a set of query strategies to achieve the best training results. It focuses on Human Pose Estimation (HPE) applications, assessing the impact of selected frames during training using two approaches: a classical offline method and an online evaluation through a continual learning approach employing knowledge distillation, on a popular state-of-the-art HPE dataset. The paper demonstrates the possibility of training lightweight models at the edge, adapting them effectively to new contexts in real-time.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2304.06793 [pdf, other]

doi 10.1038/s41467-024-47811-6

Speck: A Smart event-based Vision Sensor with a low latency 327K Neuron Convolutional Neuronal Network Processing Pipeline

Authors: Ole Richter, Yannan Xing, Michele De Marchi, Carsten Nielsen, Merkourios Katsimpris, Roberto Cattaneo, Yudi Ren, Yalun Hu, Qian Liu, Sadique Sheik, Tugba Demirci, Ning Qiao

Abstract: Edge computing solutions that enable the extraction of high-level information from a variety of sensors are in increasingly high demand. This is due to the increasing number of smart devices that require sensory processing for their application on the edge. To tackle this problem, we present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip. By combining both sensor and processing on a single die, we can lower unit production costs significantly. Moreover, the simple end-to-end nature of the SoC facilitates small stand-alone applications as well as functioning as an edge node in larger systems. The event-driven nature of the vision sensor delivers high-speed signals in a sparse data stream. This is reflected in the processing pipeline, which focuses on optimising highly sparse computation and minimising latency for 9 sCNN layers to 3.36μs for an incoming event. Overall, this results in an extremely low-latency visual processing pipeline deployed on a small form factor with a low energy budget and sensor cost. We present the asynchronous architecture, the individual blocks, and the sCNN processing principle and benchmark against other sCNN capable processors.

Submitted 27 May, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

Comments: accepted and presented at 28th IEEE International Symposium On Asynchronous Circuits and Systems (ASYNC) 2023

Journal ref: IEEE ASYNC 2023

arXiv:2209.15248 [pdf]

doi 10.1007/978-3-031-17439-1_17

Hyperspectral and LiDAR data for the prediction via machine learning of tree species, volume and biomass: a possible contribution for updating forest management plans

Authors: Daniele Michelini, Michele Dalponte, Angelo Carriero, Erico Kutchart, Salvatore Eugenio Pappalardo, Massimo De Marchi, Francesco Pirotti

Abstract: This work intends to lay the foundations for identifying the prevailing forest types and the delineation of forest units within private forest inventories in the Autonomous Province of Trento (PAT), using currently available remote sensing solutions. In particular, data from LiDAR and hyperspectral surveys of 2014 made available by PAT were acquired and processed. Such studies are very important in the context of forest management scenarios. The method includes defining tree species ground-truth by outlining single tree crowns with polygons and labeling them. Successively two supervised machine learning classifiers, K-Nearest Neighborhood and Support Vector Machine (SVM) were used. The results show that, by setting specific hyperparameters, the SVM methodology gave the best results in classification of tree species. Biomass was estimated using canopy parameters and the Jucker equation for the above ground biomass (AGB) and that of Scrinzi for the tariff volume. Predicted values were compared with 11 field plots of fixed radius where volume and biomass were field-estimated in 2017. Results show significant coefficients of correlation: 0.94 for stem volume and 0.90 for total aboveground tree biomass.

Submitted 30 September, 2022; originally announced September 2022.

arXiv:2105.13278 [pdf, other]

One Step Preference Elicitation in Multi-Objective Bayesian Optimization

Authors: Juan Ungredda, Mariapia Marchi, Teresa Montrone, Juergen Branke

Abstract: We consider a multi-objective optimization problem with objective functions that are expensive to evaluate. The decision maker (DM) has unknown preferences, and so the standard approach is to generate an approximation of the Pareto front and let the DM choose from the generated non-dominated designs. However, especially for expensive to evaluate problems where the number of designs that can be evaluated is very limited, the true best solution according to the DM's unknown preferences is unlikely to be among the small set of non-dominated solutions found, even if these solutions are truly Pareto optimal. We address this issue by using a multi-objective Bayesian optimization algorithm and allowing the DM to select a preferred solution from a predicted continuous Pareto front just once before the end of the algorithm rather than selecting a solution after the end. This allows the algorithm to understand the DM's preferences and make a final attempt to identify a more preferred solution. We demonstrate the idea using ParEGO, and show empirically that the found solutions are significantly better in terms of true DM preferences than if the DM would simply pick a solution at the end.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2011.05547 [pdf, other]

Identifying Properties of Real-World Optimisation Problems through a Questionnaire

Authors: Koen van der Blom, Timo M. Deist, Vanessa Volz, Mariapia Marchi, Yusuke Nojima, Boris Naujoks, Akira Oyama, Tea Tušar

Abstract: Optimisation algorithms are commonly compared on benchmarks to get insight into performance differences. However, it is not clear how closely benchmarks match the properties of real-world problems because these properties are largely unknown. This work investigates the properties of real-world problems through a questionnaire to enable the design of future benchmark problems that more closely resemble those found in the real world. The results, while not representative as they are based on only 45 responses, indicate that many problems possess at least one of the following properties: they are constrained, deterministic, have only continuous variables, require substantial computation times for both the objectives and the constraints, or allow a limited number of evaluations. Properties like known optimal solutions and analytical gradients are rarely available, limiting the options in guiding the optimisation process. These are all important aspects to consider when designing realistic benchmark problems. At the same time, the design of realistic benchmarks is difficult, because objective functions are often reported to be black-box and many problem properties are unknown. To further improve the understanding of real-world problems, readers working on a real-world optimisation problem are encouraged to fill out the questionnaire: https://tinyurl.com/opt-survey

Submitted 14 July, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

Comments: Book Chapter (Under review, revised version)

arXiv:2004.06395 [pdf, other]

doi 10.1145/3377929.3389974

Towards Realistic Optimization Benchmarks: A Questionnaire on the Properties of Real-World Problems

Authors: Koen van der Blom, Timo M. Deist, Tea Tušar, Mariapia Marchi, Yusuke Nojima, Akira Oyama, Vanessa Volz, Boris Naujoks

Abstract: Benchmarks are a useful tool for empirical performance comparisons. However, one of the main shortcomings of existing benchmarks is that it remains largely unclear how they relate to real-world problems. What does an algorithm's performance on a benchmark say about its potential on a specific real-world problem? This work aims to identify properties of real-world problems through a questionnaire on real-world single-, multi-, and many-objective optimization problems. Based on initial responses, a few challenges that have to be considered in the design of realistic benchmarks can already be identified. A key point for future work is to gather more responses to the questionnaire to allow an analysis of common combinations of properties. In turn, such common combinations can then be included in improved benchmark suites. To gather more data, the reader is invited to participate in the questionnaire at: https://tinyurl.com/opt-survey

Submitted 14 April, 2020; originally announced April 2020.

Comments: 2 pages, GECCO2020 Poster Paper

Showing 1–9 of 9 results for author: Marchi, M