Showing 1–28 of 28 results for author: Foresti, G L

  1. U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts

    Authors: Silvia Zottin, Axel De Nardin, Emanuela Colombi, Claudio Piciarelli, Filippo Pavan, Gian Luca Foresti

    Abstract: Document Layout Analysis, which is the task of identifying different semantic regions inside of a document page, is a subject of great interest for both computer scientists and humanities scholars as it represents a fundamental step towards further analysis tasks for the former and a powerful tool to improve and facilitate the study of the documents for the latter. However, many of the works curre…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Neural Comput & Applic (2024)

  2. arXiv:2308.00155  [pdf, other]

    cs.CV cs.LG

    Federated Learning for Data and Model Heterogeneity in Medical Imaging

    Authors: Hussain Ahmad Madni, Rao Muhammad Umer, Gian Luca Foresti

    Abstract: Federated Learning (FL) is an evolving machine learning method in which multiple clients participate in collaborative learning without sharing their data with each other and the central server. In real-world applications such as hospitals and industries, FL counters the challenges of data heterogeneity and model heterogeneity as an inevitable part of the collaborative training. More specifically,…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Published in ICIAP2023 Workshop on Federated Learning in Medical Imaging and Vision

  3. Efficient few-shot learning for pixel-precise handwritten document layout analysis

    Authors: Axel De Nardin, Silvia Zottin, Matteo Paier, Gian Luca Foresti, Emanuela Colombi, Claudio Piciarelli

    Abstract: Layout analysis is a task of utmost importance in ancient handwritten document analysis and represents a fundamental step toward the simplification of subsequent tasks such as optical character recognition and automatic transcription. However, many of the approaches adopted to solve this problem rely on a fully supervised learning paradigm. While these systems achieve very good performance on t…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted for publication at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023

  4. Masked Transformer for image Anomaly Localization

    Authors: Axel De Nardin, Pankaj Mishra, Gian Luca Foresti, Claudio Piciarelli

    Abstract: Image anomaly detection consists in detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for various real-life applications like biomedical image analysis, visual inspection in industrial production, banking, traffic management, etc. Most of the current deep learning approaches rely on image reconstruction…

    Submitted 27 October, 2022; originally announced October 2022.

    Journal ref: Int J Neural Syst. 2022;32(7):2250030

  5. arXiv:2203.14031  [pdf, other]

    cs.CV

    Medicinal Boxes Recognition on a Deep Transfer Learning Augmented Reality Mobile Application

    Authors: Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti, Marco Raoul Marini, Alessio Mecca, Daniele Pannone

    Abstract: Taking medicines is fundamental to curing illnesses. However, studies have shown that it can be hard for patients to remember the correct posology. Worse, a wrong dosage generally causes the disease to worsen. Although all relevant instructions for a medicine are summarized in the corresponding patient information leaflet, the latter is generally difficult to navigate and unders…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: 12 pages, 7 figures

  6. arXiv:2203.10009  [pdf, other]

    cs.LG cs.CV

    Analyzing EEG Data with Machine and Deep Learning: A Benchmark

    Authors: Danilo Avola, Marco Cascio, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti, Marco Raoul Marini, Daniele Pannone

    Abstract: Nowadays, machine and deep learning techniques are widely used in different areas, ranging from economics to biology. In general, these techniques can be used in two ways: trying to adapt well-known models and architectures to the available data, or designing custom architectures. In both cases, to speed up the research process, it is useful to know which types of models work best for a specific pr…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: conference, 11 pages, 5 figures

  7. Human Silhouette and Skeleton Video Synthesis through Wi-Fi signals

    Authors: Danilo Avola, Marco Cascio, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti

    Abstract: The increasing availability of wireless access points (APs) is leading towards human sensing applications based on Wi-Fi signals as support or alternative tools to the widespread visual sensors, since the signals make it possible to address well-known vision-related problems such as illumination changes or occlusions. Indeed, using image synthesis techniques to translate radio frequencies to the visible spe…

    Submitted 11 March, 2022; originally announced March 2022.

    Journal ref: International Journal of Neural Systems, 2022, 2250015

  8. SIRe-Networks: Convolutional Neural Networks Architectural Extension for Information Preservation via Skip/Residual Connections and Interlaced Auto-Encoders

    Authors: Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti

    Abstract: Improving existing neural network architectures can involve several design choices such as manipulating the loss functions, employing a diverse learning strategy, exploiting gradient evolution at training time, optimizing the network hyper-parameters, or increasing the architecture depth. The latter approach is a straightforward solution, since it directly enhances the representation capabilities…

    Submitted 26 October, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Journal ref: Neural Networks 153 (2022): 386-398

  9. 3D Hand Pose and Shape Estimation from RGB Images for Keypoint-Based Hand Gesture Recognition

    Authors: Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti, Adriano Fragomeni, Daniele Pannone

    Abstract: Estimating the 3D pose of a hand from a 2D image is a well-studied problem and a requirement for several real-life applications such as virtual reality, augmented reality, and hand gesture recognition. Currently, reasonable estimations can be computed from single RGB images, especially when a multi-task learning approach is used to force the system to consider the shape of the hand when its pose i…

    Submitted 9 May, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

  10. Drone swarm patrolling with uneven coverage requirements

    Authors: Claudio Piciarelli, Gian Luca Foresti

    Abstract: Swarms of drones are increasingly used in many practical scenarios, such as surveillance, environmental monitoring, search and rescue in hard-to-access areas, etc. While a single drone can be guided by a human operator, the deployment of a swarm of multiple drones requires proper algorithms for automatic task-oriented control. In this paper, we focus on visual coverage optimization with…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: This paper has been published in IET Computer Vision. Please cite it accordingly (see journal reference below)

    Journal ref: IET Computer Vision, 14: 452-461 (2020)

  11. VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

    Authors: Pankaj Mishra, Riccardo Verk, Daniele Fornasier, Claudio Piciarelli, Gian Luca Foresti

    Abstract: We present a transformer-based image anomaly detection and localization network. Our proposed model is a combination of a reconstruction-based approach and patch embedding. The use of transformer networks helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. In addition, we also publish BT…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 6 Pages, 4 images, conference published paper

    Report number: KD-003638

    Journal ref: IEEE 30th International Symposium on Industrial Electronics (ISIE), 2021

  12. arXiv:2012.02478  [pdf, other]

    cs.CV cs.LG

    Is It a Plausible Colour? UCapsNet for Image Colourisation

    Authors: Rita Pucci, Christian Micheloni, Gian Luca Foresti, Niki Martinel

    Abstract: Human beings can imagine the colours of a grayscale image with no particular effort thanks to their ability of semantic feature extraction. Can an autonomous system achieve that? Can it hallucinate plausible and vibrant colours? This is the colourisation problem. Different from existing works relying on convolutional neural network models pre-trained with supervision, we cast such colourisation pr…

    Submitted 4 December, 2020; originally announced December 2020.

  13. arXiv:2011.06288  [pdf]

    cs.CV cs.AI

    Image Anomaly Detection by Aggregating Deep Pyramidal Representations

    Authors: Pankaj Mishra, Claudio Piciarelli, Gian Luca Foresti

    Abstract: Anomaly detection consists in identifying, within a dataset, those samples that significantly differ from the majority of the data, representing the normal class. It has many practical applications, ranging from defective product detection in industrial systems to medical imaging. This paper focuses on image anomaly detection using a deep neural network with multiple pyramid levels to analyze…

    Submitted 12 November, 2020; originally announced November 2020.

    Comments: Published in First International Conference of Industrial Machine Learning ICPR2020

  14. arXiv:2009.04809  [pdf, other]

    eess.IV cs.CV

    Deep Iterative Residual Convolutional Network for Single Image Super-Resolution

    Authors: Rao Muhammad Umer, Gian Luca Foresti, Christian Micheloni

    Abstract: Deep convolutional neural networks (CNNs) have recently achieved great success for the single image super-resolution (SISR) task due to their powerful feature representation capabilities. The most recent deep learning based SISR methods focus on designing deeper / wider models to learn the non-linear mapping between low-resolution (LR) inputs and high-resolution (HR) outputs. These existing SR methods…

    Submitted 7 September, 2020; originally announced September 2020.

    Comments: To appear in the proceedings of the 25th IEEE International Conference on Pattern Recognition (ICPR). arXiv admin note: text overlap with arXiv:2005.00953, arXiv:2009.03693

  15. arXiv:2005.00953  [pdf, other]

    eess.IV cs.CV

    Deep Generative Adversarial Residual Convolutional Networks for Real-World Super-Resolution

    Authors: Rao Muhammad Umer, Gian Luca Foresti, Christian Micheloni

    Abstract: Most current deep learning based single image super-resolution (SISR) methods focus on designing deeper / wider models to learn the non-linear mapping between low-resolution (LR) inputs and the high-resolution (HR) outputs from a large number of paired (LR/HR) training data. They usually assume that the LR image is a bicubic down-sampled version of the HR image. However, such degradati…

    Submitted 2 May, 2020; originally announced May 2020.

  16. arXiv:2004.06154  [pdf, other]

    cs.CV cs.RO

    An Efficient UAV-based Artificial Intelligence Framework for Real-Time Visual Tasks

    Authors: Enkhtogtokh Togootogtokh, Christian Micheloni, Gian Luca Foresti, Niki Martinel

    Abstract: Modern Unmanned Aerial Vehicles equipped with state of the art artificial intelligence (AI) technologies are opening up a plethora of novel and interesting applications. While this field has been strongly influenced by the recent AI breakthroughs, most of the provided solutions either entirely rely on commercial software or provide a weak integration interface which denies the development of ad…

    Submitted 13 April, 2020; originally announced April 2020.

  17. arXiv:1910.04856  [pdf, other]

    cs.CV cs.LG stat.ML

    Video-Based Convolutional Attention for Person Re-Identification

    Authors: Marco Zamprogno, Marco Passon, Niki Martinel, Giuseppe Serra, Giuseppe Lancioni, Christian Micheloni, Carlo Tasso, Gian Luca Foresti

    Abstract: In this paper we consider the problem of video-based person re-identification, which is the task of associating videos of the same person captured by different and non-overlapping cameras. We propose a Siamese framework in which video frames of the person to re-identify and of the candidate one are processed by two identical networks which produce a similarity score. We introduce an attention mech…

    Submitted 26 September, 2019; originally announced October 2019.

    Comments: 11 pages, 2 figures. Accepted by ICIAP2019, 20th International Conference on IMAGE ANALYSIS AND PROCESSING, Trento, Italy, 9-13 September, 2019

  18. Visual Tracking by means of Deep Reinforcement Learning and an Expert Demonstrator

    Authors: Matteo Dunnhofer, Niki Martinel, Gian Luca Foresti, Christian Micheloni

    Abstract: In the last decade many different algorithms have been proposed to track a generic object in videos. Their execution on recent large-scale video datasets can produce a great amount of various tracking behaviours. New trends in Reinforcement Learning showed that demonstrations of an expert agent can be efficiently used to speed up the process of policy learning. Taking inspiration from such works a…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: in 2019 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) - VOT2019 Challenge Workshop

  19. Deep Super-Resolution Network for Single Image Super-Resolution with Realistic Degradations

    Authors: Rao Muhammad Umer, Gian Luca Foresti, Christian Micheloni

    Abstract: Single Image Super-Resolution (SISR) aims to generate a high-resolution (HR) image from a given low-resolution (LR) image. Most existing convolutional neural network (CNN) based SISR methods assume that an LR image is only a bicubically down-sampled version of an HR image. However, the true degradation (i.e. the LR image is a bicubically downsampled, blurred and noisy version of…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: 7 pages

    Journal ref: 13th International Conference on Distributed Smart Cameras (ICDSC 2019)

  20. arXiv:1909.02755  [pdf]

    cs.CV

    Image anomaly detection with capsule networks and imbalanced datasets

    Authors: Claudio Piciarelli, Pankaj Mishra, Gian Luca Foresti

    Abstract: Image anomaly detection consists in finding images with anomalous, unusual patterns with respect to a set of normal data. Anomaly detection can be applied to several fields and has numerous practical applications, e.g. in industrial inspection, medical imaging, security enforcement, etc. However, anomaly detection techniques often still rely on traditional approaches such as one-class Support Vec…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: Published in conference ICIAP 2019

    Journal ref: [978-3-030-30641-0, ICIAP 2019, Part I, LNCS 11751, paper approval (489497_1_En, Chapter 23)]

  21. Deep Temporal Analysis for Non-Acted Body Affect Recognition

    Authors: Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti, Cristiano Massaroni

    Abstract: Affective computing is a field of great interest in many computer vision applications, including video surveillance, behaviour analysis, and human-robot interaction. Most of the existing literature has addressed this field by analysing different sets of face features. However, in the last decade, several studies have shown how body movements can play a key role even in emotion recognition. The maj…

    Submitted 23 July, 2019; originally announced July 2019.

    Journal ref: IEEE Transactions on Affective Computing 2020

  22. arXiv:1812.01521  [pdf, other]

    cs.SD eess.AS

    Localization and Tracking of an Acoustic Source using a Diagonal Unloading Beamforming and a Kalman Filter

    Authors: Daniele Salvati, Carlo Drioli, Gian Luca Foresti

    Abstract: We present the signal processing framework and some results for the IEEE AASP challenge on acoustic source localization and tracking (LOCATA). The system is designed for the direction of arrival (DOA) estimation in single-source scenarios. The proposed framework consists of four main building blocks: pre-processing, voice activity detection (VAD), localization, tracking. The signal pre-processing…

    Submitted 4 December, 2018; originally announced December 2018.

    Comments: In Proceedings of the LOCATA Challenge Workshop - a satellite event of IWAENC 2018 (arXiv:1811.08482)

    Report number: LOCATAchallenge/2018/07

  23. Exploiting Recurrent Neural Networks and Leap Motion Controller for Sign Language and Semaphoric Gesture Recognition

    Authors: Danilo Avola, Marco Bernardi, Luigi Cinque, Gian Luca Foresti, Cristiano Massaroni

    Abstract: In human interactions, hands are a powerful way of expressing information that, in some cases, can be used as a valid substitute for voice, as it happens in Sign Language. Hand gesture recognition has always been an interesting topic in the areas of computer vision and multimedia. These gestures can be represented as sets of feature vectors that change over time. Recurrent Neural Networks (RNNs) a…

    Submitted 28 March, 2018; originally announced March 2018.

    Journal ref: IEEE Transactions on Multimedia 21 (2019) 234-245

  24. arXiv:1707.09173  [pdf, other]

    cs.CV

    Group Re-Identification via Unsupervised Transfer of Sparse Features Encoding

    Authors: Giuseppe Lisanti, Niki Martinel, Alberto Del Bimbo, Gian Luca Foresti

    Abstract: Person re-identification is best known as the problem of associating a single person that is observed from one or more disjoint cameras. The existing literature has mainly addressed such an issue, neglecting the fact that people usually move in groups, like in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that ca…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: This paper has been accepted for publication at ICCV 2017

  25. The UMCD Dataset

    Authors: Danilo Avola, Gian Luca Foresti, Niki Martinel, Daniele Pannone, Claudio Piciarelli

    Abstract: In recent years, the technological improvements of low-cost small-scale Unmanned Aerial Vehicles (UAVs) have been promoting their ever-increasing use in different tasks. In particular, small-scale UAVs are useful in all those low-altitude tasks in which common UAVs cannot be adopted, such as recurrent comprehensive view of wide environments, frequent monitoring of military areas, real-tim…

    Submitted 5 April, 2017; originally announced April 2017.

    Comments: 3 pages, 5 figures

    Journal ref: IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018

  26. arXiv:1612.06543  [pdf, other]

    cs.CV

    Wide-Slice Residual Networks for Food Recognition

    Authors: Niki Martinel, Gian Luca Foresti, Christian Micheloni

    Abstract: Food diary applications represent a tantalizing market. Such applications, based on food image recognition, have opened up new challenges for computer vision and pattern recognition algorithms. Recent works in the field are focusing either on hand-crafted representations or on learning these by exploiting deep neural networks. Despite the success of the latter family of works, these generally exploit…

    Submitted 20 December, 2016; originally announced December 2016.

  27. Diagonal Unloading Beamforming for Source Localization

    Authors: Daniele Salvati, Carlo Drioli, Gian Luca Foresti

    Abstract: In sensor array beamforming methods, a class of algorithms commonly used to estimate the position of a radiating source, the diagonal loading of the beamformer covariance matrix is generally used to improve computational accuracy and localization robustness. This paper proposes a diagonal unloading (DU) method which extends the conventional response power beamforming method by imposing an addition…

    Submitted 3 May, 2016; originally announced May 2016.

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 25, Issue 3, Pages 609-622 (2018)

  28. Exploiting a Geometrically Sampled Grid in the SRP-PHAT for Localization Improvement and Power Response Sensitivity Analysis

    Authors: Daniele Salvati, Carlo Drioli, Gian Luca Foresti

    Abstract: The steered response power phase transform (SRP-PHAT) is a beamforming method that is very attractive in acoustic localization applications due to its robustness in reverberant environments. This paper presents a spatial grid design procedure, called the geometrically sampled grid (GSG), which aims at computing the spatial grid by taking into account the discrete sampling of time difference of arrival (TDO…

    Submitted 7 March, 2018; v1 submitted 10 December, 2015; originally announced December 2015.

    Journal ref: Journal of the Acoustical Society of America, Volume 141, Issue 1, Pages 586-601 (2017)