Showing 1–9 of 9 results for author: Solovyev, R

  1. arXiv:2308.06981  [pdf, other]

    eess.AS cs.SD

    The Sound Demixing Challenge 2023 – Cinematic Demixing Track

    Authors: Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji

    Abstract: This paper summarizes the cinematic demixing (CDX) track of the Sound Demixing Challenge 2023 (SDX'23). We provide a comprehensive summary of the challenge setup, detailing the structure of the competition and the datasets used. In particular, we detail CDXDB23, a new hidden dataset constructed from real movies that was used to rank the submissions. The paper also offers insights into the most succes…

    Submitted 18 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Accepted for Transactions of the International Society for Music Information Retrieval

  2. arXiv:2308.06979  [pdf, other]

    eess.AS cs.SD

    The Sound Demixing Challenge 2023 – Music Demixing Track

    Authors: Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, Weihsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, et al. (2 additional authors not shown)

    Abstract: This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust music source separation (MSS), i.e., training MSS models in the presence of errors in the training data. We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce t…

    Submitted 19 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Published in Transactions of the International Society for Music Information Retrieval (https://transactions.ismir.net/articles/10.5334/tismir.171)

    Journal ref: Transactions of the International Society for Music Information Retrieval, 7(1), pp.63-84, 2024

  3. arXiv:2305.07489  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Benchmarks and leaderboards for sound demixing tasks

    Authors: Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva

    Abstract: Music demixing is the task of separating a single audio signal into its component tracks, such as drums, bass, and vocals, apart from the rest of the accompaniment. Separation of sources is useful for a range of areas, including entertainment and hearing aids. In this paper, we introduce two new benchmarks for the sound source separation tasks and compare popular models for sound demi…

    Submitted 7 May, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

  4. 3D Convolutional Neural Networks for Stalled Brain Capillary Detection

    Authors: Roman Solovyev, Alexandr A. Kalinin, Tatiana Gabruseva

    Abstract: Adequate blood supply is critical for normal brain function. Brain vasculature dysfunctions such as stalled blood flow in cerebral capillaries are associated with cognitive decline and pathogenesis in Alzheimer's disease. Recent advances in imaging technology enabled generation of high-quality 3D images that can be used to visualize stalled blood vessels. However, localization of stalled vessels i…

    Submitted 14 February, 2022; v1 submitted 4 April, 2021; originally announced April 2021.

    Journal ref: Computers in Biology and Medicine, 2022

  5. arXiv:2004.11482  [pdf, other]

    cs.CV eess.IV

    Roof material classification from aerial imagery

    Authors: Roman Solovyev

    Abstract: This paper describes an algorithm for classification of roof materials using aerial photographs. The main advantages of the algorithm are the proposed methods to improve prediction accuracy. The proposed methods include: a method of converting ImageNet weights of neural networks for use with multi-channel images; a special set of features for second-level models that are used in addition to specific predictions of n…

    Submitted 23 April, 2020; originally announced April 2020.

  6. Weighted boxes fusion: Ensembling boxes from different object detection models

    Authors: Roman Solovyev, Weimin Wang, Tatiana Gabruseva

    Abstract: In this work, we present a novel method for combining predictions of object detection models: weighted boxes fusion. Our algorithm utilizes confidence scores of all proposed bounding boxes to construct the averaged boxes. We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection tracks, achieving top results in these challenges. The source c…

    Submitted 6 February, 2021; v1 submitted 29 October, 2019; originally announced October 2019.

    Journal ref: Image and Vision Computing (2021): 104117

  7. arXiv:1908.02924  [pdf, other]

    eess.IV cs.CV

    Bayesian Feature Pyramid Networks for Automatic Multi-Label Segmentation of Chest X-rays and Assessment of Cardio-Thoratic Ratio

    Authors: Roman Solovyev, Iaroslav Melekhov, Timo Lesonen, Elias Vaattovaara, Osmo Tervonen, Aleksei Tiulpin

    Abstract: Cardiothoracic ratio (CTR) estimated from chest radiographs is a marker indicative of cardiomegaly, the presence of which is in the criteria for heart failure diagnosis. Existing methods for automatic assessment of CTR are driven by Deep Learning-based segmentation. However, these techniques produce only point estimates of CTR, whereas clinical decision making typically assumes the uncertainty. In this…

    Submitted 8 August, 2019; originally announced August 2019.

    Comments: Roman Solovyev and Iaroslav Melekhov contributed equally. Timo Lesonen and Elias Vaattovaara contributed equally

  8. arXiv:1810.02364  [pdf, other]

    cs.SD cs.HC eess.AS

    Deep Learning Approaches for Understanding Simple Speech Commands

    Authors: Roman A. Solovyev, Maxim Vakhrushev, Alexander Radionov, Vladimir Aliev, Alexey A. Shvets

    Abstract: Automatic classification of sound commands is becoming increasingly important, especially for mobile and embedded devices. Many of these devices contain both cameras and microphones, and companies that develop them would like to use the same technology for both of these classification tasks. One way of achieving this is to represent sound commands as images, and use convolutional neural networks w…

    Submitted 4 October, 2018; originally announced October 2018.

    Comments: 12 pages, 4 figures, 1 table

  9. Fixed-Point Convolutional Neural Network for Real-Time Video Processing in FPGA

    Authors: Roman Solovyev, Alexander Kustov, Dmitry Telpukhov, Vladimir Rukhlov, Alexandr Kalinin

    Abstract: Modern mobile neural networks with a reduced number of weights and parameters do a good job with image classification tasks, but even they may be too complex to be implemented in an FPGA for video processing tasks. The article proposes a neural network architecture for the practical task of recognizing images from a camera, which has several advantages in terms of speed. This is achieved by reducing…

    Submitted 3 December, 2020; v1 submitted 29 August, 2018; originally announced August 2018.

    Comments: 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus)