Showing 1–6 of 6 results for author: Sergeev, A

  1. arXiv:2401.10777  [pdf]

    cs.CV

    Determination of efficiency indicators of the stand for intelligent control of manual operations in industrial production

    Authors: Anton Sergeev, Victor Minchenkov, Aleksei Soldatov

    Abstract: Systems of intelligent control of manual operations in industrial production are being implemented in many industries nowadays. Such systems use high-resolution cameras and computer vision algorithms to automatically track the operator's manipulations and prevent technological errors in the assembly process. At the same time compliance with safety regulations in the workspace is monitored. As a re…

    Submitted 19 January, 2024; originally announced January 2024.

  2. arXiv:1909.11150  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.DC physics.comp-ph stat.ML

    Exascale Deep Learning for Scientific Inverse Problems

    Authors: Nouamane Laanait, Joshua Romero, Junqi Yin, M. Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson

    Abstract: We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: 13 pages, 9 figures. Under review by the Systems and Machine Learning (SysML) Conference (SysML '20)

  3. arXiv:1905.04035  [pdf, other]

    cs.LG cs.CL cs.DC

    Densifying Assumed-sparse Tensors: Improving Memory Efficiency and MPI Collective Performance during Tensor Accumulation for Parallelized Training of Neural Machine Translation Models

    Authors: Derya Cavdar, Valeriu Codreanu, Can Karakus, John A. Lockman III, Damian Podareanu, Vikram Saletore, Alexander Sergeev, Don D. Smith II, Victor Suthichai, Quy Ta, Srinivas Varadharajan, Lucas A. Wilson, Rengan Xu, Pei Yang

    Abstract: Neural machine translation - using neural networks to translate human language - is an area of active research exploring new neuron types and network topologies with the goal of dramatically improving machine translation performance. Current state-of-the-art approaches, such as the multi-head attention-based transformer, require very large translation corpuses and many epochs to produce models of…

    Submitted 10 May, 2019; originally announced May 2019.

    Comments: 18 pages, 10 figures, accepted at the 2019 International Supercomputing Conference

  4. arXiv:1807.03247  [pdf, other]

    cs.CV cs.LG stat.ML

    An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

    Authors: Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev, Jason Yosinski

    Abstract: Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in…

    Submitted 3 December, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: Published in NeurIPS 2018

  5. arXiv:1804.00551  [pdf, other]

    cs.CL cs.LG

    The Training of Neuromodels for Machine Comprehension of Text. Brain2Text Algorithm

    Authors: A. Artemov, A. Sergeev, A. Khasenevich, A. Yuzhakov, M. Chugunov

    Abstract: Nowadays, the Internet represents a vast informational space, growing exponentially and the problem of search for relevant data becomes essential as never before. The algorithm proposed in the article allows to perform natural language queries on content of the document and get comprehensive meaningful answers. The problem is partially solved for English as SQuAD contains enough data to learn on,…

    Submitted 30 March, 2018; originally announced April 2018.

    Comments: 5 pages, 2 figures, 6 tables

    ACM Class: I.2.6; I.2.7

  6. arXiv:1802.05799  [pdf, other]

    cs.LG stat.ML

    Horovod: fast and easy distributed deep learning in TensorFlow

    Authors: Alexander Sergeev, Mike Del Balso

    Abstract: Training modern deep learning models requires large amounts of computation, often provided by GPUs. Scaling computation from one GPU to many can enable much faster training and research progress but entails two complications. First, the training library must support inter-GPU communication. Depending on the particular methods employed, this communication may entail anywhere from negligible to sign…

    Submitted 20 February, 2018; v1 submitted 15 February, 2018; originally announced February 2018.