
Showing 1–23 of 23 results for author: Honari, S

  1. arXiv:2402.11036

    cs.CV cs.LG

    Occlusion Resilient 3D Human Pose Estimation

    Authors: Soumava Kumar Roy, Ilia Badanin, Sina Honari, Pascal Fua

    Abstract: Occlusions remain one of the key challenges in 3D body pose estimation from single-camera video sequences. Temporal consistency has been extensively used to mitigate their impact but the existing algorithms in the literature do not explicitly model them. Here, we apply this by representing the deforming body as a spatio-temporal graph. We then introduce a refinement network that performs graph c…

    Submitted 16 February, 2024; originally announced February 2024.

  2. arXiv:2212.14397

    cs.CV

    AttEntropy: Segmenting Unknown Objects in Complex Scenes using the Spatial Attention Entropy of Semantic Segmentation Transformers

    Authors: Krzysztof Lis, Matthias Rottmann, Sina Honari, Pascal Fua, Mathieu Salzmann

    Abstract: Vision transformers have emerged as powerful tools for many computer vision tasks. It has been shown that their features and class tokens can be used for salient object segmentation. However, the properties of segmentation transformers remain largely unstudied. In this work we conduct an in-depth study of the spatial attentions of different backbone layers of semantic segmentation transformers and…

    Submitted 29 December, 2022; originally announced December 2022.

    ACM Class: I.4.6; I.4.8; I.5.4

  3. arXiv:2212.01639

    stat.ML cs.CV cs.LG

    Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests

    Authors: Christopher Beckham, Martin Weiss, Florian Golemo, Sina Honari, Derek Nowrouzezahrai, Christopher Pal

    Abstract: Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Understanding what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: Accepted for publication to Pattern Recognition journal

  4. arXiv:2211.12829

    cs.CV cs.LG

    Unsupervised 3D Keypoint Discovery with Multi-View Geometry

    Authors: Sina Honari, Chen Zhao, Mathieu Salzmann, Pascal Fua

    Abstract: Analyzing and training 3D body posture models depend heavily on the availability of joint labels that are commonly acquired through laborious manual annotation of body joints or via marker-based joint localization using carefully curated markers and capturing systems. However, such annotations are not always available, especially for people performing unusual activities. In this paper, we propose…

    Submitted 7 February, 2024; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: Accepted in "3DV 2024"

  5. Perspective Aware Road Obstacle Detection

    Authors: Krzysztof Lis, Sina Honari, Pascal Fua, Mathieu Salzmann

    Abstract: While road obstacle detection techniques have become increasingly effective, they typically ignore the fact that, in practice, the apparent size of the obstacles decreases as their distance to the vehicle increases. In this paper, we account for this by computing a scale map encoding the apparent size of a hypothetical object at every image location. We then leverage this perspective map to (i) ge…

    Submitted 19 June, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    ACM Class: I.4.6; I.4.8; I.5.4

    Journal ref: IEEE Robotics and Automation Letters (Volume: 8, Issue: 4, April 2023, Pages: 2150-2157)

  6. arXiv:2203.15865

    cs.CV

    On Triangulation as a Form of Self-Supervision for 3D Human Pose Estimation

    Authors: Soumava Kumar Roy, Leonardo Citraro, Sina Honari, Pascal Fua

    Abstract: Supervised approaches to 3D pose estimation from single images are remarkably effective when labeled data is abundant. However, as the acquisition of ground-truth 3D labels is labor intensive and time consuming, recent attention has shifted towards semi- and weakly-supervised learning. Generating an effective form of supervision with few annotations still poses a major challenge in crowded scenes…

    Submitted 28 June, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  7. arXiv:2112.04203

    cs.CV

    Adversarial Parametric Pose Prior

    Authors: Andrey Davydov, Anastasia Remizova, Victor Constantin, Sina Honari, Mathieu Salzmann, Pascal Fua

    Abstract: The Skinned Multi-Person Linear (SMPL) model can represent a human body by mapping pose and shape parameters to body meshes. This has been shown to facilitate inferring 3D human pose and shape from images via different learning models. However, not all pose and shape parameter values yield physically-plausible or even realistic body meshes. In other words, SMPL is under-constrained and may thus le…

    Submitted 8 December, 2021; originally announced December 2021.

  8. arXiv:2112.01176

    cs.CV

    Overcoming the Domain Gap in Neural Action Representations

    Authors: Semih Günel, Florian Aymanns, Sina Honari, Pavan Ramdya, Pascal Fua

    Abstract: Relating animal behaviors to brain activity is a fundamental goal in neuroscience, with practical applications in building robust brain-machine interfaces. However, the domain gap between individuals is a major issue that prevents the training of general models that work on unlabeled subjects. Since 3D pose data can now be reliably extracted from multi-view video sequences without manual interve…

    Submitted 19 January, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

  9. arXiv:2111.14595

    cs.CV

    Overcoming the Domain Gap in Contrastive Learning of Neural Action Representations

    Authors: Semih Günel, Florian Aymanns, Sina Honari, Pavan Ramdya, Pascal Fua

    Abstract: A fundamental goal in neuroscience is to understand the relationship between neural activity and behavior. For example, the ability to extract behavioral intentions from neural data, or neural decoding, is critical for developing effective brain machine interfaces. Although simple linear models have been applied to this challenge, they cannot identify important non-linear relationships. Thus, a se…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: Accepted into NeurIPS 2021 Workshop: Self-Supervised Learning - Theory and Practice

  10. arXiv:2104.14812

    cs.CV

    SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation

    Authors: Robin Chan, Krzysztof Lis, Svenja Uhlemeyer, Hermann Blum, Sina Honari, Roland Siegwart, Pascal Fua, Mathieu Salzmann, Matthias Rottmann

    Abstract: State-of-the-art semantic or instance segmentation deep neural networks (DNNs) are usually trained on a closed set of semantic classes. As such, they are ill-equipped to handle previously-unseen objects. However, detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving, especially if they appear on the road ahead. While some methods…

    Submitted 9 November, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

    Comments: 35 pages, 18 figures, 16 tables, website https://segmentmeifyoucan.com/, NeurIPS 2021 Track on Datasets and Benchmarks

    MSC Class: 68T45; 62-07

    ACM Class: I.4.6; I.4.9

  11. arXiv:2012.13633

    cs.CV

    Detecting Road Obstacles by Erasing Them

    Authors: Krzysztof Lis, Sina Honari, Pascal Fua, Mathieu Salzmann

    Abstract: Vehicles can encounter a myriad of obstacles on the road, and it is impossible to record them all beforehand to train a detector. Instead, we select image patches and inpaint them with the surrounding road texture, which tends to remove obstacles from those patches. We then use a network trained to recognize discrepancies between the original patch and the inpainted one, which signals an erased ob…

    Submitted 8 October, 2023; v1 submitted 25 December, 2020; originally announced December 2020.

    ACM Class: I.4.6; I.4.9; J.7

  12. Temporal Representation Learning on Monocular Videos for 3D Human Pose Estimation

    Authors: Sina Honari, Victor Constantin, Helge Rhodin, Mathieu Salzmann, Pascal Fua

    Abstract: In this paper we propose an unsupervised feature extraction method to capture temporal information on monocular videos, where we detect and encode the subject of interest in each frame and leverage contrastive self-supervised (CSS) learning to extract rich latent vectors. Instead of simply treating the latent features of nearby frames as positive pairs and those of temporally-distant ones as negative…

    Submitted 25 November, 2022; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: Accepted in "IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)"

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022

  13. arXiv:2004.02186

    cs.CV

    Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation

    Authors: Edoardo Remelli, Shangchen Han, Sina Honari, Pascal Fua, Robert Wang

    Abstract: We present a lightweight solution to recover 3D pose from multi-view images captured with spatially calibrated cameras. Building upon recent advances in interpretable representation learning, we exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points. This allows us to reason effectively about 3D pose across different vie…

    Submitted 20 June, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 6040-6049

  14. arXiv:1908.01073

    eess.IV cs.LG stat.ML

    U-Net Fixed-Point Quantization for Medical Image Segmentation

    Authors: MohammadHossein AskariHemmat, Sina Honari, Lucas Rouhier, Christian S. Perone, Julien Cohen-Adad, Yvon Savaria, Jean-Pierre David

    Abstract: Model quantization is leveraged to reduce the memory consumption and the computation time of deep neural networks. This is achieved by representing weights and activations with a lower bit resolution when compared to their high precision floating point counterparts. The suitable level of quantization is directly related to the model performance. Lowering the quantization precision (e.g. 2 bits), r…

    Submitted 9 September, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

    Comments: Accepted to MICCAI 2019's Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention

  15. arXiv:1903.02709

    stat.ML cs.LG

    On Adversarial Mixup Resynthesis

    Authors: Christopher Beckham, Sina Honari, Vikas Verma, Alex Lamb, Farnoosh Ghadiri, R Devon Hjelm, Yoshua Bengio, Christopher Pal

    Abstract: In this paper, we explore new approaches to combining information encoded within the learned representations of auto-encoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of se…

    Submitted 23 October, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: 'Camera-ready draft'

  16. arXiv:1805.08841

    cs.CV cs.LG

    Distribution Matching Losses Can Hallucinate Features in Medical Image Translation

    Authors: Joseph Paul Cohen, Margaux Luck, Sina Honari

    Abstract: This paper discusses how distribution matching losses, such as those used in CycleGAN, when used to synthesize medical images can lead to mis-diagnosis of medical conditions. It seems appealing to use these new image synthesis methods for translating images from a source to a target domain because they can produce high quality images and some even do not require paired data. However, the basis of…

    Submitted 3 October, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: Published at Medical Image Computing & Computer Assisted Intervention (MICCAI 2018). An abstract is published at the Medical Imaging with Deep Learning Conference (MIDL 2018) as "How to Cure Cancer (in images) with Unpaired Image Translation"

    Journal ref: Medical Image Computing & Computer Assisted Intervention (MICCAI 2018 Oral)

  17. arXiv:1803.09202

    cs.CV stat.ML

    Unsupervised Depth Estimation, 3D Face Rotation and Replacement

    Authors: Joel Ruben Antony Moniz, Christopher Beckham, Simon Rajotte, Sina Honari, Christopher Pal

    Abstract: We present an unsupervised approach for learning to estimate three dimensional (3D) facial structure from a single image while also predicting 3D viewpoint transformations that match a desired pose and facial geometry. We achieve this by inferring the depth of facial keypoints of an input image in an unsupervised manner, without using any form of ground-truth depth information. We show how it is p…

    Submitted 23 December, 2018; v1 submitted 25 March, 2018; originally announced March 2018.

    Comments: Depth Estimation, Face Rotation, Face Swap, 32nd Conference on Neural Information Processing Systems (NIPS 2018)

  18. arXiv:1712.03917

    cs.CV

    Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

    Authors: Shanxin Yuan, Guillermo Garcia-Hernando, Bjorn Stenger, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee, Pavlo Molchanov, Jan Kautz, Sina Honari, Liuhao Ge, Junsong Yuan, Xinghao Chen, Guijin Wang, Fan Yang, Kai Akiyama, Yang Wu, Qingfu Wan, Meysam Madadi, Sergio Escalera, Shile Li, Dongheui Lee, Iason Oikonomidis, Antonis Argyros, Tae-Kyun Kim

    Abstract: In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And, what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single frame 3D pose estimation, 3D hand tracking, and hand pose estimation during ob…

    Submitted 29 March, 2018; v1 submitted 11 December, 2017; originally announced December 2017.

  19. arXiv:1709.01591

    cs.CV

    Improving Landmark Localization with Semi-Supervised Learning

    Authors: Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

    Abstract: We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available. First, we propose the framework of sequential mul…

    Submitted 28 October, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

    Comments: Published as a conference paper in CVPR 2018

  20. arXiv:1703.06975

    stat.ML cs.LG

    Learning to Generate Samples from Noise through Infusion Training

    Authors: Florian Bordes, Sina Honari, Pascal Vincent

    Abstract: In this work, we investigate a novel training procedure to learn a generative model as the transition operator of a Markov chain, such that, when applied repeatedly on an unstructured random noise sample, it will denoise it into a sample that matches the target distribution from the training set. The novel training procedure to learn this progressive denoising operation involves sampling from a sl…

    Submitted 20 March, 2017; originally announced March 2017.

    Comments: Published as a conference paper at ICLR 2017

  21. arXiv:1605.02688

    cs.SC cs.LG cs.MS

    Theano: A Python framework for fast computation of mathematical expressions

    Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

    Abstract: Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008, mu…

    Submitted 9 May, 2016; originally announced May 2016.

    Comments: 19 pages, 5 figures

  22. arXiv:1512.08212

    cs.CV

    Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression

    Authors: David Rim, Sina Honari, Md Kamrul Hasan, Chris Pal

    Abstract: We present techniques for improving performance driven facial animation, emotion recognition, and facial key-point or landmark prediction using learned identity invariant representations. Established approaches to these problems can work well if sufficient examples and labels for a particular identity are available and factors of variation are highly controlled. However, labeled examples of facial…

    Submitted 22 May, 2016; v1 submitted 27 December, 2015; originally announced December 2015.

    Comments: to appear in Image and Vision Computing Journal (IMAVIS)

  23. arXiv:1511.07356

    cs.CV

    Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

    Authors: Sina Honari, Jason Yosinski, Pascal Vincent, Christopher Pal

    Abstract: Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state of the art architectures for computer vision. Max-pooling purposefully discards precise spatial information in order to create features that are more robust, and typically organized as lower resolution spatial feature maps. On some tasks, such as whole-image classification, max-pooling d…

    Submitted 17 April, 2016; v1 submitted 23 November, 2015; originally announced November 2015.

    Comments: accepted in CVPR 2016