Showing 1–50 of 77 results for author: Golyanik, V

  1. arXiv:2406.17988  [pdf, other]

    cs.CV

    DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

    Authors: Qingxuan Wu, Zhiyang Dou, Sirui Xu, Soshi Shimada, Chen Wang, Zhengming Yu, Yuan Liu, Cheng Lin, Zeyu Cao, Taku Komura, Vladislav Golyanik, Christian Theobalt, Wenping Wang, Lingjie Liu

    Abstract: Reconstructing 3D hand-face interactions with deformations from a single image is a challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges stem from self-occlusions during single-view hand-face interactions, diverse spatial relationships between hands and face, complex deformations, and the ambiguity of the single-view setting. The first and only method for hand…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 23 pages, 9 figures, 3 tables

  2. arXiv:2406.10078  [pdf, other]

    cs.CV cs.GR cs.LG

    D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video

    Authors: Moritz Kappel, Florian Hahlbohm, Timon Scholz, Susana Castillo, Christian Theobalt, Martin Eisemann, Vladislav Golyanik, Marcus Magnor

    Abstract: Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes recently gained increased attention. While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 16 pages, 5 figures, 10 tables. Project page: https://moritzkappel.github.io/projects/dnpc

  3. arXiv:2404.08640  [pdf, other]

    cs.CV

    EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

    Authors: Christen Millerdurai, Hiroyasu Akada, Jian Wang, Diogo Luvizon, Christian Theobalt, Vladislav Golyanik

    Abstract: Monocular egocentric 3D human motion capture is a challenging and actively researched problem. Existing methods use synchronously operating visual sensors (e.g. RGB cameras) and often fail under low lighting and fast motions, which can be restricting in many applications involving head-mounted devices. In response to the existing limitations, this paper 1) introduces a new problem, i.e., 3D human…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 14 pages, 11 figures and 6 tables; project page: https://4dqv.mpi-inf.mpg.de/EventEgo3D/; Computer Vision and Pattern Recognition (CVPR) 2024

  4. arXiv:2403.15064  [pdf, other]

    cs.CV cs.GR

    Recent Trends in 3D Reconstruction of General Non-Rigid Scenes

    Authors: Raza Yunus, Jan Eric Lenssen, Michael Niemeyer, Yiyi Liao, Christian Rupprecht, Christian Theobalt, Gerard Pons-Moll, Jia-Bin Huang, Vladislav Golyanik, Eddy Ilg

    Abstract: Reconstructing models of the real world, including 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision. It enables the synthesizing of photorealistic novel views, useful for the movie industry and AR/VR applications. It also facilitates the content creation necessary in computer games and AR/VR by avoiding laborious manual design processes. Fu…

    Submitted 6 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 42 pages, 18 figures, 5 tables; State-of-the-Art Report at EUROGRAPHICS 2024. Project page: https://razayunus.github.io/non-rigid-star

  5. arXiv:2401.00889  [pdf, other]

    cs.CV

    3D Human Pose Perception from Egocentric Stereo Videos

    Authors: Hiroyasu Akada, Jian Wang, Vladislav Golyanik, Christian Theobalt

    Abstract: While head-mounted devices are becoming more compact, they provide egocentric views with significant self-occlusions of the device user. Hence, existing methods often fail to accurately estimate complex 3D poses from egocentric views. In this work, we propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation, which leverages the scene information and temporal…

    Submitted 15 May, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

  6. arXiv:2312.16118  [pdf, other]

    cs.CV

    Quantum-Hybrid Stereo Matching With Nonlinear Regularization and Spatial Pyramids

    Authors: Cameron Braunstein, Eddy Ilg, Vladislav Golyanik

    Abstract: Quantum visual computing is advancing rapidly. This paper presents a new formulation for stereo matching with nonlinear regularizers and spatial pyramids on quantum annealers as a maximum a posteriori inference problem that minimizes the energy of a Markov Random Field. Our approach is hybrid (i.e., quantum-classical) and is compatible with modern D-Wave quantum annealers, i.e., it includes a quad…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 26 pages, 15 figures. To be published in the International Conference on 3D Vision (3DV) 2024

  7. arXiv:2312.14929  [pdf, other]

    cs.CV cs.GR

    MACS: Mass Conditioned 3D Hand and Object Motion Synthesis

    Authors: Soshi Shimada, Franziska Mueller, Jan Bednarik, Bardia Doosti, Bernd Bickel, Danhang Tang, Vladislav Golyanik, Jonathan Taylor, Christian Theobalt, Thabo Beeler

    Abstract: The physical properties of an object, such as mass, significantly affect how we manipulate it with our hands. Surprisingly, this aspect has so far been neglected in prior work on 3D motion synthesis. To improve the naturalness of the synthesized 3D hand object motions, this work proposes MACS, the first MAss Conditioned 3D hand and object motion Synthesis approach. Our approach is based on cascaded…

    Submitted 22 December, 2023; originally announced December 2023.

  8. arXiv:2312.14157  [pdf, other]

    cs.CV

    3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera

    Authors: Christen Millerdurai, Diogo Luvizon, Viktor Rudnev, André Jonas, Jiayi Wang, Christian Theobalt, Vladislav Golyanik

    Abstract: 3D hand tracking from a monocular video is a very challenging problem due to hand interactions, occlusions, left-right hand ambiguity, and fast motion. Most existing methods rely on RGB inputs, which have severe limitations under low-light conditions and suffer from motion blur. In contrast, event cameras capture local brightness changes instead of full image frames and do not suffer from the desc…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 17 pages, 12 figures, 7 tables; project page: https://4dqv.mpi-inf.mpg.de/Ev2Hands/

    Journal ref: International Conference on 3D Vision (3DV) 2024

  9. arXiv:2312.11587  [pdf, other]

    cs.CV

    Relightable Neural Actor with Intrinsic Decomposition and Pose Control

    Authors: Diogo Luvizon, Vladislav Golyanik, Adam Kortylewski, Marc Habermann, Christian Theobalt

    Abstract: Creating a digital human avatar that is relightable, drivable, and photorealistic is a challenging and important problem in Vision and Graphics. Humans are highly articulated, creating pose-dependent appearance effects like self-shadows and wrinkles, and skin as well as clothing require complex and space-varying BRDF models. While recent human relighting approaches can recover plausible material-li…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Project page: https://people.mpi-inf.mpg.de/~dluvizon/relightable-neural-actor/

  10. arXiv:2312.07423  [pdf, other]

    cs.CV

    Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

    Authors: Ashwath Shetty, Marc Habermann, Guoxing Sun, Diogo Luvizon, Vladislav Golyanik, Christian Theobalt

    Abstract: We present the first approach to render highly realistic free-viewpoint videos of a human actor in general apparel, from sparse multi-view recording to display, in real-time at an unprecedented 4K resolution. At inference, our method only requires four camera views of the moving actor and the respective 3D skeletal pose. It handles actors in wide clothing, and reproduces even fine-scale dynamic de…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Project page: https://vcai.mpi-inf.mpg.de/projects/holochar/

  11. arXiv:2311.17057  [pdf, other]

    cs.CV

    ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions

    Authors: Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek

    Abstract: Current approaches for 3D human motion synthesis generate high-quality animations of digital humans performing a wide variety of actions and gestures. However, a notable technological gap exists in addressing the complex dynamics of multi-human interactions within this paradigm. In this work, we present ReMoS, a denoising diffusion-based model that synthesizes full-body reactive motion of a person…

    Submitted 26 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 17 pages, 7 figures, 5 tables

  12. arXiv:2311.05604  [pdf, other]

    cs.CV

    3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds

    Authors: Lakshika Rathi, Edith Tretschk, Christian Theobalt, Rishabh Dabral, Vladislav Golyanik

    Abstract: Existing methods for learning 3D representations are deep neural networks trained and tested on classical hardware. Quantum machine learning architectures, despite their theoretically predicted advantages in terms of speed and the representational capacity, have so far not been considered for this problem nor for tasks involving 3D data in general. This paper thus introduces the first quantum auto…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 20 pages, 11 figures, 5 tables

    Journal ref: British Machine Vision Conference (BMVC) 2023

  13. arXiv:2310.15128  [pdf, other]

    cs.CV cs.LG quant-ph

    Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients

    Authors: Maximilian Krahn, Michelle Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal

    Abstract: We present QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN-optimisers…

    Submitted 23 October, 2023; originally announced October 2023.

  14. Discovering Fatigued Movements for Virtual Character Animation

    Authors: Noshaba Cheema, Rui Xu, Nam Hee Kim, Perttu Hämäläinen, Vladislav Golyanik, Marc Habermann, Christian Theobalt, Philipp Slusallek

    Abstract: Virtual character animation and movement synthesis have advanced rapidly during recent years, especially through a combination of extensive motion capture datasets and machine learning. A remaining challenge is interactively simulating characters that fatigue when performing extended motions, which is indispensable for the realism of generated animations. However, capturing such movements is probl…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 16 pages, 22 figures. To be published in ACM SIGGRAPH Asia Conference Papers 2023. ACM ISBN 979-8-4007-0315-7/23/12

    ACM Class: I.3.7

    Journal ref: ACM SIGGRAPH Asia Conference Papers 2023

  15. arXiv:2310.07204  [pdf, other]

    cs.AI cs.CV cs.GR cs.LG

    State of the Art on Diffusion Models for Visual Computing

    Authors: Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, C. Karen Liu, Lingjie Liu, Ben Mildenhall, Matthias Nießner, Björn Ommer, Christian Theobalt, Peter Wonka, Gordon Wetzstein

    Abstract: The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. In these domains, diffusion models are the generative AI architecture of choice. Within the last year alone, the literature on diffusion-based tools and applicat…

    Submitted 11 October, 2023; originally announced October 2023.

  16. arXiv:2309.16670  [pdf, other]

    cs.CV cs.GR cs.HC

    Decaf: Monocular Deformation Capture for Face and Hand Interactions

    Authors: Soshi Shimada, Vladislav Golyanik, Patrick Pérez, Christian Theobalt

    Abstract: Existing methods for 3D tracking from monocular RGB videos predominantly consider articulated and rigid objects. Modelling dense non-rigid object deformations in this setting remained largely unaddressed so far, although such effects can improve the realism of the downstream applications such as AR/VR and avatar communications. This is due to the severe ill-posedness of the monocular view setting…

    Submitted 13 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  17. arXiv:2308.12970  [pdf, other]

    cs.GR cs.LG

    NeuralClothSim: Neural Deformation Fields Meet the Thin Shell Theory

    Authors: Navami Kairanda, Marc Habermann, Christian Theobalt, Vladislav Golyanik

    Abstract: Despite existing 3D cloth simulators producing realistic results, they predominantly operate on discrete surface representations (e.g. points and meshes) with a fixed spatial resolution, which often leads to large memory consumption and resolution-dependent simulations. Moreover, back-propagating gradients through the existing solvers is difficult, and they cannot be easily integrated into modern…

    Submitted 14 June, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 33 pages, 21 figures and 3 tables; project page: https://4dqv.mpi-inf.mpg.de/NeuralClothSim/

  18. arXiv:2308.12969  [pdf, other]

    cs.CV

    ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors

    Authors: Wanyue Zhang, Rishabh Dabral, Thomas Leimkühler, Vladislav Golyanik, Marc Habermann, Christian Theobalt

    Abstract: Existing automatic approaches for 3D virtual character motion synthesis supporting scene interactions do not generalise well to new objects outside training distributions, even when trained on extensive motion capture datasets with diverse objects and annotated interactions. This paper addresses this limitation and shows that robustness and generalisation to novel scene objects in 3D object-aware…

    Submitted 15 February, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 14 pages, 11 figures; project page: https://vcai.mpi-inf.mpg.de/projects/ROAM/

    Journal ref: International Conference on 3D Vision 2024

  19. arXiv:2308.08258  [pdf, other]

    cs.CV cs.GR

    SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes

    Authors: Edith Tretschk, Vladislav Golyanik, Michael Zollhoefer, Aljaz Bozic, Christoph Lassner, Christian Theobalt

    Abstract: Existing methods for the 4D reconstruction of general, non-rigidly deforming objects focus on novel-view synthesis and neglect correspondences. However, time consistency enables advanced downstream tasks like 3D editing, motion analysis, or virtual-asset creation. We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner. Our dynamic-NeRF method takes multi-view R…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: Project page: https://vcai.mpi-inf.mpg.de/projects/scenerflow/

  20. arXiv:2307.00842  [pdf, other]

    cs.CV

    VINECS: Video-based Neural Character Skinning

    Authors: Zhouyingcheng Liao, Vladislav Golyanik, Marc Habermann, Christian Theobalt

    Abstract: Rigging and skinning clothed human avatars is a challenging task and traditionally requires a lot of manual work and expertise. Recent methods addressing it either generalize across different characters or focus on capturing the dynamics of a single character observed under different pose configurations. However, the former methods typically predict solely static skinning weights, which perform po…

    Submitted 3 July, 2023; originally announced July 2023.

  21. arXiv:2306.00547  [pdf, other]

    cs.CV cs.GR

    AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars

    Authors: Mohit Mendiratta, Xingang Pan, Mohamed Elgharib, Kartik Teotia, Mallikarjun B R, Ayush Tewari, Vladislav Golyanik, Adam Kortylewski, Christian Theobalt

    Abstract: Capturing and editing full head performances enables the creation of virtual characters with various applications such as extended reality and media production. The past few years witnessed a steep rise in the photorealism of human head avatars. Such avatars can be controlled through different input data modalities, including RGB, audio, depth, IMUs and others. While these data modalities provide…

    Submitted 2 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 17 pages, 17 figures. Project page: https://vcai.mpi-inf.mpg.de/projects/AvatarStudio/

  22. arXiv:2305.05026  [pdf, other]

    cs.CV

    Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

    Authors: Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele

    Abstract: Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images. However, it is still not fully explored in 3D scene understanding. Thus, this paper introduces Masked Shape Prediction (MSP), a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points. Th…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

  23. arXiv:2305.01599  [pdf, other]

    cs.CV cs.GR

    EgoLocate: Real-time Motion Capture, Localization, and Mapping with Sparse Body-mounted Sensors

    Authors: Xinyu Yi, Yuxiao Zhou, Marc Habermann, Vladislav Golyanik, Shaohua Pan, Christian Theobalt, Feng Xu

    Abstract: Human and environment sensing are two important topics in Computer Vision and Graphics. Human motion is often captured by inertial sensors, while the environment is mostly reconstructed using cameras. We integrate the two techniques together in EgoLocate, a system that simultaneously performs human motion capture (mocap), localization, and mapping in real time from sparse body-mounted sensors, inc…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted by SIGGRAPH 2023. Project page: https://xinyu-yi.github.io/EgoLocate/

  24. arXiv:2303.16202  [pdf, other]

    cs.CV

    CCuantuMM: Cycle-Consistent Quantum-Hybrid Matching of Multiple Shapes

    Authors: Harshil Bhatia, Edith Tretschk, Zorah Lähner, Marcel Seelbach Benkner, Michael Moeller, Christian Theobalt, Vladislav Golyanik

    Abstract: Jointly matching multiple, non-rigidly deformed 3D shapes is a challenging, $\mathcal{NP}$-hard problem. A perfect matching is necessarily cycle-consistent: Following the pairwise point correspondences along several shapes must end up at the starting vertex of the original shape. Unfortunately, existing quantum shape-matching methods do not support multiple shapes and even less cycle consistency.…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Computer Vision and Pattern Recognition (CVPR) 2023; 22 pages, 24 figures and 5 tables; Project page: https://4dqv.mpi-inf.mpg.de/CCuantuMM/

  25. arXiv:2303.15444  [pdf, other]

    cs.CV

    Quantum Multi-Model Fitting

    Authors: Matteo Farina, Luca Magri, Willi Menapace, Elisa Ricci, Vladislav Golyanik, Federica Arrigoni

    Abstract: Geometric model fitting is a challenging but fundamental computer vision problem. Recently, quantum optimization has been shown to enhance robust fitting for the case of a single model, while leaving the question of multi-model fitting open. In response to this challenge, this paper shows that the latter case can significantly benefit from quantum hardware and proposes the first quantum approach t…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: In Computer Vision and Pattern Recognition (CVPR) 2023; Highlight

  26. arXiv:2303.13472  [pdf, other]

    cs.CV cs.AI

    Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

    Authors: Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas, Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

    Abstract: Neural video game simulators emerged as powerful tools to generate and edit videos. Their idea is to represent games as the evolution of an environment's state driven by the actions of its agents. While such a paradigm enables users to play a game action-by-action, its rigidity precludes more semantic forms of control. To overcome this limitation, we augment game models with prompts specified as a…

    Submitted 21 January, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: ACM Transactions on Graphics. © Copyright is held by the owner/author(s) 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Graphics, http://dx.doi.org/10.1145/3635705

  27. arXiv:2301.05175  [pdf, other]

    cs.CV

    Scene-Aware 3D Multi-Human Motion Capture from a Single Camera

    Authors: Diogo Luvizon, Marc Habermann, Vladislav Golyanik, Adam Kortylewski, Christian Theobalt

    Abstract: In this work, we consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera. In contrast to expensive marker-based or multi-view systems, our lightweight setup is ideal for private users as it enables an affordable 3D motion capture that is easy to install and does not require e…

    Submitted 27 March, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: Accepted to Eurographics 2023. See also GitHub: https://github.com/dluvizon/scene-aware-3d-multi-human and project page: https://vcai.mpi-inf.mpg.de/projects/scene-aware-3d-multi-human/

  28. arXiv:2212.07555  [pdf, other]

    cs.CV cs.GR cs.LG

    IMos: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions

    Authors: Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik, Christian Theobalt, Philipp Slusallek

    Abstract: Can we make virtual characters in a scene interact with their surrounding objects through simple instructions? Is it possible to synthesize such motion plausibly with a diverse set of objects and instructions? Inspired by these questions, we present the first framework to synthesize the full-body motion of virtual human characters performing specified actions with 3D objects placed within their re…

    Submitted 26 February, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: 10 pages, 9 figures

  29. arXiv:2212.04495  [pdf, other]

    cs.CV

    MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis

    Authors: Rishabh Dabral, Muhammad Hamza Mughal, Vladislav Golyanik, Christian Theobalt

    Abstract: Conventional methods for human motion synthesis are either deterministic or struggle with the trade-off between motion diversity and motion quality. In response to these limitations, we introduce MoFusion, i.e., a new denoising-diffusion-based framework for high-quality conditional human motion synthesis that can generate long, temporally plausible, and semantically accurate motions based on a ran…

    Submitted 15 May, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: CVPR23, 11 pages, 6 figures, 2 tables; project page: https://vcai.mpi-inf.mpg.de/projects/MoFusion

  30. arXiv:2212.01368  [pdf, other]

    cs.CV cs.GR cs.LG

    Fast Non-Rigid Radiance Fields from Monocularized Data

    Authors: Moritz Kappel, Vladislav Golyanik, Susana Castillo, Christian Theobalt, Marcus Magnor

    Abstract: The reconstruction and novel view synthesis of dynamic scenes recently gained increased attention. As reconstruction from large-scale multi-view data involves immense memory and computational requirements, recent benchmark datasets provide collections of single monocular views per timestamp sampled from multiple (virtual) cameras. We refer to this form of inputs as "monocularized" data. Existing w…

    Submitted 13 November, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: 18 pages, 14 figures; project page: https://graphics.tu-bs.de/publications/kappel2022fast

  31. arXiv:2210.15664  [pdf, other]

    cs.CV cs.GR

    State of the Art in Dense Monocular Non-Rigid 3D Reconstruction

    Authors: Edith Tretschk, Navami Kairanda, Mallikarjun B R, Rishabh Dabral, Adam Kortylewski, Bernhard Egger, Marc Habermann, Pascal Fua, Christian Theobalt, Vladislav Golyanik

    Abstract: 3D reconstruction of deformable (or non-rigid) scenes from a set of monocular 2D image observations is a long-standing and actively researched area of computer vision and graphics. It is an ill-posed inverse problem, since -- without additional prior assumptions -- it permits infinitely many solutions leading to accurate projection to the input 2D images. Non-rigid reconstruction is a foundational…

    Submitted 24 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: 36 pages, 18 figures, 3 tables; State-of-the-Art Report at EUROGRAPHICS 2023

    Journal ref: Computer Graphics Forum, 2023

  32. arXiv:2210.08114  [pdf, other]

    quant-ph cs.CV cs.LG

    QuAnt: Quantum Annealing with Learnt Couplings

    Authors: Marcel Seelbach Benkner, Maximilian Krahn, Edith Tretschk, Zorah Lähner, Michael Moeller, Vladislav Golyanik

    Abstract: Modern quantum annealers can find high-quality solutions to combinatorial optimisation objectives given as quadratic unconstrained binary optimisation (QUBO) problems. Unfortunately, obtaining suitable QUBO forms in computer vision remains challenging and currently requires problem-specific analytical derivations. Moreover, such explicit formulations impose tangible constraints on solution encodin…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: incl. appendix

  33. arXiv:2210.05665  [pdf, other]

    cs.CV cs.AI cs.GR

    HiFECap: Monocular High-Fidelity and Expressive Capture of Human Performances

    Authors: Yue Jiang, Marc Habermann, Vladislav Golyanik, Christian Theobalt

    Abstract: Monocular 3D human performance capture is indispensable for many applications in computer graphics and vision for enabling immersive experiences. However, detailed capture of humans requires tracking of multiple aspects, including the skeletal pose, the dynamic surface, which includes clothing, hand gestures as well as facial expressions. No existing monocular method allows joint tracking of all t…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted by BMVC 2022

  34. arXiv:2208.08439  [pdf, other]

    cs.CV

    MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes

    Authors: Zhi Li, Soshi Shimada, Bernt Schiele, Christian Theobalt, Vladislav Golyanik

    Abstract: 3D human motion capture from monocular RGB images respecting interactions of a subject with complex and possibly deformable environments is a very challenging, ill-posed and under-explored problem. Existing methods address it only weakly and do not model possible surface deformations often occurring when humans interact with scene surfaces. In contrast, this paper proposes MoCapDeform, i.e., a new…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: 11 pages, 8 figures, 3 tables; project page: https://4dqv.mpi-inf.mpg.de/MoCapDeform/

    Journal ref: International Conference on 3D Vision 2022 (Oral)

  35. arXiv:2208.01633  [pdf, other]

    cs.CV

    UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture

    Authors: Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik

    Abstract: We present UnrealEgo, i.e., a new large-scale naturalistic dataset for egocentric 3D human pose estimation. UnrealEgo is based on an advanced concept of eyeglasses equipped with two fisheye cameras that can be used in unconstrained environments. We design their virtual prototype and attach them to 3D human models for stereo view capture. We next generate a large corpus of human motions. As a conse…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: 21 pages, 10 figures, 10 tables; project page: https://4dqv.mpi-inf.mpg.de/UnrealEgo/

    Journal ref: European Conference on Computer Vision (ECCV) 2022

  36. arXiv:2206.11896  [pdf, other]

    cs.CV

    EventNeRF: Neural Radiance Fields from a Single Colour Event Camera

    Authors: Viktor Rudnev, Mohamed Elgharib, Christian Theobalt, Vladislav Golyanik

    Abstract: Asynchronously operating event cameras find many applications due to their high dynamic range, vanishingly low motion blur, low latency and low data bandwidth. The field saw remarkable progress during the last few years, and existing event-based 3D reconstruction approaches recover sparse point clouds of the scene. However, such sparsity is a limiting factor in many cases, especially in computer v…

    Submitted 24 March, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: 19 pages, 21 figures, 3 tables; CVPR 2023

    Journal ref: Computer Vision and Pattern Recognition (CVPR) 2023

  37. arXiv:2206.08368  [pdf, other]

    cs.CV

    Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model

    Authors: Erik C. M. Johnson, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt

    Abstract: Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications. However, current approaches suffer from drawbacks such as struggling with large scene deformations, inaccurate shape completion or requiring 2D point tracks. In contrast, our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can opera…

    Submitted 4 May, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 26 pages, 17 figures, 8 tables

  38. arXiv:2205.05677  [pdf, other]

    cs.CV cs.GR cs.HC

    HULC: 3D Human Motion Capture with Pose Manifold Sampling and Dense Contact Guidance

    Authors: Soshi Shimada, Vladislav Golyanik, Zhi Li, Patrick Pérez, Weipeng Xu, Christian Theobalt

    Abstract: Marker-less monocular 3D human motion capture (MoCap) with scene interactions is a challenging research topic relevant for extended reality, robotics and virtual avatar generation. Due to the inherent depth ambiguity of monocular settings, 3D motions captured with existing methods often contain severe artefacts such as incorrect body-scene inter-penetrations, jitter and body floating. To tackle th…

    Submitted 26 July, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  39. arXiv:2203.13185  [pdf, other]

    cs.CV

    Quantum Motion Segmentation

    Authors: Federica Arrigoni, Willi Menapace, Marcel Seelbach Benkner, Elisa Ricci, Vladislav Golyanik

    Abstract: Motion segmentation is a challenging problem that seeks to identify independent motions in two or several input images. This paper introduces the first algorithm for motion segmentation that relies on adiabatic quantum optimization of the objective function. The proposed method achieves on-par performance with the state of the art on problem instances which can be mapped to modern quantum annealer…

    Submitted 24 March, 2022; originally announced March 2022.

  40. arXiv:2203.12633  [pdf, other]

    cs.CV cs.LG math.OC

    Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization

    Authors: Alp Yurtsever, Tolga Birdal, Vladislav Golyanik

    Abstract: We present a hybrid classical-quantum framework based on the Frank-Wolfe algorithm, Q-FW, for solving quadratic, linearly-constrained, binary optimization problems on quantum annealers (QA). The computational premise of quantum computers has cultivated the re-design of various existing vision problems into quantum-friendly forms. Experimental QA realizations can solve a particular non-convex probl…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 26 pages with supplementary material

  41. arXiv:2203.11938  [pdf, other]

    cs.CV cs.GR

    φ-SfT: Shape-from-Template with a Physics-Based Deformation Model

    Authors: Navami Kairanda, Edith Tretschk, Mohamed Elgharib, Christian Theobalt, Vladislav Golyanik

    Abstract: Shape-from-Template (SfT) methods estimate 3D surface deformations from a single monocular RGB camera while assuming a 3D state known in advance (a template). This is an important yet challenging problem due to the under-constrained nature of the monocular setting. Existing SfT techniques predominantly use geometric and simplified deformation models, which often limits their reconstruction abiliti…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 11 pages, 8 figures and one table; Computer Vision and Pattern Recognition (CVPR) 2022

  42. arXiv:2203.08528  [pdf, other]

    cs.GR

    Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors

    Authors: Xinyu Yi, Yuxiao Zhou, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt, Feng Xu

    Abstract: Motion capture from sparse inertial sensors has shown great potential compared to image-based approaches since occlusions do not lead to a reduced tracking quality and the recording space is not restricted to be within the viewing frustum of the camera. However, capturing the motion and global position only from a sparse set of inertial sensors is inherently ambiguous and challenging. In consequen…

    Submitted 16 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR 2022 with 3 strong accepts. Project page: https://xinyu-yi.github.io/PIP/

  43. arXiv:2203.01914  [pdf, other]

    cs.CV cs.AI

    Playable Environments: Video Manipulation in Space and Time

    Authors: Willi Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

    Abstract: We present Playable Environments - a new representation for interactive video generation and manipulation in space and time. With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a video by providing a sequence of desired actions. The actions are learnt in an unsupervised manner. The camera can be controlled to get the desired viewpoint.…

    Submitted 15 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  44. arXiv:2112.05140  [pdf, other]

    cs.CV cs.GR

    NeRF for Outdoor Scene Relighting

    Authors: Viktor Rudnev, Mohamed Elgharib, William Smith, Lingjie Liu, Vladislav Golyanik, Christian Theobalt

    Abstract: Photorealistic editing of outdoor scenes from photographs requires a profound understanding of the image formation process and an accurate estimation of the scene geometry, reflectance and illumination. A delicate manipulation of the lighting can then be performed while keeping the scene albedo and geometry unaltered. We present NeRF-OSR, i.e., the first approach for outdoor scene relighting based…

    Submitted 21 July, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: 22 pages, 10 figures, 2 tables; ECCV 2022; project web page: https://4dqv.mpi-inf.mpg.de/NeRF-OSR/

    Journal ref: European Conference on Computer Vision (ECCV) 2022

  45. arXiv:2111.05849  [pdf, other]

    cs.GR cs.CV

    Advances in Neural Rendering

    Authors: Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul Srinivasan, Edgar Tretschk, Yifan Wang, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, Tomas Simon, Christian Theobalt, Matthias Niessner, Jonathan T. Barron, Gordon Wetzstein, Michael Zollhoefer, Vladislav Golyanik

    Abstract: Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take specifically defined representations of geometry and material properties as input. Collectively, these inputs define the actual scene an…

    Submitted 30 March, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: 33 pages, 14 figures, 5 tables; State of the Art Report at EUROGRAPHICS 2022

  46. arXiv:2110.11335  [pdf, other]

    cs.CV

    Convex Joint Graph Matching and Clustering via Semidefinite Relaxations

    Authors: Maximilian Krahn, Florian Bernard, Vladislav Golyanik

    Abstract: This paper proposes a new algorithm for simultaneous graph matching and clustering. For the first time in the literature, these two problems are solved jointly and synergetically without relying on any training data, which brings advantages for identifying similar arbitrary objects in compound 3D scenes and matching them. For joint reasoning, we first rephrase graph matching as a rigid point set r…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: 12 pages, 8 figures; source code available; project webpage: https://4dqv.mpi-inf.mpg.de/JointGMC/

    Journal ref: International Conference on 3D Vision (3DV) 2021

  47. arXiv:2108.08844  [pdf, other]

    cs.CV

    Gravity-Aware Monocular 3D Human-Object Reconstruction

    Authors: Rishabh Dabral, Soshi Shimada, Arjun Jain, Christian Theobalt, Vladislav Golyanik

    Abstract: This paper proposes GraviCap, i.e., a new approach for joint markerless 3D human motion capture and object trajectory estimation from monocular RGB videos. We focus on scenes with objects partially observed during a free flight. In contrast to existing monocular methods, we can recover scale, object trajectories as well as human bone lengths in meters and the ground plane's orientation, thanks to…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: 12 pages, six figures, five tables; project webpage: http://4dqv.mpi-inf.mpg.de/GraviCap/

    Journal ref: International Conference on Computer Vision (ICCV) 2021

  48. arXiv:2107.04032  [pdf, other]

    cs.CV

    Adiabatic Quantum Graph Matching with Permutation Matrix Constraints

    Authors: Marcel Seelbach Benkner, Vladislav Golyanik, Christian Theobalt, Michael Moeller

    Abstract: Matching problems on 3D shapes and images are challenging as they are frequently formulated as combinatorial quadratic assignment problems (QAPs) with permutation matrix constraints, which are NP-hard. In this work, we address such problems with emerging quantum computing technology and propose several reformulations of QAPs as unconstrained problems suitable for efficient execution on quantum har…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: 18 pages, 14 figures, 2 tables; project webpage: http://gvv.mpi-inf.mpg.de/projects/QGM/

    Journal ref: Published at 3DV 2020

  49. arXiv:2107.03109  [pdf, other]

    cs.GR cs.CV

    Egocentric Videoconferencing

    Authors: Mohamed Elgharib, Mohit Mendiratta, Justus Thies, Matthias Nießner, Hans-Peter Seidel, Ayush Tewari, Vladislav Golyanik, Christian Theobalt

    Abstract: We introduce a method for egocentric videoconferencing that enables hands-free video calls, for instance by people wearing smart glasses or other mixed-reality devices. Videoconferencing portrays valuable non-verbal communication and face expression cues, but usually requires a front-facing camera. Using a frontal camera in a hands-free setting when a person is on the move is impractical. Even hol…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: Mohamed Elgharib and Mohit Mendiratta contributed equally to this work. http://gvv.mpi-inf.mpg.de/projects/EgoChat/

    Journal ref: ACM Transactions on Graphics, vol. 39, no. 6, article 268, 2020

  50. arXiv:2107.01205  [pdf, other]

    cs.CV

    HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural Networks

    Authors: Jameel Malik, Soshi Shimada, Ahmed Elhayek, Sk Aziz Ali, Christian Theobalt, Vladislav Golyanik, Didier Stricker

    Abstract: 3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. Existing methods addressing it directly regress hand meshes via 2D convolutional neural networks, which leads to artefacts due to perspective distortions in the images. To address the limitations of the existing methods, we develop HandVoxNet++, i.e., a voxel-based dee…

    Submitted 5 December, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: 13 pages, 6 tables, 7 figures; project webpage: http://4dqv.mpi-inf.mpg.de/HandVoxNet++/. arXiv admin note: text overlap with arXiv:2004.01588

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021