Showing 1–17 of 17 results for author: Orts-Escolano, S

  1. arXiv:2404.01296  [pdf, other]

    cs.CV

    MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space

    Authors: Armand Comas-Massagué, Di Qiu, Menglei Chai, Marcel Bühler, Amit Raj, Ruiqi Gao, Qiangeng Xu, Mark Matthews, Paulo Gotardo, Octavia Camps, Sergio Orts-Escolano, Thabo Beeler

    Abstract: We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts to enhance user engagement and customization. Central to our approach are key innovations aimed at overcoming the challenges in photo-realistic avatar synthesis. Firstly, we utilize a conditional Neural Radiance Fields (NeRF) model, trained on a large-scale unannotated multi-view dataset, to…

    Submitted 1 April, 2024; originally announced April 2024.

  2. arXiv:2309.16859  [pdf, other]

    cs.CV cs.AI cs.LG

    Preface: A Data-driven Volumetric Prior for Few-shot Ultra High-resolution Face Synthesis

    Authors: Marcel C. Bühler, Kripasindhu Sarkar, Tanmay Shah, Gengyan Li, Daoye Wang, Leonhard Helminger, Sergio Orts-Escolano, Dmitry Lagun, Otmar Hilliges, Thabo Beeler, Abhimitra Meka

    Abstract: NeRFs have enabled highly realistic synthesis of human faces including complex appearance and reflectance effects of hair and skin. These methods typically require a large number of multi-view input images, making the process hardware intensive and cumbersome, limiting applicability to unconstrained settings. We propose a novel volumetric human face prior that enables the synthesis of ultra high-r…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

  3. arXiv:2305.04745  [pdf, other]

    cs.CV cs.GR

    Controllable Light Diffusion for Portraits

    Authors: David Futschik, Kelvin Ritland, James Vecore, Sean Fanello, Sergio Orts-Escolano, Brian Curless, Daniel Sýkora, Rohit Pandey

    Abstract: We introduce light diffusion, a novel method to improve lighting in portraits, softening harsh shadows and specular highlights while preserving overall scene illumination. Inspired by professional photographers' diffusers and scrims, our method softens lighting given only a single portrait photo. Previous portrait relighting approaches focus on changing the entire lighting environment, removing sh…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

    ACM Class: I.4.3

  4. arXiv:2304.01436  [pdf, other]

    cs.CV cs.GR

    Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos

    Authors: Ziqian Bai, Feitong Tan, Zeng Huang, Kripasindhu Sarkar, Danhang Tang, Di Qiu, Abhimitra Meka, Ruofei Du, Mingsong Dou, Sergio Orts-Escolano, Rohit Pandey, Ping Tan, Thabo Beeler, Sean Fanello, Yinda Zhang

    Abstract: We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild. The learnt avatar is driven by a parametric face model to achieve user-controlled facial expressions and head poses. Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism. To reduc…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: In CVPR2023. Project page: https://augmentedperception.github.io/monoavatar/

  5. arXiv:2201.04873  [pdf, other]

    cs.CV

    VoLux-GAN: A Generative Model for 3D Face Synthesis with HDRI Relighting

    Authors: Feitong Tan, Sean Fanello, Abhimitra Meka, Sergio Orts-Escolano, Danhang Tang, Rohit Pandey, Jonathan Taylor, Ping Tan, Yinda Zhang

    Abstract: We propose VoLux-GAN, a generative framework to synthesize 3D-aware faces with convincing relighting. Our main contribution is a volumetric HDRI relighting method that can efficiently accumulate albedo, diffuse and specular lighting contributions along each 3D ray for any desired HDR environmental map. Additionally, we show the importance of supervising the image decomposition process using multip…

    Submitted 13 January, 2022; originally announced January 2022.

  6. NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View Synthesis

    Authors: Zuria Bauer, Zuoyue Li, Sergio Orts-Escolano, Miguel Cazorla, Marc Pollefeys, Martin R. Oswald

    Abstract: Building upon the recent progress in novel view synthesis, we propose its application to improve monocular depth estimation. In particular, we propose a novel training method split in three main steps. First, the prediction results of a monocular depth network are warped to an additional view point. Second, we apply an additional image synthesis network, which corrects and improves the quality of…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 8 pages (main paper), 9 pages (supplementary material), 14 figures, 4 tables

    Journal ref: 2021 International Conference on 3D Vision (3DV)

  7. arXiv:2104.11776  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual 3D Environments

    Authors: Pablo Martinez-Gonzalez, Sergiu Oprea, John Alejandro Castro-Vargas, Alberto Garcia-Garcia, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Markus Vincze

    Abstract: Synthetic data generation has become essential in last years for feeding data-driven algorithms, which surpassed traditional techniques performance in almost every computer vision problem. Gathering and labelling the amount of data needed for these data-hungry models in the real world may become unfeasible and error-prone, while synthetic data give us the possibility of generating huge amounts of…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: Accepted at International Joint Conference on Neural Networks (IJCNN) 2021

  8. arXiv:2103.15017  [pdf, other]

    cs.CV cs.AI cs.LG

    H-GAN: the power of GANs in your Hands

    Authors: Sergiu Oprea, Giorgos Karvounas, Pablo Martinez-Gonzalez, Nikolaos Kyriazis, Sergio Orts-Escolano, Iason Oikonomidis, Alberto Garcia-Garcia, Aggeliki Tsoli, Jose Garcia-Rodriguez, Antonis Argyros

    Abstract: We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach implementing multi-scale perceptual discriminators. It is designed to translate synthetic images of hands to the real domain. Synthetic hands provide complete ground-truth annotations, yet they are not representative of the target distribution of real-world data. We strive to provide the perfect blend of a realistic hand…

    Submitted 21 April, 2021; v1 submitted 27 March, 2021; originally announced March 2021.

    Comments: Paper accepted at The International Joint Conference on Neural Networks (IJCNN) 2021

  9. arXiv:2008.03806  [pdf, other]

    cs.CV cs.GR

    Neural Light Transport for Relighting and View Synthesis

    Authors: Xiuming Zhang, Sean Fanello, Yun-Ta Tsai, Tiancheng Sun, Tianfan Xue, Rohit Pandey, Sergio Orts-Escolano, Philip Davidson, Christoph Rhemann, Paul Debevec, Jonathan T. Barron, Ravi Ramamoorthi, William T. Freeman

    Abstract: The light transport (LT) of a scene describes how it appears under different lighting and viewing directions, and complete knowledge of a scene's LT enables the synthesis of novel views under arbitrary lighting. In this paper, we focus on image-based LT acquisition, primarily for human bodies within a light stage setup. We propose a semi-parametric approach to learn a neural representation of LT t…

    Submitted 20 January, 2021; v1 submitted 9 August, 2020; originally announced August 2020.

    Comments: Camera-ready version for TOG 2021. Project Page: http://nlt.csail.mit.edu/

  10. arXiv:2004.05214  [pdf, other]

    cs.CV cs.LG eess.IV

    A Review on Deep Learning Techniques for Video Prediction

    Authors: Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Antonis Argyros

    Abstract: The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction. Defined as a self-supervised learning task, video prediction represents a suitable framework for representation learning, as it demo…

    Submitted 14 April, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Comments: Submitted to TPAMI

  11. arXiv:2003.14299  [pdf, other]

    cs.CV

    Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels

    Authors: Yinda Zhang, Neal Wadhwa, Sergio Orts-Escolano, Christian Häne, Sean Fanello, Rahul Garg

    Abstract: Computational stereo has reached a high level of accuracy, but degrades in the presence of occlusions, repeated textures, and correspondence errors along edges. We present a novel approach based on neural networks for depth estimation that combines stereo from dual cameras with stereo from a dual-pixel sensor, which is increasingly common on consumer cameras. Our network uses a novel architecture…

    Submitted 31 March, 2020; originally announced March 2020.

  12. A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments

    Authors: Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez

    Abstract: Interaction in virtual reality (VR) environments is essential to achieve a pleasant and immersive experience. Most of the currently existing VR applications, lack of robust object grasping and manipulation, which are the cornerstone of interactive systems. Therefore, we propose a realistic, flexible and robust grasping system that enables rich and real-time interactions in virtual environments. It…

    Submitted 12 March, 2019; originally announced March 2019.

  13. arXiv:1901.06514  [pdf, other]

    cs.CV cs.LG cs.RO

    The RobotriX: An eXtremely Photorealistic and Very-Large-Scale Indoor Dataset of Sequences with Robot Trajectories and Interactions

    Authors: Alberto Garcia-Garcia, Pablo Martinez-Gonzalez, Sergiu Oprea, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Alvaro Jover-Alvarez

    Abstract: Enter the RobotriX, an extremely photorealistic indoor dataset designed to enable the application of deep learning techniques to a wide variety of robotic vision problems. The RobotriX consists of hyperrealistic indoor scenes which are explored by robot agents which also interact with objects in a visually realistic manner in that simulated world. Photorealistic scenes and robots are rendered by U…

    Submitted 19 January, 2019; originally announced January 2019.

  14. arXiv:1901.06181  [pdf, other]

    cs.LG cs.RO stat.ML

    TactileGCN: A Graph Convolutional Network for Predicting Grasp Stability with Tactile Sensors

    Authors: Alberto Garcia-Garcia, Brayan Stiven Zapata-Impata, Sergio Orts-Escolano, Pablo Gil, Jose Garcia-Rodriguez

    Abstract: Tactile sensors provide useful contact data during the interaction with an object which can be used to accurately learn to determine the stability of a grasp. Most of the works in the literature represented tactile readings as plain feature vectors or matrix-like tactile images, using them to train machine learning models. In this work, we explore an alternative way of exploiting tactile informati…

    Submitted 18 January, 2019; originally announced January 2019.

  15. arXiv:1810.06936  [pdf, other]

    cs.RO cs.CV cs.MM

    UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation

    Authors: Pablo Martinez-Gonzalez, Sergiu Oprea, Alberto Garcia-Garcia, Alvaro Jover-Alvarez, Sergio Orts-Escolano, Jose Garcia-Rodriguez

    Abstract: Data-driven algorithms have surpassed traditional techniques in almost every aspect in robotic vision problems. Such algorithms need vast amounts of quality data to be able to work properly after their training process. Gathering and annotating that sheer amount of data in the real world is a time-consuming and error-prone task. Those problems limit scale and quality. Synthetic data generation has…

    Submitted 8 November, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: Published in Virtual Reality journal

  16. arXiv:1707.03742  [pdf, other]

    cs.HC cs.CV

    Large-scale Multiview 3D Hand Pose Dataset

    Authors: Francisco Gomez-Donoso, Sergio Orts-Escolano, Miguel Cazorla

    Abstract: Accurate hand pose estimation at joint level has several uses on human-robot interaction, user interfacing and virtual reality applications. Yet, it currently is not a solved problem. The novel deep learning techniques could make a great improvement on this matter but they need a huge amount of annotated data. The hand pose datasets released so far present some issues that make them impossible to…

    Submitted 18 July, 2017; v1 submitted 12 July, 2017; originally announced July 2017.

  17. arXiv:1704.06857  [pdf, other]

    cs.CV cs.AI

    A Review on Deep Learning Techniques Applied to Semantic Segmentation

    Authors: Alberto Garcia-Garcia, Sergio Orts-Escolano, Sergiu Oprea, Victor Villena-Martinez, Jose Garcia-Rodriguez

    Abstract: Image semantic segmentation is more and more being of interest for computer vision and machine learning researchers. Many applications on the rise need accurate and efficient segmentation mechanisms: autonomous driving, indoor navigation, and even virtual or augmented reality systems to name a few. This demand coincides with the rise of deep learning approaches in almost every field or application…

    Submitted 22 April, 2017; originally announced April 2017.

    Comments: Submitted to TPAMI on Apr. 22, 2017