Showing 1–26 of 26 results for author: Nagano, K

  1. arXiv:2405.00794  [pdf, other]

    cs.CV

    Coherent 3D Portrait Video Reconstruction via Triplane Fusion

    Authors: Shengze Wang, Xueting Li, Chao Liu, Matthew Chan, Michael Stengel, Josef Spjut, Henry Fuchs, Shalini De Mello, Koki Nagano

    Abstract: Recent breakthroughs in single-image 3D portrait reconstruction have enabled telepresence systems to stream 3D portrait videos from a single camera in real-time, potentially democratizing telepresence. However, per-frame 3D reconstruction exhibits temporal inconsistency and forgets the user's appearance. On the other hand, self-reenactment methods can render coherent 3D portraits by driving a pers…

    Submitted 1 May, 2024; originally announced May 2024.

  2. arXiv:2405.00196  [pdf, other]

    cs.CV

    Synthetic Image Verification in the Era of Generative AI: What Works and What Isn't There Yet

    Authors: Diangarti Tariang, Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, Luisa Verdoliva

    Abstract: In this work we present an overview of approaches for the detection and attribution of synthetic images and highlight their strengths and weaknesses. We also point out and discuss hot topics in this field and outline promising directions for future research.

    Submitted 30 April, 2024; originally announced May 2024.

  3. arXiv:2401.02411  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs

    Authors: Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

    Abstract: 3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries of scenes from collections of 2D images via neural volume rendering. Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with p…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: See our project page: https://research.nvidia.com/labs/nxp/wysiwyg/

  4. arXiv:2312.11461  [pdf, other]

    cs.CV cs.GR cs.LG

    GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

    Authors: Ye Yuan, Xueting Li, Yangyi Huang, Shalini De Mello, Koki Nagano, Jan Kautz, Umar Iqbal

    Abstract: Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations. In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions, addressing the limitations (e.g., flexibility and efficiency) imposed by mesh or NeRF-based representations. However, a n…

    Submitted 29 March, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Project website: https://nvlabs.github.io/GAvatar

  5. arXiv:2311.16854  [pdf, other]

    cs.CV

    A Unified Approach for Text- and Image-guided 4D Scene Generation

    Authors: Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello

    Abstract: Large-scale diffusion generative models are greatly simplifying image, video and 3D asset creation from user-provided text prompts and images. However, the challenging problem of text-to-4D dynamic 3D scene generation with diffusion guidance remains largely unexplored. We propose Dream-in-4D, which features a novel two-stage approach for text-to-4D synthesis, leveraging (1) 3D and 2D diffusion gui…

    Submitted 7 May, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project page: https://research.nvidia.com/labs/nxp/dream-in-4d/

  6. arXiv:2309.12428  [pdf, other]

    cs.CV

    Synthetic Image Detection: Highlights from the IEEE Video and Image Processing Cup 2022 Student Competition

    Authors: Davide Cozzolino, Koki Nagano, Lucas Thomaz, Angshul Majumdar, Luisa Verdoliva

    Abstract: The Video and Image Processing (VIP) Cup is a student competition that takes place each year at the IEEE International Conference on Image Processing. The 2022 IEEE VIP Cup asked undergraduate students to develop a system capable of distinguishing pristine images from generated ones. The interest in this topic stems from the incredible advances in the AI-based generation of visual data, with tools…

    Submitted 21 September, 2023; originally announced September 2023.

  7. arXiv:2306.08768  [pdf, other]

    cs.CV

    Generalizable One-shot Neural Head Avatar

    Authors: Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz

    Abstract: We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image. Existing methods either involve time-consuming optimization for a specific person with multiple images, or they struggle to synthesize intricate appearance details beyond the facial region. To address these limitations, we propose a framework that not only generalizes to unseen identities based o…

    Submitted 14 June, 2023; originally announced June 2023.

  8. arXiv:2305.03713  [pdf, other]

    cs.CV

    Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos

    Authors: Ekta Prashnani, Koki Nagano, Shalini De Mello, David Luebke, Orazio Gallo

    Abstract: Modern generators render talking-head videos with impressive photorealism, ushering in new user experiences such as videoconferencing under constrained bandwidth budgets. Their safe adoption, however, requires a mechanism to verify if the rendered video is trustworthy. For instance, for videoconferencing we must identify cases in which a synthetic video portrait uses the appearance of an individua…

    Submitted 12 September, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: 13 pages, 6 figures

  9. Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization

    Authors: Connor Z. Lin, Koki Nagano, Jan Kautz, Eric R. Chan, Umar Iqbal, Leonidas Guibas, Gordon Wetzstein, Sameh Khamis

    Abstract: There is a growing demand for the accessible creation of high-quality 3D avatars that are animatable and customizable. Although 3D morphable models provide intuitive control for editing and animation, and robustness for single-view face reconstruction, they cannot easily capture geometric and appearance details. Methods based on neural implicit representations, such as signed distance functions (S…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: SIGGRAPH 2023, Project Page: https://research.nvidia.com/labs/toronto-ai/ssif

  10. arXiv:2305.02310  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Real-Time Radiance Fields for Single-Image Portrait View Synthesis

    Authors: Alex Trevithick, Matthew Chan, Michael Stengel, Eric R. Chan, Chao Liu, Zhiding Yu, Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

    Abstract: We present a one-shot method to infer and render a photorealistic 3D representation from a single unposed image (e.g., face portrait) in real-time. Given a single RGB input, our image encoder directly predicts a canonical triplane representation of a neural radiance field for 3D-aware novel view synthesis via volume rendering. Our method is fast (24 fps) on consumer hardware, and produces higher q…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: Project page: https://research.nvidia.com/labs/nxp/lp3d/

  11. arXiv:2304.06408  [pdf, other]

    cs.CV

    Intriguing properties of synthetic images: from generative adversarial networks to diffusion models

    Authors: Riccardo Corvi, Davide Cozzolino, Giovanni Poggi, Koki Nagano, Luisa Verdoliva

    Abstract: Detecting fake images is becoming a major goal of computer vision. This need is becoming more and more pressing with the continuous improvement of synthesis methods based on Generative Adversarial Networks (GAN), and even more with the appearance of powerful methods based on Diffusion Models (DM). Towards this end, it is important to gain insight into which image features better discriminate fake…

    Submitted 29 June, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  12. arXiv:2304.02602  [pdf, other]

    cs.CV cs.AI cs.GR

    Generative Novel View Synthesis with 3D-Aware Diffusion Models

    Authors: Eric R. Chan, Koki Nagano, Matthew A. Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein

    Abstract: We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image. Our model samples from the distribution of possible renderings consistent with the input and, even in the presence of ambiguity, is capable of rendering diverse and plausible novel views. To achieve this, our method makes use of existing 2D diffusion backbones but, crucially, incorp…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: Project page: https://nvlabs.github.io/genvs

  13. arXiv:2212.03237  [pdf, other]

    cs.CV

    RANA: Relightable Articulated Neural Avatars

    Authors: Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

    Abstract: We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting. We only require a short video clip of the person to create the avatar and assume no knowledge about the lighting environment. We present a novel framework to model humans while disentangling their geometry, texture, and also lighting environm…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: project page: https://nvlabs.github.io/RANA/

  14. arXiv:2211.00680  [pdf, other]

    cs.CV

    On the detection of synthetic images generated by diffusion models

    Authors: Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, Luisa Verdoliva

    Abstract: Over the past decade, there has been tremendous progress in creating synthetic media, mainly thanks to the development of powerful methods based on generative adversarial networks (GAN). Very recently, methods based on diffusion models (DM) have been gaining the spotlight. In addition to providing an impressive level of photorealism, they enable the creation of text-based visual content, opening u…

    Submitted 1 November, 2022; originally announced November 2022.

  15. arXiv:2209.10510  [pdf, other]

    cs.CV cs.GR cs.LG

    Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation

    Authors: Yu-Ying Yeh, Koki Nagano, Sameh Khamis, Jan Kautz, Ming-Yu Liu, Ting-Chun Wang

    Abstract: Given a portrait image of a person and an environment map of the target lighting, portrait relighting aims to re-illuminate the person in the image as if the person appeared in an environment with the target lighting. To achieve high-quality results, recent methods rely on deep learning. An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desi…

    Submitted 10 August, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

    Comments: To appear in ACM Transactions on Graphics (SIGGRAPH Asia 2022). 21 pages, 25 figures, 7 tables. Project page: https://research.nvidia.com/labs/dir/lumos/

    Journal ref: ACM Trans. Graph. 41, 6, Article 231 (December 2022), 21 pages

  16. arXiv:2205.07058  [pdf, other]

    cs.CV

    RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

    Authors: Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Thomas Müller, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

    Abstract: We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels). The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis, thus providing a large unified benchmark for both training and evaluation. Using 4 distinct…

    Submitted 24 October, 2022; v1 submitted 14 May, 2022; originally announced May 2022.

    Comments: ECCV 2022 Workshop on Learning to Generate 3D Shapes and Scenes. Project page at http://www.cs.umd.edu/~mmeshry/projects/rtmv

  17. arXiv:2203.15798  [pdf, other]

    cs.CV

    DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

    Authors: Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

    Abstract: Acquisition and creation of digital human avatars is an important problem with applications to virtual telepresence, gaming, and human modeling. Most contemporary approaches for avatar generation can be viewed either as 3D-based methods, which use multi-view data to learn a 3D representation with appearance (such as a mesh, implicit surface, or volume), or 2D-based methods which learn photo-realis…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: Project page at https://dracon-avatars.github.io/

  18. arXiv:2203.13964  [pdf, other]

    cs.CV

    Fusing Global and Local Features for Generalized AI-Synthesized Image Detection

    Authors: Yan Ju, Shan Jia, Lipeng Ke, Hongfei Xue, Koki Nagano, Siwei Lyu

    Abstract: With the development of the Generative Adversarial Networks (GANs) and DeepFakes, AI-synthesized images are now of such high quality that humans can hardly distinguish them from real images. It is imperative for media forensics to develop detectors to expose them accurately. Existing detection methods have shown high performance in generated images detection, but they tend to generalize poorly in…

    Submitted 22 November, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: 5 pages, 3 figures, 2 tables

  19. arXiv:2112.07945  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Efficient Geometry-aware 3D Generative Adversarial Networks

    Authors: Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, Gordon Wetzstein

    Abstract: Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using only collections of single-view 2D photographs has been a long-standing challenge. Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent; the former limits quality and resolution of the generated images and the latter adversely affects multi-view consistency and shape…

    Submitted 27 April, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Project page: https://matthew-a-chan.github.io/EG3D

  20. arXiv:2112.01741  [pdf, other]

    cs.CV cs.GR cs.LG

    Frame Averaging for Equivariant Shape Space Learning

    Authors: Matan Atzmon, Koki Nagano, Sanja Fidler, Sameh Khamis, Yaron Lipman

    Abstract: The task of shape space learning involves mapping a train set of shapes to and from a latent representation space with good generalization properties. Often, real-world collections of shapes have symmetries, which can be defined as transformations that do not change the essence of the shape. A natural way to incorporate symmetries in shape space learning is to ask that the mapping to the shape spa…

    Submitted 26 August, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted to CVPR 2022

  21. arXiv:2106.11423  [pdf, other]

    cs.CV cs.GR

    Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

    Authors: Huiwen Luo, Koki Nagano, Han-Wei Kung, Mclean Goldwhite, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li

    Abstract: We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. While the input image can be of a smiling person or taken in extreme lighting conditions, our method can reliably produce a high-quality textured model of a person's face in neutral expression and skin textures under diffuse lighting condition. Cutting-edge 3D face…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted to CVPR 2021

  22. arXiv:2103.16879  [pdf, other]

    cs.GT econ.TH math.OC

    Optimal class assignment problem: a case study at Gunma University

    Authors: Akifumi Kira, Kiyohito Nagano, Manabu Sugiyama, Naoyuki Kamiyama

    Abstract: In this study, we consider the real-world problem of assigning students to classes, where each student has a preference list, ranking a subset of classes in order of preference. Though we use existing approaches to include the daily class assignment of Gunma University, new concepts and adjustments are required to find improved results depending on real instances in the field. Thus, we propose min…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: 15 pages

    MSC Class: 90C27; 91A80

  23. arXiv:2004.12452  [pdf, other]

    cs.CV

    One-Shot Identity-Preserving Portrait Reenactment

    Authors: Sitao Xiang, Yuming Gu, Pengda Xiang, Mingming He, Koki Nagano, Haiwei Chen, Hao Li

    Abstract: We present a deep learning-based framework for portrait reenactment from a single picture of a target (one-shot) and a video of a driving subject. Existing facial reenactment methods suffer from identity mismatch and produce inconsistent identities when a target and a driving subject are different (cross-subject), especially in one-shot settings. In this work, we aim to address identity preservati…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: 29 pages, 14 figures

  24. arXiv:1908.09135  [pdf, ps, other]

    cs.DS

    Subadditive Load Balancing

    Authors: Kiyohito Nagano, Akihiro Kishimoto

    Abstract: Set function optimization is essential in AI and machine learning. We focus on a subadditive set function that generalizes submodularity, and examine the subadditivity of non-submodular functions. We also deal with a minimax subadditive load balancing problem, and present a modularization-minimization algorithm that theoretically guarantees a worst-case approximation factor. In addition, we give a…

    Submitted 24 August, 2019; originally announced August 2019.

    Comments: 17 pages, 3 figures

    MSC Class: 90C27; 68W25

  25. arXiv:1612.00523  [pdf, other]

    cs.CV cs.GR

    Photorealistic Facial Texture Inference Using Deep Neural Networks

    Authors: Shunsuke Saito, Lingyu Wei, Liwen Hu, Koki Nagano, Hao Li

    Abstract: We present a data-driven inference method that can synthesize a photorealistic texture map of a complete 3D face model given a partial 2D view of a person in the wild. After an initial estimation of shape and low-frequency albedo, we compute a high-frequency partial texture map, without the shading component, of the visible face area. To extract the fine appearance details from this incomplete inp…

    Submitted 1 December, 2016; originally announced December 2016.

  26. arXiv:1309.6850  [pdf]

    cs.LG cs.DS stat.ML

    Structured Convex Optimization under Submodular Constraints

    Authors: Kiyohito Nagano, Yoshinobu Kawahara

    Abstract: A number of discrete and continuous optimization problems in machine learning are related to convex minimization problems under submodular constraints. In this paper, we deal with a submodular function with a directed graph structure, and we show that a wide range of convex optimization problems under submodular constraints can be solved much more efficiently than general submodular optimization m…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-459-468