Showing 1–7 of 7 results for author: Hayakawa, A

  1. arXiv:2405.17842  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

    Authors: Akio Hayakawa, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji

    Abstract: In this study, we aim to construct an audio-video generative model with minimal computational cost by leveraging pre-trained single-modal generative models for audio and video. To achieve this, we propose a novel method that guides each single-modal model to cooperatively generate well-aligned samples across modalities. Specifically, given two pre-trained base diffusion models, we train a lightwei…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2303.15780  [pdf, other]

    cs.CV

    Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion

    Authors: Hiromichi Kamata, Yuiko Sakuma, Akio Hayakawa, Masato Ishii, Takuya Narihira

    Abstract: We propose a high-quality 3D-to-3D conversion method, Instruct 3D-to-3D. Our method is designed for a novel task, which is to convert a given 3D scene to another scene according to text instructions. Instruct 3D-to-3D applies pretrained Image-to-Image diffusion models for 3D-to-3D conversion. This enables the likelihood maximization of each viewpoint image and high-quality 3D generation. In additi…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: Project page: https://sony.github.io/Instruct3Dto3D-doc/

  3. arXiv:2212.02024  [pdf, other]

    cs.CV cs.LG

    Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

    Authors: Naoki Matsunaga, Masato Ishii, Akio Hayakawa, Kenji Suzuki, Takuya Narihira

    Abstract: Our goal is to develop fine-grained real-image editing methods suitable for real-world applications. In this paper, we first summarize four requirements for these methods and propose a novel diffusion-based image editing framework with pixel-wise guidance that satisfies these requirements. Specifically, we train pixel-classifiers with a few annotated data and then infer the segmentation map of a t…

    Submitted 31 May, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

    Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2023

  4. arXiv:2102.06725  [pdf, other]

    cs.LG cs.CV

    Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives

    Authors: Takuya Narihira, Javier Alonsogarcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedmann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama

    Abstract: While there exist a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation on distributed setting, and compatibility between different tools. In this paper, we introduce Neural Network Libraries (https://nnabla.org), a deep learning framework designed from engineer's perspe…

    Submitted 21 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: https://nnabla.org

  5. arXiv:2011.12528  [pdf, other]

    cs.CV

    Reference-Based Video Colorization with Spatiotemporal Correspondence

    Authors: Naofumi Akimoto, Akio Hayakawa, Andrew Shin, Takuya Narihira

    Abstract: We propose a novel reference-based video colorization framework with spatiotemporal correspondence. Reference-based methods colorize grayscale frames referencing a user input color frame. Existing methods suffer from the color leakage between objects and the emergence of average colors, derived from non-local semantic correspondence in space. To address this issue, we warp colors only from the reg…

    Submitted 25 November, 2020; originally announced November 2020.

  6. arXiv:2010.14109  [pdf, other]

    cs.LG

    Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling

    Authors: Akio Hayakawa, Takuya Narihira

    Abstract: While large neural networks demonstrate higher performance in various tasks, training large networks is difficult due to limitations on GPU memory size. We propose a novel out-of-core algorithm that enables faster training of extremely large-scale neural networks with sizes larger than allotted GPU memory. Under a given memory budget constraint, our scheduling algorithm locally adapts the timing o…

    Submitted 27 October, 2020; originally announced October 2020.

  7. arXiv:1806.01694  [pdf]

    cs.CL

    Understanding Meanings in Multilingual Customer Feedback

    Authors: Chao-Hong Liu, Declan Groves, Akira Hayakawa, Alberto Poncelas, Qun Liu

    Abstract: Understanding and being able to react to customer feedback is the most fundamental task in providing good customer service. However, there are two major obstacles for international companies to automatically detect the meaning of customer feedback in a global multilingual environment. Firstly, there is no widely acknowledged categorisation (classes) of meaning for customer feedback. Secondly, the…

    Submitted 5 June, 2018; originally announced June 2018.