Showing 1–4 of 4 results for author: Zhao, B N

  1. arXiv:2312.14216  [pdf, other]

    cs.CV

    DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

    Authors: Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

    Abstract: The popularization of Text-to-Image (T2I) diffusion models enables the generation of high-quality images from text descriptions. However, generating diverse customized images with reference visual attributes remains challenging. This work focuses on personalizing T2I diffusion models at a more abstract concept or category level, adapting commonalities from a set of reference images while creating…

    Submitted 21 December, 2023; originally announced December 2023.

  2. arXiv:2309.05956  [pdf, other]

    cs.CV

    Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

    Abstract: We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object generation and contextually coherent background generation. To generate foreground objects, we employ a straightforward textual template,…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Code in https://github.com/gyhandy/Text2Image-for-Detection

  3. arXiv:2212.07629  [pdf, other]

    cs.CV

    EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet

    Abstract: We propose EM-PASTE: an Expectation Maximization (EM) guided Cut-Paste compositional dataset augmentation approach for weakly-supervised instance segmentation using only image-level supervision. The proposed method consists of three main components. The first component generates high-quality foreground object masks. To this end, an EM-like approach is proposed that iteratively refines an initial se…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: 15 pages (including appendix), 7 figures

  4. arXiv:2206.09592  [pdf, other]

    cs.CV

    DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

    Authors: Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

    Abstract: We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground object mask generation, we use a simple textual template with obje…

    Submitted 21 December, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: v3 (same as v2): update structure (add foreground generation, Stable Diffusion), add more experiments