Showing 1–21 of 21 results for author: You, M

  1. arXiv:2405.15364

    cs.CV

    NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer

    Authors: Meng You, Zhiyu Zhu, Hui Liu, Junhui Hou

    Abstract: By harnessing the potent generative capabilities of pre-trained large video diffusion models, we propose NVS-Solver, a new novel view synthesis (NVS) paradigm that operates without the need for training. NVS-Solver adaptively modulates the diffusion sampling process with the given views to enable the creation of remarkable visual experiences from single or multiple views of static scenes…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Technical Report

  2. arXiv:2404.01954

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  3. arXiv:2401.12998

    cs.CL cs.AI

    Evaluating and Enhancing Large Language Models Performance in Domain-specific Medicine: Osteoarthritis Management with DocOA

    Authors: Xi Chen, MingKe You, Li Wang, WeiZhi Liu, Yu Fu, Jie Xu, Shaoting Zhang, Gang Chen, Kang Li, Jian Li

    Abstract: The efficacy of large language models (LLMs) in domain-specific medicine, particularly for managing complex diseases such as osteoarthritis (OA), remains largely unexplored. This study focused on evaluating and enhancing the clinical capabilities of LLMs in specific domains, using OA management as a case study. A domain-specific benchmark framework was developed, which evaluate LL…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 16 Pages, 7 Figures

  4. arXiv:2307.10846

    cs.RO cs.AI

    Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning

    Authors: Zhifeng Qian, Mingyu You, Hongjun Zhou, Xuanhui Xu, Bin He

    Abstract: Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to spontaneously set diverse goals to learn a set of skills. Despite the excellent works proposed in various fields, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Current works tackled this problem by leveraging planning algorithms to plan intermediate subgoals to augment GCRL. Their methods need t…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted by 2023 RAL with ICRA

  5. arXiv:2304.01716

    cs.CV

    Decoupling Dynamic Monocular Videos for Dynamic View Synthesis

    Authors: Meng You, Junhui Hou

    Abstract: The challenge of dynamic view synthesis from dynamic monocular videos, i.e., synthesizing novel views for free viewpoints given a monocular video of a dynamic scene captured by a moving camera, mainly lies in accurately modeling the dynamic objects of a scene using limited 2D frames, each with a varying timestamp and viewpoint. Existing methods usually require pre-processed 2D optical flo…

    Submitted 30 May, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

  6. Model-free Optimization and Experimental Validation of RIS-assisted Wireless Communications under Rich Multipath Fading

    Authors: Tianrui Chen, Minglei You, Yangyishi Zhang, Gan Zheng, Jean Baptiste Gros, Geoffroy Lerosey, Youssef Nasser, Fraser Burton, Gabriele Gradoni

    Abstract: Reconfigurable intelligent surface (RIS) devices have emerged as an effective way to control the propagation channels for enhancing the end-users' performance. However, RIS optimization involves configuring the radio frequency response of a large number of radiating elements, which is challenging in real-world applications due to high computational complexity. In this paper, a model-free cross-ent…

    Submitted 15 February, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: accepted by IEEE Wireless Communications Letters

  7. arXiv:2209.05013

    cs.CV

    Learning A Locally Unified 3D Point Cloud for View Synthesis

    Authors: Meng You, Mantang Guo, Xianqiang Lyu, Hui Liu, Junhui Hou

    Abstract: In this paper, we explore the problem of 3D point cloud representation-based view synthesis from a set of sparse source views. To tackle this challenging problem, we propose a new deep learning-based view synthesis paradigm that learns a locally unified 3D point cloud from source views. Specifically, we first construct sub-point clouds by projecting source views to 3D space based on their depth ma…

    Submitted 30 September, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: Accepted to TIP

  8. arXiv:2209.01996

    cs.SD cs.CL eess.AS

    Bridging Music and Text with Crowdsourced Music Comments: A Sequence-to-Sequence Framework for Thematic Music Comments Generation

    Authors: Peining Zhang, Junliang Guo, Linli Xu, Mu You, Junming Yin

    Abstract: We consider a novel task of automatically generating text descriptions of music. Compared with other well-established text generation tasks such as image captioning, the scarcity of well-paired music and text datasets makes it a much more challenging task. In this paper, we exploit crowd-sourced music comments to construct a new dataset and propose a sequence-to-sequence model to generate text de…

    Submitted 5 September, 2022; originally announced September 2022.

  9. 3D Part Assembly Generation with Instance Encoded Transformer

    Authors: Rufeng Zhang, Tao Kong, Weihao Wang, Xuan Han, Mingyu You

    Abstract: It is desirable to enable robots capable of automatic assembly. Structural understanding of object parts plays a crucial role in this task yet remains relatively unexplored. In this paper, we focus on the setting of furniture assembly from a complete set of part geometries, which is essentially a 6-DoF part pose estimation problem. We propose a multi-layer transformer-based framework that involves…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: 8 pages, 7 figures

    Journal ref: IROS 2022 and IEEE Robotics and Automation Letters (RA-L), 2022

  10. Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning

    Authors: Zhifeng Qian, Mingyu You, Hongjun Zhou, Bin He

    Abstract: Goal-conditioned reinforcement learning is a crucial yet challenging algorithm which enables agents to achieve multiple user-specified goals when learning a set of skills in a dynamic environment. However, it typically requires millions of the environmental interactions explored by agents, which is sample-inefficient. In the paper, we propose a skill learning framework DR-GRL that aims to improve…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: 8 pages; Accepted by RAL with ICRA 2022

    Journal ref: IEEE RAL 2022

  11. Digital Twins based Day-ahead Integrated Energy System Scheduling under Load and Renewable Energy Uncertainties

    Authors: Minglei You, Qian Wang, Hongjian Sun, Ivan Castro, Jing Jiang

    Abstract: By constructing digital twins (DT) of an integrated energy system (IES), one can benefit from DT's predictive capabilities to improve coordination among various energy converters, hence enhancing energy efficiency, cost savings and carbon emission reduction. This paper is motivated by the fact that practical IESs suffer from multiple uncertainty sources and a complicated surrounding environment. T…

    Submitted 29 September, 2021; originally announced September 2021.

    Comments: 28 pages, 8 figures, journal paper accepted by Applied Energy

    Journal ref: Applied Energy, 2022, vol 305, 117899

  12. arXiv:2109.07819

    cs.IT eess.SP

    Model-driven Learning for Generic MIMO Downlink Beamforming With Uplink Channel Information

    Authors: Juping Zhang, Minglei You, Gan Zheng, Ioannis Krikidis, Liqiang Zhao

    Abstract: Accurate downlink channel information is crucial to the beamforming design, but it is difficult to obtain in practice. This paper investigates a deep learning-based optimization approach of the downlink beamforming to maximize the system sum rate, when only the uplink channel information is available. Our main contribution is to propose a model-driven learning technique that exploits the structure…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: Accepted in IEEE Transactions on Wireless Communications

  13. MeSIN: Multilevel Selective and Interactive Network for Medication Recommendation

    Authors: Yang An, Liang Zhang, Mao You, Xueqing Tian, Bo Jin, Xiaopeng Wei

    Abstract: Recommending medications for patients using electronic health records (EHRs) is a crucial data mining task for an intelligent healthcare system. It can assist doctors in making clinical decisions more efficiently. However, the inherent complexity of the EHR data renders it as a challenging task: (1) Multilevel structures: the EHR data typically contains multilevel structures which are closely rela…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: 15 pages, 6 figures

  14. arXiv:2012.11187

    cs.CV

    Diverse Knowledge Distillation for End-to-End Person Search

    Authors: Xinyu Zhang, Xinlong Wang, Jia-Wang Bian, Chunhua Shen, Mingyu You

    Abstract: Person search aims to localize and identify a specific person from a gallery of images. Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches. The former views person search as two independent tasks and achieves dominant results using separately trained person detection and re-identification (Re-ID) models. The latter performs person search in an end-to-end fa…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: Accepted to AAAI, 2021. Code is available at: https://git.io/DKD-PersonSearch

  15. arXiv:2003.11712

    cs.CV

    Mask Encoding for Single Shot Instance Segmentation

    Authors: Rufeng Zhang, Zhi Tian, Chunhua Shen, Mingyu You, Youliang Yan

    Abstract: To date, instance segmentation is dominated by two-stage methods, as pioneered by Mask R-CNN. In contrast, one-stage alternatives cannot compete with Mask R-CNN in mask AP, mainly due to the difficulty of compactly representing masks, making the design of one-stage methods very challenging. In this work, we propose a simple single-shot instance segmentation framework, termed mask encoding based inst…

    Submitted 6 May, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: Accepted to Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2020

  16. Deep Learning Enabled Optimization of Downlink Beamforming Under Per-Antenna Power Constraints: Algorithms and Experimental Demonstration

    Authors: Juping Zhang, Wenchao Xia, Minglei You, Gan Zheng, Sangarapillai Lambotharan, Kai-Kit Wong

    Abstract: This paper studies fast downlink beamforming algorithms using deep learning in multiuser multiple-input-single-output systems where each transmit antenna at the base station has its own power constraint. We focus on the signal-to-interference-plus-noise ratio (SINR) balancing problem which is quasi-convex but there is no efficient solution available. We first design a fast subgradient algorithm th…

    Submitted 28 February, 2020; originally announced February 2020.

    Comments: This paper was accepted for publication in IEEE Transactions on Wireless Communications

  17. arXiv:1909.06023

    cs.CV

    Part-Guided Attention Learning for Vehicle Instance Retrieval

    Authors: Xinyu Zhang, Rufeng Zhang, Jiewei Cao, Dong Gong, Mingyu You, Chunhua Shen

    Abstract: Vehicle instance retrieval often requires one to recognize the fine-grained visual differences between vehicles. Besides the holistic appearance of vehicles which is easily affected by the viewpoint variation and distortion, vehicle parts also provide crucial cues to differentiate near-identical vehicles. Motivated by these observations, we introduce a Part-Guided Attention Network (PGAN) to pinpo…

    Submitted 26 September, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

    Comments: 12 pages

  18. arXiv:1907.13315

    cs.CV

    Self-training with progressive augmentation for unsupervised cross-domain person re-identification

    Authors: Xinyu Zhang, Jiewei Cao, Chunhua Shen, Mingyu You

    Abstract: Person re-identification (Re-ID) has achieved great improvement with deep learning and a large amount of labelled training data. However, it remains a challenging task for adapting a model trained in a source domain of labelled data to a target domain of only unlabelled data available. In this work, we develop a self-training method with progressive augmentation framework (PAST) to promote the mod…

    Submitted 31 July, 2019; originally announced July 2019.

    Comments: Accepted to Proc. Int. Conf. Computer Vision, 2019. Code is available at: https://tinyurl.com/PASTReID

  19. arXiv:1809.06551

    cs.CY

    Block Chain based Intelligent Industrial Network (DSDIN)

    Authors: Barco You, Matthias Hub, Mengzhe You, Bo Xu, Mingzhi Yu, Ivan Uemlianin

    Abstract: The manufacturing industry featured centralization in the past due to technical limitations, and factories (especially large manufacturers) gathered almost all of the resources for manufacturing, including: technologies, raw materials, equipment, workers, market information, etc. However, such centralized production is costly, inefficient and inflexible, and difficult to respond to rapidly changin…

    Submitted 18 September, 2018; originally announced September 2018.

  20. arXiv:1707.03124

    cs.CV

    Adversarial Generation of Training Examples: Applications to Moving Vehicle License Plate Recognition

    Authors: Xinlong Wang, Zhipeng Man, Mingyu You, Chunhua Shen

    Abstract: Generative Adversarial Networks (GAN) have attracted much research attention recently, leading to impressive results for natural image generation. However, to date little success was observed in using GAN generated images for improving classification tasks. Here we attempt to explore, in the context of car license plate recognition, whether it is possible to generate synthetic training data using…

    Submitted 10 November, 2017; v1 submitted 11 July, 2017; originally announced July 2017.

  21. arXiv:1612.03882

    cs.IT

    Unified Framework for the Effective Rate Analysis of Wireless Communication Systems over MISO Fading Channels

    Authors: Minglei You, Hongjian Sun, Jing Jiang, Jiayi Zhang

    Abstract: This paper proposes a unified framework for the effective rate analysis over arbitrary correlated and not necessarily identical multiple inputs single output (MISO) fading channels, which uses moment generating function (MGF) based approach and H transform representation. The proposed framework has the potential to simplify the cumbersome analysis procedure compared to the probability density func…

    Submitted 12 December, 2016; originally announced December 2016.