Showing 1–5 of 5 results for author: Zhi, P

  1. arXiv:2404.10220

    cs.RO cs.AI cs.CV cs.LG

    Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

    Authors: Peiyuan Zhi, Zhiyuan Zhang, Muzhi Han, Zeyu Zhang, Zhitian Li, Ziyuan Jiao, Baoxiong Jia, Siyuan Huang

    Abstract: Autonomous robot navigation and manipulation in open environments require reasoning and replanning with closed-loop feedback. We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. We meticulously construct a library of action primitives for robot exploration, navigation, a…

    Submitted 15 April, 2024; originally announced April 2024.

  2. arXiv:2404.09465

    cs.CV cs.AI cs.LG cs.RO

    PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI

    Authors: Yandan Yang, Baoxiong Jia, Peiyuan Zhi, Siyuan Huang

    Abstract: With recent developments in Embodied Artificial Intelligence (EAI) research, there has been a growing demand for high-quality, large-scale interactive scene generation. While prior methods in scene synthesis have prioritized the naturalness and realism of the generated scenes, the physical plausibility and interactivity of scenes have been largely left unexplored. To address this disparity, we int…

    Submitted 9 July, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024 (Highlight), 18 pages

  3. arXiv:2312.16902

    cs.CV

    Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation

    Authors: Kaiyue Zhou, Ming Dong, Peiyuan Zhi, Shengjin Wang

    Abstract: Numerous point-cloud understanding techniques focus on whole entities and have succeeded in obtaining satisfactory results and limited sparsity tolerance. However, these methods are generally sensitive to incomplete point clouds that are scanned with flaws or large gaps. To address this issue, in this paper, we propose an end-to-end architecture that compensates for and identifies partial point cl…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Currently under review. Previously submitted to AAAI and got frustrated. Decisions: 1x weak reject, 2x weak accept, and 1 accept

  4. arXiv:2206.11141

    cs.RO cs.AI cs.CV

    Hybrid Physical Metric For 6-DoF Grasp Pose Detection

    Authors: Yuhao Lu, Beixing Deng, Zhenyu Wang, Peiyuan Zhi, Yali Li, Shengjin Wang

    Abstract: 6-DoF grasp pose detection of multi-grasp and multi-object is a challenge task in the field of intelligent robot. To imitate human reasoning ability for grasping objects, data driven methods are widely studied. With the introduction of large-scale datasets, we discover that a single physical metric usually generates several discrete levels of grasp confidence scores, which cannot finely distinguis…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: 7 pages, 7 figures, accepted by ICRA 2022

  5. arXiv:2111.01396

    cs.CV

    Boundary Distribution Estimation for Precise Object Detection

    Authors: Peng Zhi, Haoran Zhou, Hang Huang, Rui Zhao, Rui Zhou, Qingguo Zhou

    Abstract: In the field of state-of-the-art object detection, the task of object localization is typically accomplished through a dedicated subnet that emphasizes bounding box regression. This subnet traditionally predicts the object's position by regressing the box's center position and scaling factors. Despite the widespread adoption of this approach, we have observed that the localization results often su…

    Submitted 19 July, 2023; v1 submitted 2 November, 2021; originally announced November 2021.