Showing 1–32 of 32 results for author: Geng, H

  1. arXiv:2407.04689  [pdf, other]

    cs.RO cs.CV

    RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation

    Authors: Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang

    Abstract: This work proposes a retrieve-and-transfer framework for zero-shot robotic manipulation, dubbed RAM, featuring generalizability across various objects, environments, and embodiments. Unlike existing approaches that learn manipulation from expensive in-domain demonstrations, RAM capitalizes on a retrieval-based affordance transfer paradigm to acquire versatile manipulation capabilities from abundan…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2407.02263  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM quant-ph

    FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Field

    Authors: Shihao Shao, Haoran Geng, Qinghua Cui

    Abstract: The Clebsch-Gordan Transform (CG transform) effectively encodes many-body interactions. Many studies have proven its accuracy in depicting atomic environments, although this comes with high computational needs. The computational burden of this challenge is hard to reduce due to the need for permutation equivariance, which limits the design space of the CG transform layer. We show that, implementin…

    Submitted 14 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  3. arXiv:2406.19776  [pdf, other]

    cs.MM cs.IR

    MDF: A Dynamic Fusion Model for Multi-modal Fake News Detection

    Authors: Hongzhen Lv, Wenzhong Yang, Fuyuan Wei, Jiaren Peng, Haokun Geng

    Abstract: Fake news detection has received increasing attention from researchers in recent years, especially multi-modal fake news detection containing both text and images. However, many previous works have fed two modal features, text and image, into a binary classifier after a simple concatenation or attention mechanism, in which the features contain a large amount of noise inherent in the data, which in…

    Submitted 28 June, 2024; originally announced June 2024.

  4. arXiv:2404.17521  [pdf, other]

    cs.RO cs.CV

    Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations

    Authors: Puhao Li, Tengyu Liu, Yuyang Li, Muzhi Han, Haoran Geng, Shu Wang, Yixin Zhu, Song-Chun Zhu, Siyuan Huang

    Abstract: Autonomous robotic systems capable of learning novel manipulation tasks are poised to transform industries from manufacturing to service automation. However, modern methods (e.g., VIP and R3M) still face significant hurdles, notably the domain gap among robotic embodiments and the sparsity of successful task executions within specific action spaces, resulting in misaligned and ambiguous task repre…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Project website and open-source code: https://xiaoyao-li.github.io/research/ag2manip

  5. arXiv:2402.17766  [pdf, other]

    cs.CV

    ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

    Authors: Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma

    Abstract: This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring a universal 3D object understanding with 3D point clouds and languages. ShapeLLM is built upon an improved 3D encoder by extending ReCon to ReCon++ that benefits from multi-view image distillation for enhanced geometry understanding. By utilizing ReCon++ as the 3D point clo…

    Submitted 12 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at ECCV 2024

  6. arXiv:2402.15824  [pdf, other]

    cs.CR cs.AR

    A New Secure Memory System for Efficient Data Protection and Access Pattern Obfuscation

    Authors: Haoran Geng, Yuezhi Che, Aaron Dingler, Michael Niemier, Xiaobo Sharon Hu

    Abstract: As the reliance on secure memory environments permeates across applications, memory encryption is used to ensure memory security. However, most effective encryption schemes, such as the widely used AES-CTR, inherently introduce extra overheads, including those associated with counter storage and version number integrity checks. Moreover, encryption only protects data content, and it does not fully…

    Submitted 24 February, 2024; originally announced February 2024.

  7. arXiv:2402.09553  [pdf, other]

    cs.AI cs.LG stat.ML

    Statistical and Machine Learning Models for Predicting Fire and Other Emergency Events

    Authors: Dilli Prasad Sharma, Nasim Beigi-Mohammadi, Hongxiang Geng, Dawn Dixon, Rob Madro, Phil Emmenegger, Carlos Tobar, Jeff Li, Alberto Leon-Garcia

    Abstract: Emergency events in a city cause considerable economic loss to individuals, their families, and the community. Accurate and timely prediction of events can help the emergency fire and rescue services in preparing for and mitigating the consequences of emergency events. In this paper, we present a systematic development of predictive models for various types of emergency events in the City of Edmon…

    Submitted 14 February, 2024; originally announced February 2024.

    Journal ref: IEEE Access 12 (2024) 56880-56909

  8. arXiv:2401.05459  [pdf, other]

    cs.HC cs.AI cs.SE

    Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

    Authors: Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

    Abstract: Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one of the key technologies that researchers and engineers have focused on, aiming to help users efficiently obtain information and execute tasks, and provide users with more intelligent, convenient, and rich interaction experiences. With the development of smartphones and IoT, computing and sensing de…

    Submitted 8 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: https://github.com/MobileLLM/Personal_LLM_Agents_Survey

  9. arXiv:2312.16217  [pdf, other]

    cs.CV cs.RO

    ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

    Authors: Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong

    Abstract: Robot manipulation relies on accurately predicting contact points and end-effector directions to ensure successful operation. However, learning-based robot manipulation, trained on a limited category within a simulator, often struggles to achieve generalizability, especially when confronted with extensive categories. Therefore, we introduce an innovative approach for robot manipulation that levera…

    Submitted 24 December, 2023; originally announced December 2023.

  10. arXiv:2312.01307  [pdf, other]

    cs.RO cs.CV

    SAGE: Bridging Semantic and Actionable Parts for GEneralizable Manipulation of Articulated Objects

    Authors: Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas

    Abstract: To interact with daily-life articulated objects of diverse structures and functionalities, understanding the object parts plays a central role in both user instruction comprehension and task execution. However, the possible discordance between the semantic meaning and physics functionalities of the parts poses a challenge for designing a general system. To address this problem, we propose SAGE, a…

    Submitted 30 March, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

  11. arXiv:2311.09376  [pdf, other]

    cs.NE

    DISTA: Denoising Spiking Transformer with intrinsic plasticity and spatiotemporal attention

    Authors: Boxun Xu, Hejia Geng, Yuxuan Yin, Peng Li

    Abstract: Among the array of neural network architectures, the Vision Transformer (ViT) stands out as a prominent choice, acclaimed for its exceptional expressiveness and consistent high performance in various vision applications. Recently, the emerging Spiking ViT approach has endeavored to harness spiking neurons, paving the way for a more brain-inspired transformer architecture that thrives in ultra-low…

    Submitted 15 November, 2023; originally announced November 2023.

  12. arXiv:2311.07633  [pdf, other]

    cs.LG cs.AI math.OC

    Rethinking and Benchmarking Predict-then-Optimize Paradigm for Combinatorial Optimization Problems

    Authors: Haoyu Geng, Hang Ruan, Runzhong Wang, Yang Li, Yang Wang, Lei Chen, Junchi Yan

    Abstract: Numerous web applications rely on solving combinatorial optimization problems, such as energy cost-aware scheduling, budget allocation on web advertising, and graph matching on social networks. However, many optimization problems involve unknown coefficients, and improper predictions of these factors may lead to inferior decisions which may cause energy wastage, inefficient resource allocation, in…

    Submitted 19 November, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

  13. arXiv:2311.02787  [pdf, other]

    cs.RO cs.AI

    Make a Donut: Hierarchical EMD-Space Planning for Zero-Shot Deformable Manipulation with Tools

    Authors: Yang You, Bokui Shen, Congyue Deng, Haoran Geng, Songlin Wei, He Wang, Leonidas Guibas

    Abstract: Deformable object manipulation stands as one of the most captivating yet formidable challenges in robotics. While previous techniques have predominantly relied on learning latent dynamics through demonstrations, typically represented as either particles or images, there exists a pertinent limitation: acquiring suitable demonstrations, especially for long-horizon tasks, can be elusive. Moreover, ba…

    Submitted 24 March, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: 8 pages

  14. arXiv:2310.01441  [pdf, other]

    cs.CL cs.AI

    UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities

    Authors: Hejia Geng, Boxun Xu, Peng Li

    Abstract: Large Language Models (LLMs) have demonstrated impressive inferential capabilities, with numerous research endeavors devoted to enhancing this capacity through prompting. Despite these efforts, a unified epistemological foundation is still conspicuously absent. Drawing inspiration from Kant's a priori philosophy, we propose the UPAR prompting framework, designed to emulate the structure of human c…

    Submitted 6 December, 2023; v1 submitted 30 September, 2023; originally announced October 2023.

  15. arXiv:2308.10373  [pdf, other]

    cs.NE cs.CR cs.CV cs.LG

    HoSNN: Adversarially-Robust Homeostatic Spiking Neural Networks with Adaptive Firing Thresholds

    Authors: Hejia Geng, Peng Li

    Abstract: While spiking neural networks (SNNs) offer a promising neurally-inspired model of computation, they are vulnerable to adversarial attacks. We present the first study that draws inspiration from neural homeostasis to design a threshold-adapting leaky integrate-and-fire (TA-LIF) neuron model and utilize TA-LIF neurons to construct the adversarially robust homeostatic SNNs (HoSNNs) for improved robus…

    Submitted 31 May, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

  16. arXiv:2308.02648  [pdf, other]

    cs.CR cs.AR

    Privacy Preserving In-memory Computing Engine

    Authors: Haoran Geng, Jianqiao Mo, Dayane Reis, Jonathan Takeshita, Taeho Jung, Brandon Reagen, Michael Niemier, Xiaobo Sharon Hu

    Abstract: Privacy has rapidly become a major concern/design consideration. Homomorphic Encryption (HE) and Garbled Circuits (GC) are privacy-preserving techniques that support computations on encrypted data. HE and GC can complement each other, as HE is more efficient for linear operations, while GC is more effective for non-linear operations. Together, they enable complex computing tasks, such as machine l…

    Submitted 10 August, 2023; v1 submitted 4 August, 2023; originally announced August 2023.

  17. arXiv:2307.14557  [pdf, other]

    cs.CR cs.AR

    Accelerating Polynomial Modular Multiplication with Crossbar-Based Compute-in-Memory

    Authors: Mengyuan Li, Haoran Geng, Michael Niemier, Xiaobo Sharon Hu

    Abstract: Lattice-based cryptographic algorithms built on ring learning with error theory are gaining importance due to their potential for providing post-quantum security. However, these algorithms involve complex polynomial operations, such as polynomial modular multiplication (PMM), which is the most time-consuming part of these algorithms. Accelerating PMM is crucial to make lattice-based cryptographic…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: Accepted by 42nd International Conference on Computer-Aided Design (ICCAD)

  18. arXiv:2305.17417  [pdf, other]

    cs.DL cs.LG physics.soc-ph

    Modeling Dynamic Heterogeneous Graph and Node Importance for Future Citation Prediction

    Authors: Hao Geng, Deqing Wang, Fuzhen Zhuang, Xuehua Ming, Chenguang Du, Ting Jiang, Haolong Guo, Rui Liu

    Abstract: Accurate citation count prediction of newly published papers could help editors and readers rapidly figure out the influential papers in the future. Though many approaches are proposed to predict a paper's future citation, most ignore the dynamic heterogeneous graph structure or node importance in academic networks. To cope with this problem, we propose a Dynamic heterogeneous Graph and Node Impor…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted by CIKM'2022

  19. arXiv:2304.04321  [pdf, other]

    cs.AI cs.CL cs.CV cs.RO

    ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes

    Authors: Ran Gong, Jiangyong Huang, Yizhou Zhao, Haoran Geng, Xiaofeng Gao, Qingyang Wu, Wensi Ai, Ziheng Zhou, Demetri Terzopoulos, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

    Abstract: Understanding the continuous states of objects is essential for task learning and planning in the real world. However, most existing task learning benchmarks assume discrete (e.g., binary) object goal states, which poses challenges for the learning of complex tasks and transferring learned policy from simulated environments to the real world. Furthermore, state discretization limits a robot's abil…

    Submitted 11 September, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

    Comments: The first two authors contributed equally; 20 pages; 17 figures; project available: https://arnold-benchmark.github.io/ ICCV 2023

  20. arXiv:2304.00464  [pdf, other]

    cs.RO cs.CV

    UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning

    Authors: Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang

    Abstract: We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud observations and proprioceptive information under a table-top setting, namely UniDexGrasp++. To address the challenge of learning the vision-based policy across thousands of object instances, we propose Geometry-aware Curriculum Learning (GeoCurriculum) and Geometry-a…

    Submitted 3 April, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

  21. arXiv:2303.17464  [pdf, other]

    cs.SI physics.soc-ph

    HMES: A Scalable Human Mobility and Epidemic Simulation System with Fast Intervention Modeling

    Authors: Haoyu Geng, Guanjie Zheng, Zhengqing Han, Hua Wei, Zhenhui Li

    Abstract: Recently, the world has witnessed the most severe pandemic (COVID-19) in this century. Studies on epidemic prediction and simulation have received increasing attention. However, the current methods suffer from three issues. First, most of the current studies focus on epidemic prediction, which cannot provide adequate support for intervention policy making. Second, most of the current intervention…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: UIC-2022, code available at https://github.com/hygeng/humanflow

  22. arXiv:2303.16958  [pdf, other]

    cs.CV cs.RO

    PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations

    Authors: Haoran Geng, Ziming Li, Yiran Geng, Jiayi Chen, Hao Dong, He Wang

    Abstract: Learning a generalizable object manipulation policy is vital for an embodied agent to work in complex real-world scenes. Parts, as the shared components in different object categories, have the potential to increase the generalization ability of the manipulation policy and achieve cross-category object manipulation. In this work, we build the first large-scale, part-based cross-category object man…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR 2023

  23. arXiv:2303.12341  [pdf, other]

    cs.LG

    EasyDGL: Encode, Train and Interpret for Continuous-time Dynamic Graph Learning

    Authors: Chao Chen, Haoyu Geng, Nianzu Yang, Xiaokang Yang, Junchi Yan

    Abstract: Dynamic graphs arise in various real-world applications, and it is often welcomed to model the dynamics directly in continuous time domain for its flexibility. This paper aims to design an easy-to-use pipeline (termed as EasyDGL which is also due to its implementation by DGL toolkit) composed of three key modules with both strong fitting ability and interpretability. Specifically the proposed pipe…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 9 figures, 7 tables

  24. arXiv:2303.00938  [pdf, other]

    cs.RO cs.CV

    UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

    Authors: Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, He Wang

    Abstract: In this work, we tackle the problem of learning universal robotic dexterous grasping from a point cloud observation under a table-top setting. The goal is to grasp and lift up objects in high-quality and diverse ways and generalize across hundreds of categories and even the unseen. Inspired by successful pipelines used in parallel gripper grasping, we split the task into two stages: 1) grasp propo…

    Submitted 25 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023

  25. arXiv:2302.03933  [pdf, other]

    cs.LG

    Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution

    Authors: Chao Chen, Haoyu Geng, Gang Zeng, Zhaobing Han, Hua Chai, Xiaokang Yang, Junchi Yan

    Abstract: Inductive one-bit matrix completion is motivated by modern applications such as recommender systems, where new users would appear at test stage with the ratings consisting of only ones and no zeros. We propose a unified graph signal sampling framework which enjoys the benefits of graph signal analysis and processing. The key idea is to transform each user's ratings on the items to a function (sign…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Published in ICLR 2023

  26. arXiv:2301.12399  [pdf, other]

    cs.CY

    Learning Analytics from Spoken Discussion Dialogs in Flipped Classroom

    Authors: Hang Su, Borislav Dzodzo, Changlun Li, Danyang Zhao, Hao Geng, Yunxiang Li, Sidharth Jaggi, Helen Meng

    Abstract: The flipped classroom is a new pedagogical strategy that has been gaining increasing importance recently. Spoken discussion dialog commonly occurs in the flipped classroom and embeds rich information indicating processes and progression of students' learning. This study focuses on learning analytics from spoken discussion dialog in the flipped classroom, which aims to collect and analyze the discus…

    Submitted 29 January, 2023; originally announced January 2023.

  27. arXiv:2211.05272  [pdf, other]

    cs.CV

    GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts

    Authors: Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, He Wang

    Abstract: For years, researchers have been devoted to generalizable object perception and manipulation, where cross-category generalizability is highly desired yet underexplored. In this work, we propose to learn such cross-category skills via Generalizable and Actionable Parts (GAParts). By identifying and defining 9 GAPart classes (lids, handles, etc.) in 27 object categories, we construct a large-scale p…

    Submitted 26 March, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

    Comments: To appear in CVPR 2023 (Highlight)

  28. arXiv:2209.12941  [pdf, other]

    cs.RO cs.AI

    End-to-End Affordance Learning for Robotic Manipulation

    Authors: Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong

    Abstract: Learning to manipulate 3D objects in an interactive environment has been a challenging problem in Reinforcement Learning (RL). In particular, it is hard to train a policy that can generalize over objects with different semantic categories, diverse shape geometry and versatile functionality. Recently, the technique of visual affordance has shown great prospects in providing object-centric informati…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 8 pages, 3 figures

  29. arXiv:2204.06517  [pdf, other]

    cs.IR cs.LG

    Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation

    Authors: Chao Chen, Haoyu Geng, Nianzu Yang, Junchi Yan, Daiyue Xue, Jianping Yu, Xiaokang Yang

    Abstract: User interests are usually dynamic in the real world, which poses both theoretical and practical challenges for learning accurate preferences from rich behavior data. Among existing user behavior modeling solutions, attention networks are widely adopted for their effectiveness and relative simplicity. Despite being extensively studied, existing attentions still suffer from two limitations: i) conven…

    Submitted 29 March, 2022; originally announced April 2022.

    Comments: Published in ICML 2021

  30. arXiv:2112.02231  [pdf, other]

    cs.CR cs.AR cs.ET

    IMCRYPTO: An In-Memory Computing Fabric for AES Encryption and Decryption

    Authors: Dayane Reis, Haoran Geng, Michael Niemier, Xiaobo Sharon Hu

    Abstract: This paper proposes IMCRYPTO, an in-memory computing (IMC) fabric for accelerating AES encryption and decryption. IMCRYPTO employs a unified structure to implement encryption and decryption in a single hardware architecture, with combined (Inv)SubBytes and (Inv)MixColumns steps. Because of this step-combination, as well as the high parallelism achieved by multiple units of random-access memory (RA…

    Submitted 3 December, 2021; originally announced December 2021.

  31. Composing Recurrent Spiking Neural Networks using Locally-Recurrent Motifs and Risk-Mitigating Architectural Optimization

    Authors: Wenrui Zhang, Hejia Geng, Peng Li

    Abstract: In neural circuits, recurrent connectivity plays a crucial role in network function and stability. However, existing recurrent spiking neural networks (RSNNs) are often constructed by random connections without optimization. While RSNNs can produce rich dynamics that are critical for memory formation and learning, systemic architectural optimization of RSNNs is still an open challenge. We aim to e…

    Submitted 15 September, 2023; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: 20 pages, 7 figures

    Journal ref: Frontiers in Neuroscience, 2024, Vol. 18, Art. e1412559

  32. arXiv:1912.07254  [pdf, other]

    cs.LG stat.ML

    VLSI Mask Optimization: From Shallow To Deep Learning

    Authors: Haoyu Yang, Wei Zhong, Yuzhe Ma, Hao Geng, Ran Chen, Wanli Chen, Bei Yu

    Abstract: VLSI mask optimization is one of the most critical stages in manufacturability aware design, which is costly due to the complicated mask optimization and lithography simulation. Recent research has shown prominent advantages of machine learning techniques in dealing with complicated and big data problems, which bring potential of dedicated machine learning solution for DFM problems and facilitate…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: 6 pages; accepted by 25th Asia and South Pacific Design Automation Conference (ASP-DAC 2020)