Showing 1–50 of 545 results for author: Peng, B

  1. arXiv:2407.10485  [pdf, other]

    cs.CV

    Effective Motion Modeling for UAV-platform Multiple Object Tracking with Re-Margin Loss

    Authors: Mufeng Yao, Jinlong Peng, Qingdong He, Bo Peng, Hao Chen, Mingmin Chi, Chao Liu, Jon Atli Benediktsson

    Abstract: Multiple object tracking (MOT) from unmanned aerial vehicle (UAV) platforms requires efficient motion modeling. This is because UAV-MOT faces tracking difficulties caused by large and irregular motion, and insufficient training due to the motion long-tailed distribution of current UAV-MOT datasets. Previous UAV-MOT methods either extract motion and detection features redundantly or supervise motio…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.07207

  2. arXiv:2407.10481  [pdf, other]

    cs.LG cs.AI cs.CL cs.GR

    SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation

    Authors: Jordan Juravsky, Yunrong Guo, Sanja Fidler, Xue Bin Peng

    Abstract: Physically-simulated models for human motion can generate high-quality responsive character animations, often in real-time. Natural language serves as a flexible interface for controlling these models, allowing expert and non-expert users to quickly create and edit their animations. Many recent physics-based animation methods, including those that use text interfaces, train control policies using…

    Submitted 15 July, 2024; originally announced July 2024.

  3. arXiv:2407.06584  [pdf, other]

    cs.RO

    HiLMa-Res: A General Hierarchical Framework via Residual RL for Combining Quadrupedal Locomotion and Manipulation

    Authors: Xiaoyu Huang, Qiayuan Liao, Yiming Ni, Zhongyu Li, Laura Smith, Sergey Levine, Xue Bin Peng, Koushil Sreenath

    Abstract: This work presents HiLMa-Res, a hierarchical framework leveraging reinforcement learning to tackle manipulation tasks while performing continuous locomotion using quadrupedal robots. Unlike most previous efforts that focus on solving a specific task, HiLMa-Res is designed to be general for various loco-manipulation tasks that require quadrupedal robots to maintain sustained mobility. The novel des…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: IROS 2024

  4. arXiv:2407.05324  [pdf, other]

    cs.CV

    PICA: Physics-Integrated Clothed Avatar

    Authors: Bo Peng, Yunfan Tao, Haoyu Zhan, Yudong Guo, Juyong Zhang

    Abstract: We introduce PICA, a novel representation for high-fidelity animatable clothed human avatars with physics-accurate dynamics, even for loose clothing. Previous neural rendering-based representations of animatable clothed humans typically employ a single model to represent both the clothing and the underlying body. While efficient, these approaches often fail to accurately represent complex garment…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Project page: https://ustc3dv.github.io/PICA/

  5. arXiv:2407.00617  [pdf, other]

    cs.LG cs.AI cs.CL cs.GT

    Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

    Authors: Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu

    Abstract: Reinforcement Learning with Human Feedback (RLHF) has achieved great success in aligning large language models (LLMs) with human preferences. Prevalent RLHF approaches are reward-based, following the Bradley-Terry (BT) model assumption, which may not fully capture the complexity of human preferences. In this paper, we explore RLHF under a general preference framework and approach it from a game-th…

    Submitted 7 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

  6. arXiv:2407.00320  [pdf, other]

    cs.CL cs.AI cs.LG

    LiteSearch: Efficacious Tree Search for LLM

    Authors: Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Dian Yu, Haitao Mi, Jinsong Su, Dong Yu

    Abstract: Recent research suggests that tree search algorithms (e.g. Monte Carlo Tree Search) can dramatically boost LLM performance on complex mathematical reasoning tasks. However, they often require more than 10 times the computational resources of greedy decoding due to wasteful search strategies, making them difficult to be deployed in practical applications. This study introduces a novel guided tree s…

    Submitted 29 June, 2024; originally announced July 2024.

  7. arXiv:2406.19131  [pdf, other]

    cs.CV

    CELLO: Causal Evaluation of Large Vision-Language Models

    Authors: Meiqi Chen, Bo Peng, Yan Zhang, Chaochao Lu

    Abstract: Causal reasoning is fundamental to human intelligence and crucial for effective decision-making in real-world environments. Despite recent advancements in large vision-language models (LVLMs), their ability to comprehend causality remains unclear. Previous work typically focuses on commonsense causality between events and/or actions, which is insufficient for applications like embodied agents and…

    Submitted 27 June, 2024; originally announced June 2024.

  8. arXiv:2406.17338  [pdf, other]

    eess.IV cs.CV cs.LG

    Robustly Optimized Deep Feature Decoupling Network for Fatty Liver Diseases Detection

    Authors: Peng Huang, Shu Hu, Bo Peng, Jiashu Zhang, Xi Wu, Xin Wang

    Abstract: Current medical image classification efforts mainly aim for higher average performance, often neglecting the balance between different classes. This can lead to significant differences in recognition accuracy between classes and obvious recognition weaknesses. Without the support of massive data, deep learning faces challenges in fine-grained classification of fatty liver. In this paper, we propos…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  9. arXiv:2406.11937  [pdf, other]

    physics.ins-det hep-ex physics.data-an

    Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

    Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola, et al. (550 additional authors not shown)

    Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles readout by SiPMs and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadr…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Prepared for submission to JINST

  10. arXiv:2406.11528  [pdf, other]

    econ.TH cs.GT

    Optimal Robust Contract Design

    Authors: Bo Peng, Zhihao Gavin Tang

    Abstract: We consider the robust contract design problem when the principal only has limited information about the actions the agent can take. The principal evaluates a contract according to its worst-case performance caused by the uncertain action space. Carroll (AER 2015) showed that a linear contract is optimal among deterministic contracts. Recently, Kambhampati (JET 2023) showed that the principal's pa…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Full version of EC 2024 paper

  11. DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search

    Authors: Jiuqi Wei, Botao Peng, Xiaodong Lee, Themis Palpanas

    Abstract: Locality-sensitive hashing (LSH) is a well-known solution for approximate nearest neighbor (ANN) search in high-dimensional spaces due to its robust theoretical guarantee on query accuracy. Traditional LSH-based methods mainly focus on improving the efficiency and accuracy of the query phase by designing different query strategies, but pay little attention to improving the efficiency of the indexi…

    Submitted 16 June, 2024; originally announced June 2024.

    Journal ref: PVLDB, 17(9): 2241 - 2254, 2024

  12. arXiv:2406.09399  [pdf, other]

    cs.CV

    OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation

    Authors: Junke Wang, Yi Jiang, Zehuan Yuan, Binyue Peng, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Tokenizer, serving as a translator to map the intricate visual data into a compact latent space, lies at the core of visual generative models. Based on the finding that existing tokenizers are tailored to image or video inputs, this paper presents OmniTokenizer, a transformer-based tokenizer for joint image and video tokenization. OmniTokenizer is designed with a spatial-temporal decoupled archite…

    Submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2406.08580  [pdf, other]

    physics.chem-ph

    Anomalous Enhancement of the Electrocatalytic Hydrogen Evolution Reaction in AuPt Nanoclusters

    Authors: Jiahui Kang, Jan Kloppenburg, Jiali Sheng, Zhenyu Xu, Kristoffer Meinander, Hua Jiang, Zhong-Peng Lv, Esko I. Kauppinen, Qiang Zhang, Xi Chen, Olli Ikkala, Miguel A. Caro, Bo Peng

    Abstract: Energy- and resource-efficient electrocatalytic water splitting is of paramount importance to enable sustainable hydrogen production. The best bulk catalyst for the hydrogen evolution reaction (HER), i.e., platinum, is one of the scarcest elements on Earth. The use of raw material for HER can be dramatically reduced by utilizing nanoclusters. In addition, nanoalloying can further improve the perfo…

    Submitted 12 June, 2024; originally announced June 2024.

  14. arXiv:2406.06615  [pdf, other]

    cs.CL cs.AI cs.LG cs.RO

    Language Guided Skill Discovery

    Authors: Seungeun Rho, Laura Smith, Tianyu Li, Sergey Levine, Xue Bin Peng, Sehoon Ha

    Abstract: Skill discovery methods enable agents to learn diverse emergent behaviors without explicit rewards. To make learned skills useful for unknown downstream tasks, obtaining a semantically diverse repertoire of skills is essential. While some approaches introduce a discriminator to distinguish skills and others aim to increase state coverage, no existing work directly addresses the "semantic diversity…

    Submitted 7 June, 2024; originally announced June 2024.

  15. arXiv:2406.06525  [pdf, other]

    cs.CV

    Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

    Authors: Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, Ping Luo, Zehuan Yuan

    Abstract: We introduce LlamaGen, a new family of image generation models that apply original ``next-token prediction'' paradigm of large language models to visual generation domain. It is an affirmative answer to whether vanilla autoregressive models, e.g., Llama, without inductive biases on visual signals can achieve state-of-the-art image generation performance if scaling properly. We reexamine design spa…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Codes and models: https://github.com/FoundationVision/LlamaGen

  16. arXiv:2406.06326  [pdf, other]

    cs.CL

    Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

    Authors: Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen Meng

    Abstract: Large language models (LLMs) often struggle to provide up-to-date information due to their one-time training and the constantly evolving nature of the world. To keep LLMs current, existing approaches typically involve continued pre-training on new documents. However, they frequently face difficulties in extracting stored knowledge. Motivated by the remarkable success of the Feynman Technique in ef…

    Submitted 15 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 30 pages

  17. arXiv:2406.05652  [pdf, other]

    eess.SP

    Distributed Combinatorial Optimization of Downlink User Assignment in mmWave Cell-free Massive MIMO Using Graph Neural Networks

    Authors: Bile Peng, Bihan Guo, Karl-Ludwig Besser, Luca Kunz, Ramprasad Raghunath, Anke Schmeink, Eduard A Jorswieck, Giuseppe Caire, H. Vincent Poor

    Abstract: Millimeter wave (mmWave) cell-free massive MIMO (CF mMIMO) is a promising solution for future wireless communications. However, its optimization is non-trivial due to the challenging channel characteristics. We show that mmWave CF mMIMO optimization is largely an assignment problem between access points (APs) and users due to the high path loss of mmWave channels, the limited output power of the a…

    Submitted 9 June, 2024; originally announced June 2024.

  18. arXiv:2406.04316  [pdf, other]

    cs.CV

    Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

    Authors: Jiyao Zhang, Weiyao Huang, Bo Peng, Mingdong Wu, Fei Hu, Zijian Chen, Bo Zhao, Hao Dong

    Abstract: 6D Object Pose Estimation is a crucial yet challenging task in computer vision, suffering from a significant lack of large-scale datasets. This scarcity impedes comprehensive evaluation of model performance, limiting research advancements. Furthermore, the restricted number of available instances or categories curtails its applications. To address these issues, this paper introduces Omni6DPose, a…

    Submitted 6 June, 2024; originally announced June 2024.

  19. arXiv:2406.02357  [pdf, ps, other]

    cs.GT cs.AI cs.DS cs.LG

    The complexity of approximate (coarse) correlated equilibrium for incomplete information games

    Authors: Binghui Peng, Aviad Rubinstein

    Abstract: We study the iteration complexity of decentralized learning of approximate correlated equilibria in incomplete information games. On the negative side, we prove that in $\mathit{extensive}$-$\mathit{form}$ $\mathit{games}$, assuming $\mathsf{PPAD} \not\subset \mathsf{TIME}(n^{\mathsf{polylog}(n)})$, any polynomial-time learning algorithms must take at least $2^{\log_2^{1-o(1)}(|\mathcal{I}|)}$ i…

    Submitted 4 June, 2024; originally announced June 2024.

  20. arXiv:2406.01239  [pdf, other]

    math.OC

    Tighter yet more tractable relaxations and nontrivial instance generation for sparse standard quadratic optimization

    Authors: Immanuel Bomze, Bo Peng, Yuzhou Qiu, E. Alper Yildirim

    Abstract: The Standard Quadratic optimization Problem (StQP), arguably the simplest among all classes of NP-hard optimization problems, consists of extremizing a quadratic form (the simplest nonlinear polynomial) over the standard simplex (the simplest polytope/compact feasible set). As a problem class, StQPs may be nonconvex with an exponential number of inefficient local solutions. StQPs arise in a multit…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Technical Report, School of Mathematics, The University of Edinburgh, Edinburgh, EH9 3FD, Scotland, United Kingdom

    MSC Class: 90C11; 90C20; 90C22

  21. arXiv:2406.01238  [pdf, other]

    cs.CL

    EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs

    Authors: Zixuan Dong, Baoyun Peng, Yufei Wang, Jia Fu, Xiaodong Wang, Yongxue Shan, Xin Zhou

    Abstract: While large language models (LLMs) have shown remarkable capabilities in natural language processing, they struggle with complex, multi-step reasoning tasks involving knowledge graphs (KGs). Existing approaches that integrate LLMs and KGs either underutilize the reasoning abilities of LLMs or suffer from prohibitive computational costs due to tight coupling. To address these limitations, we propos…

    Submitted 7 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures, 3 tables

  22. arXiv:2406.00989  [pdf, other]

    physics.chem-ph

    On the exact limit of the time-dependent coupled cluster ansatz and its approximations in the real-time equation-of-motion coupled cluster cumulant Green's function approach

    Authors: Bo Peng, Himadri Pathak, Ajay Panyala, Fernando D. Vila, John J. Rehr, Karol Kowalski

    Abstract: In this paper, we analyze the properties of the recently proposed real-time equation-of-motion coupled-cluster (RT-EOM-CC) cumulant Green's function approach [J. Chem. Phys. 2020, 152, 174113]. We specifically focus on identifying the limitations of the original time-dependent coupled cluster (TDCC) ansatz and propose an enhanced extended TDCC ansatz ensuring the exactness in the expansion limit.…

    Submitted 3 June, 2024; originally announced June 2024.

  23. arXiv:2406.00941  [pdf, ps, other]

    econ.EM

    A Robust Residual-Based Test for Structural Changes in Factor Models

    Authors: Bin Peng, Liangjun Su, Yayi Yan

    Abstract: In this paper, we propose an easy-to-implement residual-based specification testing procedure for detecting structural changes in factor models, which is powerful against both smooth and abrupt structural changes with unknown break dates. The proposed test is robust against the over-specified number of factors, and serially and cross-sectionally correlated error processes. A new central limit theo…

    Submitted 2 June, 2024; originally announced June 2024.

  24. arXiv:2405.15526  [pdf]

    physics.chem-ph

    Syngas conversion to higher alcohols via wood-framed Cu/Co-carbon catalyst

    Authors: Guihua Yan, Paulina Pršlja, Gaofeng Chen, Jiahui Kang, Yongde Liu, Miguel A. Caro, Xi Chen, Xianhai Zeng, Bo Peng

    Abstract: Syngas conversion into higher alcohols represents a promising avenue for transforming coal or biomass into liquid fuels. However, the commercialization of this process has been hindered by the high cost, low activity, and inadequate C$_{2+}$OH selectivity of catalysts. Herein, we have developed Cu/Co carbon wood catalysts, offering a cost-effective and stable alternative with exceptional selectivi…

    Submitted 24 May, 2024; originally announced May 2024.

  25. arXiv:2405.11126  [pdf, other]

    cs.CV cs.GR cs.LG

    Flexible Motion In-betweening with Diffusion Models

    Authors: Setareh Cohan, Guy Tevet, Daniele Reda, Xue Bin Peng, Michiel van de Panne

    Abstract: Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging process. We investigate the potential of diffusion models in generating diverse human motions guided by keyframes. Unlike previous inbetweening methods, we propose a s…

    Submitted 23 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024. For project page and code, see https://setarehc.github.io/CondMDI/

  26. arXiv:2405.08998  [pdf, other]

    astro-ph.EP

    Puffy Venuses: the Mass-Radius Impact of Carbon-Rich Atmospheres on Lava Worlds

    Authors: Bo Peng, Diana Valencia

    Abstract: The recent advancements in exoplanet observations enable the potential detection of exo-Venuses, rocky planets with carbon-rich atmospheres. How extended these atmospheres can be, given high carbon abundances, has not been studied. To answer this, we present a model for a theoretical class of exoplanets - puffy Venuses - characterized by thick, carbon-dominated atmospheres in equilibrium with glob…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: V3, under review in ApJL. We welcome & appreciate your comments

  27. arXiv:2405.07420  [pdf, other]

    econ.EM

    Robust Inference for High-Dimensional Panel Data Models

    Authors: Jiti Gao, Bin Peng, Yayi Yan

    Abstract: In this paper, we propose a robust estimation and inferential method for high-dimensional panel data models. Specifically, (1) we investigate the case where the number of regressors can grow faster than the sample size, (2) we pay particular attention to non-Gaussian, serially and cross-sectionally correlated and heteroskedastic error processes, and (3) we develop an estimation method for high-dim…

    Submitted 12 May, 2024; originally announced May 2024.

  28. arXiv:2405.00622  [pdf, other]

    cs.CL cs.AI cs.LG

    Causal Evaluation of Language Models

    Authors: Sirui Chen, Bo Peng, Meiqi Chen, Ruiqi Wang, Mengying Xu, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Yu Qiao, Chaochao Lu

    Abstract: Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive ben…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 315 pages, 230 figures, 21 tables. Project website: https://opencausalab.github.io/CaLM

  29. arXiv:2404.19264  [pdf, other]

    cs.RO

    DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets

    Authors: Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath

    Abstract: This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation domains. However, scaling up learning for legged rob…

    Submitted 30 April, 2024; originally announced April 2024.

  30. arXiv:2404.18246  [pdf, other]

    cs.LG cs.CV

    AdaFSNet: Time Series Classification Based on Convolutional Network with a Adaptive and Effective Kernel Size Configuration

    Authors: Haoxiao Wang, Bo Peng, Jianhua Zhang, Xu Cheng

    Abstract: Time series classification is one of the most critical and challenging problems in data mining, existing widely in various fields and holding significant research importance. Despite extensive research and notable achievements with successful real-world applications, addressing the challenge of capturing the appropriate receptive field (RF) size from one-dimensional or multi-dimensional time serie…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCNN 2024

  31. arXiv:2404.16807  [pdf, other]

    cs.CL

    Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning

    Authors: Tianhui Zhang, Bei Peng, Danushka Bollegala

    Abstract: Generative Commonsense Reasoning (GCR) requires a model to reason about a situation using commonsense knowledge, while generating coherent sentences. Although the quality of the generated sentences is crucial, the diversity of the generation is equally important because it reflects the model's ability to use a range of commonsense knowledge facts. Large Language Models (LLMs) have shown proficienc…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  32. arXiv:2404.16522  [pdf, other]

    eess.IV cs.LG

    A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

    Authors: Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

    Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classif…

    Submitted 25 April, 2024; originally announced April 2024.

  33. arXiv:2404.12253  [pdf, other]

    cs.CL cs.LG

    Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

    Authors: Ye Tian, Baolin Peng, Linfeng Song, Lifeng Jin, Dian Yu, Haitao Mi, Dong Yu

    Abstract: Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they still struggle with scenarios that involves complex reasoning and planning. Recent work proposed advanced prompting techniques and the necessity of fine-tuning with high-quality data to augment LLMs' reasoning abilities. However, these approaches are inherently constrained by data availability and quality. I…

    Submitted 18 April, 2024; originally announced April 2024.

  34. arXiv:2404.11054  [pdf, other]

    cs.CV

    Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection

    Authors: Ying Zhang, Yuezun Li, Bo Peng, Jiaran Zhou, Huiyu Zhou, Junyu Dong

    Abstract: The task of video inpainting detection is to expose the pixel-level inpainted regions within a video sequence. Existing methods usually focus on leveraging spatial and temporal inconsistencies. However, these methods typically employ fixed operations to combine spatial and temporal clues, limiting their applicability in different scenarios. In this paper, we introduce a novel Multilateral Temporal…

    Submitted 6 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  35. arXiv:2404.10685  [pdf, other]

    cs.CV cs.GR

    Generating Human Interaction Motions in Scenes with Text Control

    Authors: Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe

    Abstract: We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models. Previous text-to-motion methods focus on characters in isolation without considering scenes due to the limited availability of datasets that include motion, text descriptions, and interactive scenes. Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model,…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Project Page: https://research.nvidia.com/labs/toronto-ai/tesmo/

  36. arXiv:2404.10099  [pdf, other]

    math.OC cs.LG

    Feature selection in linear SVMs via hard cardinality constraint: a scalable SDP decomposition approach

    Authors: Immanuel Bomze, Federico D'Onofrio, Laura Palagi, Bo Peng

    Abstract: In this paper, we study the embedded feature selection problem in linear Support Vector Machines (SVMs), in which a cardinality constraint is employed, leading to a fully explainable selection model. The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we first int…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Submitted to European Journal of Operational Research. arXiv admin note: text overlap with arXiv:1808.02435 by other authors

    MSC Class: 90C22; 90C11 ACM Class: I.5.1; I.2.0

  37. arXiv:2404.09338  [pdf, other]

    cs.CL

    Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models

    Authors: Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

    Abstract: Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve factuality during inference by leveraging LLMs' hierarchical representation of factual knowledge, manipulating the predicted distributions at inference time. Current…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Work in Progress

  38. arXiv:2404.08549  [pdf]

    eess.IV cs.CV physics.bio-ph

    Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations

    Authors: Boyuan Peng, Jiaju Chen, Qihui Ye, Minjiang Chen, Peiwu Qin, Chenggang Yan, Dongmei Yu, Zhenglin Chen

    Abstract: Cell segmentation is essential in biomedical research for analyzing cellular morphology and behavior. Deep learning methods, particularly convolutional neural networks (CNNs), have revolutionized cell segmentation by extracting intricate features from images. However, the robustness of these methods under microscope optical aberrations remains a critical challenge. This study comprehensively evalu…

    Submitted 12 April, 2024; originally announced April 2024.

  39. arXiv:2404.08365  [pdf, other]

    econ.EM

    Estimation and Inference for Three-Dimensional Panel Data Models

    Authors: Guohua Feng, Jiti Gao, Fei Liu, Bin Peng

    Abstract: Hierarchical panel data models have recently garnered significant attention. This study contributes to the relevant literature by introducing a novel three-dimensional (3D) hierarchical panel data model, which integrates panel regression with three sets of latent factor structures: one set of global factors and two sets of local factors. Instead of aggregating latent factors from various nodes, as…

    Submitted 12 April, 2024; originally announced April 2024.

  40. arXiv:2404.08341  [pdf, other]

    cs.CV

    Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts

    Authors: Yang Li, Songlin Yang, Wei Wang, Ziwen He, Bo Peng, Jing Dong

    Abstract: Highly realistic AI generated face forgeries known as deepfakes have raised serious social concerns. Although DNN-based face forgery detection models have achieved good performance, they are vulnerable to latest generative methods that have less forgery traces and adversarial attacks. This limitation of generalization and robustness hinders the credibility of detection results and requires more ex…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to ICME2024

  41. arXiv:2404.07470  [pdf, other]

    cs.CL

    Scalable Language Model with Generalized Continual Learning

    Authors: Bohao Peng, Zhuotao Tian, Shu Liu, Mingchang Yang, Jiaya Jia

    Abstract: Continual learning has gained increasing importance as it facilitates the acquisition and refinement of scalable knowledge and skills in language models. However, existing methods typically encounter strict limitations and challenges in real-world scenarios, such as reliance on experience replay, optimization constraints, and inference task-ID. In this study, we introduce the Scalable Language Mod…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: The Twelfth International Conference on Learning Representations

  42. arXiv:2404.05892  [pdf, other]

    cs.CL cs.AI

    Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

    Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, et al. (3 additional authors not shown)

    Abstract: We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokeni…

    Submitted 10 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  43. arXiv:2404.04875  [pdf, other]

    cs.CV

    NeRF2Points: Large-Scale Point Cloud Generation From Street Views' Radiance Field Optimization

    Authors: Peng Tu, Xun Zhou, Mingming Wang, Xiaojun Yang, Bo Peng, Ping Chen, Xiu Su, Yawen Huang, Yefeng Zheng, Chang Xu

    Abstract: Neural Radiance Fields (NeRF) have emerged as a paradigm-shifting methodology for the photorealistic rendering of objects and environments, enabling the synthesis of novel viewpoints with remarkable fidelity. This is accomplished through the strategic utilization of object-centric camera poses characterized by significant inter-frame overlap. This paper explores a compelling, alternative utility o…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 18 pages

  44. arXiv:2404.04062  [pdf, other]

    cs.LG math.OC

    Derivative-free tree optimization for complex systems

    Authors: Ye Wei, Bo Peng, Ruiwen Xie, Yangtao Chen, Yu Qin, Peng Wen, Stefan Bauer, Po-Yen Tung

    Abstract: A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 d…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 39 pages, 3 figures

  45. arXiv:2404.02905  [pdf, other]

    cs.CV cs.AI

    Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

    Authors: Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, Liwei Wang

    Abstract: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction". This simple, intuitive methodology allows autoregressive (AR) transformers to learn visual distributions fast and generalize well: V…

    Submitted 10 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Demo website: https://var.vision/

  46. arXiv:2404.00230  [pdf, other]

    cs.CV

    Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space

    Authors: Zheling Meng, Bo Peng, Jing Dong

    Abstract: Watermarking is a tool for actively identifying and attributing the images generated by latent diffusion models. Existing methods face the dilemma of image quality and watermark robustness. Watermarks with superior image quality usually have inferior robustness against attacks such as blurring and JPEG compression, while watermarks with superior robustness usually significantly damage image qualit…

    Submitted 11 July, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

  47. arXiv:2404.00205  [pdf, other]

    cs.CL

    Conceptual and Unbiased Reasoning in Language Models

    Authors: Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu

    Abstract: Conceptual reasoning, the ability to reason in abstract and high-level perspectives, is key to generalization in human cognition. However, limited study has been done on large language models' capability to perform conceptual reasoning. In this work, we bridge this gap and propose a novel conceptualization framework that forces models to perform conceptual reasoning on abstract questions and gener…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Preprint under review

  48. arXiv:2403.17326  [pdf]

    cond-mat.mtrl-sci

    Unveiling the origin of unconventional moire ferroelectricity

    Authors: Ruirui Niu, Zhuoxian Li, Xiangyan Han, Qianling Liu, Zhuangzhuang Qu, Zhiyu Wang, Chunrui Han, Kenji Watanabe, Takashi Taniguchi, Kaihui Liu, Jinhai Mao, Wu Shi, Bo Peng, Zheng Vitto Han, Zizhao Gan, Jianming Lu

    Abstract: Interfacial ferroelectricity emerges in heterostructures consisting of nonpolar van der Waals (vdW) layers, greatly expanding the scope of two dimensional ferroelectrics. In particular, the unconventional moire ferroelectricity observed in bilayer graphene/boron nitride (BN) heterostructures, exhibits promising functionalities with topological current, superconductivity and synaptic responses. How…

    Submitted 25 March, 2024; originally announced March 2024.

  49. arXiv:2403.17137  [pdf, other]

    cond-mat.mes-hall cond-mat.dis-nn cond-mat.str-el

    Superlattice induced electron percolation within a single Landau level

    Authors: Nilanjan Roy, Bo Peng, Bo Yang

    Abstract: We investigate the quantum Hall effect in a single Landau level in the presence of a square superlattice of $\delta$-function potentials. The interplay between the superlattice spacing $a_s$ and the magnetic length $\ell_B$ in clean system leads to three interesting characteristic regimes corresponding to $a_s < \ell_B$, $a_s \gg \ell_B$ and the intermediate one where $a_s \sim \ell_B$. In the inter…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 5 pages, 4 figures and supplementary materials

  50. arXiv:2403.14418  [pdf, other]

    cs.CV

    OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

    Authors: Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

    Abstract: The booming of 3D recognition in the 2020s began with the introduction of point cloud transformers. They quickly overwhelmed sparse CNNs and became state-of-the-art models, especially in 3D semantic segmentation. However, sparse CNNs are still valuable networks, due to their efficiency treasure, and ease of application. In this work, we reexamine the design distinctions and test the limits of what…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: CVPR 2024