Showing 1–50 of 127 results for author: Lan, X

  1. arXiv:2407.11699  [pdf, other]

    cs.CV

    Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

    Authors: Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan

    Abstract: This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from the self-attention that introduces no structural bias over inputs. To address this issue, we explore incorporating position relation prior as attention bias to augment o…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2407.11497  [pdf, other]

    cs.HC cs.GR

    "I Came Across a Junk": Understanding Design Flaws of Data Visualization from the Public's Perspective

    Authors: Xingyu Lan, Yu Liu

    Abstract: The visualization community has a rich history of reflecting upon flaws of visualization design, and research in this direction has remained lively until now. However, three main gaps still exist. First, most existing work characterizes design flaws from the perspective of researchers rather than the perspective of general users. Second, little work has been done to infer why these design flaws oc…

    Submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.07844  [pdf, other]

    cs.CV

    OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

    Authors: Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, Yaowei Wang, Xiangyuan Lan, Xiaodan Liang

    Abstract: Open-vocabulary detection is a challenging task due to the requirement of detecting objects based on class names, including those not encountered during training. Existing methods have shown strong zero-shot detection capabilities through pre-training on diverse large-scale datasets. However, these approaches still face two primary challenges: (i) how to universally integrate diverse data sources…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Technical Report

  4. arXiv:2406.18838  [pdf]

    cond-mat.mtrl-sci

    Electric-field control of the perpendicular magnetization switching in ferroelectric/ferrimagnet heterostructures

    Authors: Pengfei Liu, Tao Xu, Qi Liu, Juncai Dong, Ting Lin, Qinhua Zhang, Xiukai Lan, Yu Sheng, Chunyu Wang, Jiajing Pei, Hongxin Yang, Lin Gu, Kaiyou Wang

    Abstract: Electric field control of the magnetic state in ferrimagnets holds great promise for developing spintronic devices due to low power consumption. Here, we demonstrate a non-volatile reversal of perpendicular net magnetization in a ferrimagnet by manipulating the electric-field driven polarization within the Pb(Zr0.2Ti0.8)O3 (PZT)/CoGd heterostructure. Electron energy loss spectra and X-ray absorp…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 21 pages, 4 figures

  5. arXiv:2404.13405  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Field-free switching of perpendicular magnetization by cooperation of planar Hall and orbital Hall effects

    Authors: Zelalem Abebe Bekele, Yuan-Yuan Jiang, Kun Lei, Xiukai Lan, Xiangyu Liu, Hui Wen, Ding-Fu Shao, Kaiyou Wang

    Abstract: Spin-orbit torques (SOTs) generated through the conventional spin Hall effect and/or Rashba-Edelstein effect are promising for manipulating magnetization. However, this approach typically exhibits non-deterministic and inefficient behaviour when it comes to switching perpendicular ferromagnets. This limitation poses a challenge for write-in operations in high-density magnetic memory devices. Here,…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 13 pages, 3 figures, submitted to Nat. Commun

  6. arXiv:2404.01622  [pdf, ps, other]

    cs.HC cs.AI cs.GR

    Gen4DS: Workshop on Data Storytelling in an Era of Generative AI

    Authors: Xingyu Lan, Leni Yang, Zezhong Wang, Yun Wang, Danqing Shi, Sheelagh Carpendale

    Abstract: Storytelling is an ancient and precious human ability that has been rejuvenated in the digital age. Over the last decade, there has been a notable surge in the recognition and application of data storytelling, both in academia and industry. Recently, the rapid development of generative AI has brought new opportunities and challenges to this field, sparking numerous new questions. These questions m…

    Submitted 5 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  7. arXiv:2403.10750  [pdf, other]

    cs.CL cs.AI

    Depression Detection on Social Media with Large Language Models

    Authors: Xiaochong Lan, Yiming Cheng, Li Sheng, Chen Gao, Yong Li

    Abstract: Depression harms. However, due to a lack of mental health awareness and fear of stigma, many patients do not actively seek diagnosis and treatment, leading to detrimental outcomes. Depression detection aims to determine whether an individual suffers from depression by analyzing their history of posts on social media, which can significantly aid in early detection and intervention. It mainly faces…

    Submitted 15 March, 2024; originally announced March 2024.

  8. arXiv:2402.19231  [pdf, other]

    cs.CV cs.RO

    CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

    Authors: Feng Lu, Xiangyuan Lan, Lijun Zhang, Dongmei Jiang, Yaowei Wang, Chun Yuan

    Abstract: Over the past decade, most methods in visual place recognition (VPR) have used neural networks to produce feature representations. These networks typically produce a global representation of a place image using only this image itself and neglect the cross-image variations (e.g. viewpoint and illumination), which limits their robustness in challenging scenes. In this paper, we propose a robust glob…

    Submitted 1 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR2024

  9. arXiv:2402.17978  [pdf, other]

    cs.LG cs.AI cs.MA

    Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

    Authors: Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen, Xuguang Lan

    Abstract: Effective exploration is crucial to discovering optimal strategies for multi-agent reinforcement learning (MARL) in complex coordination tasks. Existing methods mainly utilize intrinsic rewards to enable committed exploration or use role-based learning for decomposing joint action spaces instead of directly conducting a collective search in the entire action-observation space. However, they often…

    Submitted 1 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence

  10. Deep Homography Estimation for Visual Place Recognition

    Authors: Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan

    Abstract: Visual place recognition (VPR) is a fundamental task for many applications such as robot localization and augmented reality. Recently, hierarchical VPR methods have received considerable attention due to the trade-off between accuracy and efficiency. They usually first use global features to retrieve the candidate images, then verify the spatial consistency of matched local features for re-ran…

    Submitted 18 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

    Journal ref: AAAI 2024

  11. arXiv:2402.14505  [pdf, other]

    cs.CV cs.AI

    Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

    Authors: Feng Lu, Lijun Zhang, Xiangyuan Lan, Shuting Dong, Yaowei Wang, Chun Yuan

    Abstract: Recent studies show that vision models pre-trained in generic visual learning tasks with large-scale data can provide useful feature representations for a wide range of visual perception problems. However, few attempts have been made to exploit pre-trained foundation models in visual place recognition (VPR). Due to the inherent difference in training objectives and data between the tasks of model…

    Submitted 3 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: ICLR2024

  12. arXiv:2402.11816  [pdf, other]

    cs.CV cs.LG

    Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

    Authors: Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi

    Abstract: Self-Supervised Contrastive Learning has proven effective in deriving high-quality representations from unlabeled data. However, a major challenge that hinders both unimodal and multimodal contrastive learning is feature suppression, a phenomenon where the trained model captures only a limited portion of the information from the input data while overlooking other potentially valuable content. This…

    Submitted 15 July, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: ECCV 2024 Camera-Ready

  13. arXiv:2402.11792  [pdf, other]

    cs.RO

    SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction

    Authors: Jie Xu, Hanbo Zhang, Xinghang Li, Huaping Liu, Xuguang Lan, Tao Kong

    Abstract: Linguistic ambiguity is ubiquitous in our daily lives. Previous works adopted interaction between robots and humans for language disambiguation. Nevertheless, when interactive robots are deployed in daily environments, there are significant challenges for natural human-robot interaction, stemming from complex and unpredictable visual inputs, open-ended interaction, and diverse user demands. In thi…

    Submitted 19 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  14. arXiv:2402.03699  [pdf]

    cs.RO cs.CV

    Automatic Robotic Development through Collaborative Framework by Large Language Models

    Authors: Zhirong Luan, Yujun Lai, Rundong Huang, Xiaruiqi Lan, Liangjun Chen, Badong Chen

    Abstract: Despite the remarkable code generation abilities of large language models (LLMs), they still face challenges in complex task handling. Robot development, a highly intricate field, inherently demands human involvement in task allocation and collaborative teamwork. To enhance robot development, we propose an innovative automated collaboration framework inspired by real-world robot developers. This fr…

    Submitted 16 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  15. arXiv:2401.16699  [pdf, other]

    cs.RO

    Towards Unified Interactive Visual Grounding in The Wild

    Authors: Jie Xu, Hanbo Zhang, Qingyi Si, Yifeng Li, Xuguang Lan, Tao Kong

    Abstract: Interactive visual grounding in Human-Robot Interaction (HRI) is challenging yet practical due to the inevitable ambiguity in natural languages. It requires robots to disambiguate the user input by active information gathering. Previous approaches often rely on predefined templates to ask disambiguation questions, resulting in performance reduction in realistic interactive scenarios. In this paper…

    Submitted 18 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to ICRA 2024

  16. arXiv:2401.16355  [pdf, other]

    cs.CV

    PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

    Authors: Yuxuan Sun, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin, Lin Yang

    Abstract: The emergence of large multimodal models has unlocked remarkable potential in AI, particularly in pathology. However, the lack of specialized, high-quality benchmarks has impeded their development and precise evaluation. To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs). It comprises 33,428 multimodal multi-cho…

    Submitted 20 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 27 pages, 12 figures

  17. arXiv:2401.05671  [pdf]

    cond-mat.mtrl-sci

    Deciphering Interphase Instability of Lithium Metal Batteries with Localized High-Concentration Electrolytes at Elevated Temperatures

    Authors: Tao Meng, Shanshan Yang, Yitong Peng, Xiwei Lan, Pingan Li, Kangjia Hu, Xianluo Hu

    Abstract: Lithium metal batteries (LMBs), when coupled with a localized high-concentration electrolyte and a high-voltage nickel-rich cathode, offer a solution to the increasing demand for high energy density and long cycle life. However, the aggressive electrode chemistry poses safety risks to LMBs at higher temperatures and cutoff voltages. Here, we decipher the interphase instability in LHCE-based LMBs w…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 10 pages, 8 figures

  18. arXiv:2312.11970  [pdf, other]

    cs.AI cs.CL cs.CY cs.MA

    Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives

    Authors: Chen Gao, Xiaochong Lan, Nian Li, Yuan Yuan, Jingtao Ding, Zhilun Zhou, Fengli Xu, Yong Li

    Abstract: Agent-based modeling and simulation has evolved as a powerful tool for modeling complex systems, offering insights into emergent behaviors and interactions among diverse agents. Integrating large language models into agent-based modeling and simulation presents a promising avenue for enhancing simulation capabilities. This paper surveys the landscape of utilizing large language models in agent-bas…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 37 pages

  19. arXiv:2310.10467  [pdf, other]

    cs.CL cs.AI

    Stance Detection with Collaborative Role-Infused LLM-Based Agents

    Authors: Xiaochong Lan, Chen Gao, Depeng Jin, Yong Li

    Abstract: Stance detection automatically detects the stance in a text towards a target, vital for content analysis in web and social media research. Despite their promising capabilities, LLMs encounter challenges when directly applied to stance detection. First, stance detection demands multi-aspect knowledge, from deciphering event-related terminologies to understanding the expression styles in social medi…

    Submitted 16 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  20. arXiv:2310.05694  [pdf, other]

    cs.CL

    A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics

    Authors: Kai He, Rui Mao, Qika Lin, Yucheng Ruan, Xiang Lan, Mengling Feng, Erik Cambria

    Abstract: The utilization of large language models (LLMs) in the Healthcare domain has generated both excitement and concern due to their ability to effectively respond to free-text queries with certain professional knowledge. This survey outlines the capabilities of the currently developed LLMs for Healthcare and explicates their development process, with the aim of providing an overview of the development…

    Submitted 11 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

  21. arXiv:2309.15983  [pdf, other]

    stat.ME econ.EM stat.AP

    What To Do (and Not to Do) with Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study

    Authors: Albert Chiu, Xingchen Lan, Ziyi Liu, Yiqing Xu

    Abstract: Two-way fixed effects (TWFE) models are ubiquitous in causal panel analysis in political science. However, recent methodological discussions challenge their validity in the presence of heterogeneous treatment effects (HTE) and violations of the parallel trends assumption (PTA). This burgeoning literature has introduced multiple estimators and diagnostics, leading to confusion among empirical resea…

    Submitted 14 June, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

  22. FoodSAM: Any Food Segmentation

    Authors: Xing Lan, Jiayi Lyu, Hanyu Jiang, Kun Dong, Zehai Niu, Yi Zhang, Jian Xue

    Abstract: In this paper, we explore the zero-shot capability of the Segment Anything Model (SAM) for food image segmentation. To address the lack of class-specific information in SAM-generated masks, we propose a novel framework, called FoodSAM. This innovative approach integrates the coarse semantic mask with SAM-generated masks to enhance semantic segmentation quality. Besides, we recognize that the ingre…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: Code is available at https://github.com/jamesjg/FoodSAM

  23. arXiv:2308.02831  [pdf, other]

    cs.HC

    Affective Visualization Design: Leveraging the Emotional Impact of Data

    Authors: Xingyu Lan, Yanqiu Wu, Nan Cao

    Abstract: In recent years, more and more researchers have reflected on the undervaluation of emotion in data visualization and highlighted the importance of considering human emotion in visualization design. Meanwhile, an increasing number of studies have been conducted to explore emotion-related factors. However, so far, this research area is still in its early stages and faces a set of challenges, such as…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: to appear at IEEE VIS 2023

  24. NEON: Living Needs Prediction System in Meituan

    Authors: Xiaochong Lan, Chen Gao, Shiqi Wen, Xiuqi Chen, Yingge Che, Han Zhang, Huazhou Wei, Hengliang Luo, Yong Li

    Abstract: Living needs refer to the various needs in humans' daily lives for survival and well-being, including food, housing, entertainment, etc. On life service platforms that connect users to service providers, such as Meituan, the problem of living needs prediction is fundamental as it helps understand users and boost various downstream applications such as personalized recommendation. However, the prob…

    Submitted 31 July, 2023; originally announced July 2023.

  25. Fermi-LAT detection of A new starburst galaxy candidate: IRAS 13052-5711

    Authors: Yunchuan Xiang, Qingquan Jiang, Xiaofei Lan

    Abstract: A likely starburst galaxy (SBG), IRAS 13052-5711, which is the most distant SBG candidate discovered to date, was found by analyzing 14.4 years of data from the Fermi Large Area Telescope (Fermi-LAT). This SBG's significance level is approximately 6.55σ in the 0.1-500 GeV band. Its spatial position is close to that of 4FGL J1308.9-5730, determined from the Fermi large telescope fourth-source Cat…

    Submitted 29 July, 2023; originally announced July 2023.

  26. arXiv:2307.14984  [pdf, other]

    cs.SI

    S3: Social-network Simulation System with Large Language Model-Empowered Agents

    Authors: Chen Gao, Xiaochong Lan, Zhihong Lu, Jinzhu Mao, Jinghua Piao, Huandong Wang, Depeng Jin, Yong Li

    Abstract: Social network simulation plays a crucial role in addressing various challenges within social science. It offers extensive applications such as state prediction, phenomena explanation, and policy-making support, among others. In this work, we harness the formidable human-like capabilities exhibited by large language models (LLMs) in sensing, reasoning, and behaving, and utilize these qualities to…

    Submitted 19 October, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

  27. arXiv:2307.11458  [pdf, other]

    cs.CV

    Strip-MLP: Efficient Token Interaction for Vision MLP

    Authors: Guiping Cao, Shengda Luo, Wenjian Huang, Xiangyuan Lan, Dongmei Jiang, Yaowei Wang, Jianguo Zhang

    Abstract: Token interaction operation is one of the core modules in MLP-based models to exchange and aggregate information between different spatial locations. However, the power of token interaction on the spatial dimension is highly dependent on the spatial resolution of the feature maps, which limits the model's expressive ability, especially in deep layers where the features are down-sampled to a small s…

    Submitted 21 July, 2023; originally announced July 2023.

  28. arXiv:2307.09193  [pdf, other]

    cs.AI cs.IR

    ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint

    Authors: Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan

    Abstract: Large-scale online recommender systems are spread all over the Internet and are in charge of two basic tasks: Click-Through Rate (CTR) and Post-Click Conversion Rate (CVR) estimation. However, traditional CVR estimators suffer from well-known Sample Selection Bias and Data Sparsity issues. Entire space models were proposed to address the two issues via tracing the decision-making path of "exposure_clic…

    Submitted 29 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  29. arXiv:2305.12624  [pdf, other]

    stat.ME

    Scalable regression calibration approaches to correcting measurement error in multi-level generalized functional linear regression models with heteroscedastic measurement errors

    Authors: Yuanyuan Luan, Roger S. Zoh, Erjia Cui, Xue Lan, Sneha Jadhav, Carmen D. Tekwe

    Abstract: Wearable devices permit the continuous monitoring of biological processes, such as blood glucose metabolism, and behavior, such as sleep quality and physical activity. The continuous monitoring often occurs in epochs of 60 seconds over multiple days, resulting in high dimensional longitudinal curves that are best described and analyzed as functional data. From this perspective, the functional data…

    Submitted 20 April, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

  30. arXiv:2304.12592  [pdf, other]

    cs.CV cs.AI

    MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes

    Authors: Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng

    Abstract: Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important to ensure the safety and reliability of grasping in object-stacked scenes. Previous works infer manipulation relationships using deep neural networks trained with data collected from a predefined view, which has limitations under visual dislocation in unstructured environments. Multi-vi…

    Submitted 25 April, 2023; originally announced April 2023.

  31. arXiv:2304.01171  [pdf, other]

    cs.CV

    Revisiting Context Aggregation for Image Matting

    Authors: Qinglin Liu, Xiaoqian Lv, Quanling Meng, Zonglin Li, Xiangyuan Lan, Shuo Yang, Shengping Zhang, Liqiang Nie

    Abstract: Traditional studies emphasize the significance of context information in improving matting performance. Consequently, deep learning-based matting methods delve into designing pooling or affinity-based context aggregation modules to achieve superior results. However, these modules cannot handle well the context scale shift caused by the difference in image size during training and inference, result…

    Submitted 14 May, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

  32. arXiv:2303.17408  [pdf, other]

    cs.CL

    P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data

    Authors: Yucheng Ruan, Xiang Lan, Daniel J. Tan, Hairil Rizal Abdullah, Mengling Feng

    Abstract: Medical tabular data, abundant in Electronic Health Records (EHRs), is a valuable resource for diverse medical tasks such as risk prediction. While deep learning approaches, particularly transformer-based models, have shown remarkable performance in tabular data prediction, problems remain for existing work to be effectively adapted to the medical domain, such as under-utilization…

    Submitted 9 January, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

  33. arXiv:2303.07828  [pdf, other]

    cs.RO

    Prioritized Planning for Target-Oriented Manipulation via Hierarchical Stacking Relationship Prediction

    Authors: Zewen Wu, Jian Tang, Xingyu Chen, Chengzhong Ma, Xuguang Lan, Nanning Zheng

    Abstract: In scenarios involving the grasping of multiple targets, the learning of stacking relationships between objects is fundamental for robots to execute safely and efficiently. However, current methods lack subdivision for the hierarchy of stacking relationship types. In scenes where objects are mostly stacked in an orderly manner, they are incapable of performing human-like and highly efficient graspin…

    Submitted 25 June, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: 8 pages, 8 figures. Accepted by 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

  34. arXiv:2302.03357  [pdf, other]

    cs.LG

    Towards Enhancing Time Series Contrastive Learning: A Dynamic Bad Pair Mining Approach

    Authors: Xiang Lan, Hanshu Yan, Shenda Hong, Mengling Feng

    Abstract: Not all positive pairs are beneficial to time series contrastive learning. In this paper, we study two types of bad positive pairs that can impair the quality of time series representation learned through contrastive learning: the noisy positive pair and the faulty positive pair. We observe that, with the presence of noisy positive pairs, the model tends to simply learn the pattern of noise (Noisy…

    Submitted 28 March, 2024; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: ICLR 2024 Camera Ready (https://openreview.net/pdf?id=K2c04ulKXn)

  35. arXiv:2211.12075  [pdf, other]

    cs.MA cs.LG

    Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning

    Authors: Lipeng Wan, Zeyang Liu, Xingyu Chen, Xuguang Lan, Nanning Zheng

    Abstract: Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning methods with linear value decomposition (LVD) or monotonic value decomposition (MVD) suffer from relative overgeneralization. As a result, they cannot ensure optimal consistency (i.e., the correspondence between individual greedy actions and the maximal true Q value). In this paper, we derive th…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2112.04454

  36. arXiv:2211.03296  [pdf, other]

    cs.HC

    The Chart Excites Me! Exploring How Data Visualization Design Influences Affective Arousal

    Authors: Xingyu Lan, Yanqiu Wu, Qing Chen, Nan Cao

    Abstract: As data visualizations have been increasingly applied in mass communication, designers often seek to grasp viewers immediately and motivate them to read more. Such goals, as suggested by previous research, are closely associated with the activation of emotion, namely affective arousal. Given this motivation, this work takes initial steps toward understanding the arousal-related factors in data vis…

    Submitted 6 November, 2022; originally announced November 2022.

  37. arXiv:2209.03642  [pdf, other]

    cs.HC

    VizBelle: A Design Space of Embellishments for Data Visualization

    Authors: Qing Chen, Ziyan Liu, Chengwei Wang, Xingyu Lan, Ying Chen, Siming Chen, Nan Cao

    Abstract: Visual embellishments, as a form of non-linguistic rhetorical figures, are used to help convey abstract concepts or attract readers' attention. Creating data visualizations with appropriate and visually pleasing embellishments is challenging since this process largely depends on the experience and the aesthetic taste of designers. To help facilitate designers in the ideation and creation process,…

    Submitted 8 September, 2022; originally announced September 2022.

  38. arXiv:2208.07156  [pdf, other]

    cs.NE cs.CE cs.MA eess.SY

    Cooperative guidance of multiple missiles: a hybrid co-evolutionary approach

    Authors: Xuejing Lan, Junda Chen, Zhijia Zhao, Tao Zou

    Abstract: Cooperative guidance of multiple missiles is a challenging task with rigorous constraints of time and space consensus, especially when attacking dynamic targets. In this paper, the cooperative guidance task is described as a distributed multi-objective cooperative optimization problem. To address the issues of non-stationarity and continuous control faced by cooperative guidance, the natural evolu…

    Submitted 14 April, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

    ACM Class: F.2.2; J.2

  39. arXiv:2208.04518  [pdf]

    cs.CY cs.HC

    A Mixed-Methods Analysis of the Algorithm-Mediated Labor of Online Food Deliverers in China

    Authors: Zhilong Chen, Xiaochong Lan, Jinghua Piao, Yunke Zhang, Yong Li

    Abstract: In recent years, China has witnessed the proliferation and success of the online food delivery industry, an emerging type of the gig economy. Online food deliverers who deliver the food from restaurants to customers play a critical role in enabling this industry. Mediated by algorithms and coupled with interactions with multiple stakeholders, this emerging kind of labor has been taken by millions…

    Submitted 26 August, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: Accepted to CSCW 2022

  40. arXiv:2208.04122  [pdf]

    cs.CY cs.HC cs.IR

    Practitioners Versus Users: A Value-Sensitive Evaluation of Current Industrial Recommender System Design

    Authors: Zhilong Chen, Jinghua Piao, Xiaochong Lan, Hancheng Cao, Chen Gao, Zhicong Lu, Yong Li

    Abstract: Recommender systems are playing an increasingly important role in alleviating information overload and supporting users' various needs, e.g., consumption, socialization, and entertainment. However, limited research focuses on how values should be extensively considered in industrial deployments of recommender systems, the ignorance of which can be problematic. To fill this gap, in this paper, we a…

    Submitted 26 August, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: Zhilong Chen and Jinghua Piao contribute equally to this work; Accepted to CSCW 2022

  41. Interpreting time-integrated polarization data of gamma-ray burst prompt emission

    Authors: R. Y. Guan, M. X. Lan

    Abstract: Aims. With the accumulation of polarization data in the gamma-ray burst (GRB) prompt phase, polarization models can be tested. Methods. We predicted the time-integrated polarizations of 37 GRBs with polarization observations. We used their observed spectral parameters to do this. In the model, the emission mechanism is synchrotron radiation, and the magnetic field configuration in the emission regi…

    Submitted 7 October, 2022; v1 submitted 7 August, 2022; originally announced August 2022.

    Comments: 6 pages, 5 figures, with updated AstroSat data, accepted by A&A

    Journal ref: A&A 670, A160 (2023)

  42. arXiv:2207.08794  [pdf, other]

    cs.CV cs.RO

    DeFlowSLAM: Self-Supervised Scene Motion Decomposition for Dynamic Dense SLAM

    Authors: Weicai Ye, Xingyuan Yu, Xinyue Lan, Yuhang Ming, Jinyu Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

    Abstract: We present a novel dual-flow representation of scene motion that decomposes the optical flow into a static flow field caused by the camera motion and another dynamic flow field caused by the objects' movements in the scene. Based on this representation, we present a dynamic SLAM, dubbed DeFlowSLAM, that exploits both static and dynamic pixels in the images to solve the camera poses, rather than si…

    Submitted 13 January, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Homepage: https://zju3dv.github.io/deflowslam

  43. arXiv:2207.02705  [pdf, other]

    cs.NI cs.IT

    Incentivizing Proof-of-Stake Blockchain for Secured Data Collection in UAV-Assisted IoT: A Multi-Agent Reinforcement Learning Approach

    Authors: Xiao Tang, Xunqiang Lan, Lixin Li, Yan Zhang, Zhu Han

    Abstract: The Internet of Things (IoT) can be conveniently deployed while empowering various applications, where the IoT nodes can form clusters to finish certain missions collectively. In this paper, we propose to employ unmanned aerial vehicles (UAVs) to assist the clustered IoT data collection with blockchain-based security provisioning. In particular, the UAVs generate candidate blocks based on the coll…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: 14 pages, 10 figures, submitted to IEEE Journal

  44. arXiv:2207.01610  [pdf, other]

    cs.CV cs.RO

    PVO: Panoptic Visual Odometry

    Authors: Weicai Ye, Xinyue Lan, Shuo Chen, Yuhang Ming, Xingyuan Yu, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

    Abstract: We present PVO, a novel panoptic visual odometry framework to achieve more comprehensive modeling of the scene motion, geometry, and panoptic segmentation information. Our PVO models visual odometry (VO) and video panoptic segmentation (VPS) in a unified view, which makes the two tasks mutually beneficial. Specifically, we introduce a panoptic update module into the VO Module with the guidance of…

    Submitted 26 March, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: CVPR2023 Project page: https://zju3dv.github.io/pvo/ code: https://github.com/zju3dv/PVO

  45. arXiv:2203.05243  [pdf, other]

    cs.CV cs.CL cs.MM

    A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach

    Authors: Xiaohan Lan, Yitian Yuan, Xin Wang, Long Chen, Zhi Wang, Lin Ma, Wenwu Zhu

    Abstract: Temporal Sentence Grounding in Videos (TSGV), which aims to ground a natural language sentence in an untrimmed video, has drawn widespread attention over the past few years. However, recent studies have found that current benchmark datasets may have obvious moment annotation biases, enabling several simple baselines even without training to achieve SOTA performance. In this paper, we take a closer…

    Submitted 10 March, 2022; originally announced March 2022.

  46. arXiv:2203.01217  [pdf, other]

    cs.CV

    Hybrid Tracker with Pixel and Instance for Video Panoptic Segmentation

    Authors: Weicai Ye, Xinyue Lan, Ge Su, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

    Abstract: Video Panoptic Segmentation (VPS) aims to generate coherent panoptic segmentation and track the identities of all pixels across video frames. Existing methods predominantly utilize the trained instance embedding to keep the consistency of panoptic segmentation. However, they inevitably struggle to cope with the challenges of small objects, similar appearance but inconsistent identities, occlusion,…

    Submitted 11 December, 2023; v1 submitted 2 March, 2022; originally announced March 2022.

  47. arXiv:2203.00865  [pdf]

    cs.CY cs.HC

    Beyond Virtual Bazaar: How Social Commerce Promotes Inclusivity for the Traditionally Underserved Community in Chinese Developing Regions

    Authors: Zhilong Chen, Hancheng Cao, Xiaochong Lan, Zhicong Lu, Yong Li

    Abstract: The disadvantaged population is often underserved and marginalized in technology engagement: prior works show they are generally more reluctant and experience more barriers in adopting and engaging with mainstream technology. Here, we contribute to the HCI4D and ICTD literature through a novel "counter" case study on Chinese social commerce (e.g., Pinduoduo), which 1) first prospers among the trad…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: Zhilong Chen and Hancheng Cao contribute equally to this work; Accepted to CHI 2022

  48. arXiv:2202.03631  [pdf, ps, other]

    cs.RO

    Robotic Grasping from Classical to Modern: A Survey

    Authors: Hanbo Zhang, Jian Tang, Shiguang Sun, Xuguang Lan

    Abstract: Robotic Grasping has always been an active topic in robotics since grasping is one of the fundamental but most challenging skills of robots. It demands the coordination of robotic perception, planning, and control for robustness and intelligence. However, current solutions are still far behind humans, especially when confronting unstructured scenarios. In this paper, we survey the advances of robo…

    Submitted 7 February, 2022; originally announced February 2022.

  49. arXiv:2112.05101  [pdf]

    astro-ph.IM

    The In-Flight Realtime Trigger and Localization Software of GECAM

    Authors: Xiao-Yun Zhao, Shao-Lin Xiong, Xiang-Yang Wen, Xin-Qiao Li, Ce Cai, Shuo Xiao, Qi Luo, Wen-Xi Peng, Dong-Ya Guo, Zheng-Hua An, Ke Gong, Jin-Yuan Liao, Yan-Qiu Zhang, Yue Huang, Lu Li, Xing Wen, Fei Zhang, Jing Duan, Chen-Wei Wang, Dong-Li Shi, Peng Zhang, Qi-Bin Yi, Chao-Yang Li, Yan-Bing Xu, Xiao-Hua Liang, et al. (64 additional authors not shown)

    Abstract: Realtime trigger and localization of bursts are the key functions of GECAM, which is an all-sky gamma-ray monitor launched in Dec 10, 2020. We developed a multifunctional trigger and localization software operating on the CPU of the GECAM electronic box (EBOX). This onboard software has the following features: high trigger efficiency for real celestial bursts with a suppression of false triggers c…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: Draft, comments welcome

  50. arXiv:2112.04454  [pdf, other]

    cs.MA

    Greedy-based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning

    Authors: Lipeng Wan, Zeyang Liu, Xingyu Chen, Han Wang, Xuguang Lan

    Abstract: Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning methods with linear value decomposition (LVD) or monotonic value decomposition (MVD) suffer from relative overgeneralization. As a result, they can not ensure optimal consistency (i.e., the correspondence between individual greedy actions and the maximal true Q value). In this paper, we derive th…

    Submitted 3 July, 2022; v1 submitted 8 December, 2021; originally announced December 2021.