Showing 1–50 of 89 results for author: Yuan, F

  1. arXiv:2407.04521  [pdf, ps, other]

    math.OC cs.LG q-fin.CP

    Unified continuous-time q-learning for mean-field game and mean-field control problems

    Authors: Xiaoli Wei, Xiang Yu, Fengyi Yuan

    Abstract: This paper studies the continuous-time q-learning in the mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge when the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provide…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2406.10246  [pdf, other]

    cs.IR cs.AI

    Semantic-Enhanced Relational Metric Learning for Recommender Systems

    Authors: Mingming Li, Fuqing Zhu, Feng Yuan, Songlin Hu

    Abstract: Recently, relational metric learning methods have been received great attention in recommendation community, which is inspired by the translation mechanism in knowledge graph. Different from the knowledge graph where the entity-to-entity relations are given in advance, historical interactions lack explicit relations between users and items in recommender systems. Currently, many researchers have s…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.17386  [pdf, other]

    cs.CL cs.AI

    MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

    Authors: Zixian Huang, Wenhao Zhu, Gong Cheng, Lei Li, Fei Yuan

    Abstract: Reasoning capabilities are crucial for Large Language Models (LLMs), yet a notable gap exists between English and non-English languages. To bridge this disparity, some works fine-tune LLMs to relearn reasoning capabilities in non-English languages, while others replace non-English inputs with an external model's outputs such as English translation text to circumvent the challenge of LLM understand…

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.16799  [pdf, other]

    cs.LG

    Dual-State Personalized Knowledge Tracing with Emotional Incorporation

    Authors: Shanshan Wang, Fangzheng Yuan, Keyang Wang, Xun Yang, Xingyi Zhang, Meng Wang

    Abstract: Knowledge tracing has been widely used in online learning systems to guide the students' future learning. However, most existing KT models primarily focus on extracting abundant information from the question sets and explore the relationships between them, but ignore the personalized student behavioral information in the learning process. This will limit the model's ability to accurately capture t…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2405.12530  [pdf, other]

    cs.NI

    Multi-hop Multi-RIS Wireless Communication Systems: Multi-reflection Path Scheduling and Beamforming

    Authors: Xiaoyan Ma, Haixia Zhang, Xianhao Chen, Yuguang Fang, Dongfeng Yuan

    Abstract: Reconfigurable intelligent surface (RIS) provides a promising way to proactively augment propagation environments for better transmission performance in wireless communications. Existing multi-RIS works mainly focus on link-level optimization with predetermined transmission paths, which cannot be directly extended to system-level management, since they neither consider the interference caused by u…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Transactions on Wireless Communication

  6. arXiv:2405.01345  [pdf, other]

    cs.CL

    The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights

    Authors: Wenhao Zhu, Shujian Huang, Fei Yuan, Cheng Chen, Jiajun Chen, Alexandra Birch

    Abstract: Bridging the significant gap between large language model's English and non-English performance presents a great challenge. While some previous studies attempt to mitigate this gap with translated training data, the recently proposed question alignment approach leverages the model's English expertise to improve multilingual performance with minimum usage of expensive, error-prone translation. In t…

    Submitted 29 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  7. arXiv:2404.10393  [pdf, other]

    cs.LG cs.AI

    Offline Trajectory Generalization for Offline Reinforcement Learning

    Authors: Ziqi Zhao, Zhaochun Ren, Liu Yang, Fajie Yuan, Pengjie Ren, Zhumin Chen, Jun Ma, Xin Xin

    Abstract: Offline reinforcement learning (RL) aims to learn policies from static datasets of previously collected trajectories. Existing methods for offline RL either constrain the learned policy to the support of offline data or utilize model-based virtual environments to generate simulated rollouts. However, these methods suffer from (i) poor generalization to unseen states; and (ii) trivial improvement f…

    Submitted 16 April, 2024; originally announced April 2024.

  8. arXiv:2403.19347  [pdf, other]

    cs.IR cs.AI

    Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors

    Authors: Binzong Geng, Zhaoxin Huan, Xiaolu Zhang, Yong He, Liang Zhang, Fajie Yuan, Jun Zhou, Linjian Mo

    Abstract: With the rise of large language models (LLMs), recent works have leveraged LLMs to improve the performance of click-through rate (CTR) prediction. However, we argue that a critical obstacle remains in deploying LLMs for practical use: the efficiency of LLMs when processing long textual user behaviors. As user sequences grow longer, the current efficiency of LLMs is inadequate for training on billi…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024

  9. arXiv:2403.14734  [pdf, other]

    cs.SE cs.AI cs.CL cs.PL

    A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

    Authors: Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, Xiaoli Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

    Abstract: Neural Code Intelligence -- leveraging deep learning to understand, generate, and optimize code -- holds immense potential for transformative impacts on the whole society. Bridging the gap between Natural Language and Programming Language, this domain has drawn significant attention from researchers in both research communities over the past few years. This survey presents a systematic and chronol…

    Submitted 23 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: 64 pages, 6 figures, 10 tables, 692 references

  10. Local positional graphs and attentive local features for a data and runtime-efficient hierarchical place recognition pipeline

    Authors: Fangming Yuan, Stefan Schubert, Peter Protzel, Peer Neubert

    Abstract: Large-scale applications of Visual Place Recognition (VPR) require computationally efficient approaches. Further, a well-balanced combination of data-based and training-free approaches can decrease the required amount of training data and effort and can reduce the influence of distribution shifts between the training and application phases. This paper proposes a runtime and data-efficient hierarch…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: IEEE Robotics and Automation Letters (RA-L)

  11. arXiv:2403.07623  [pdf, other]

    cs.IR

    Empowering Sequential Recommendation from Collaborative Signals and Semantic Relatedness

    Authors: Mingyue Cheng, Hao Zhang, Qi Liu, Fajie Yuan, Zhi Li, Zhenya Huang, Enhong Chen, Jun Zhou, Longfei Li

    Abstract: Sequential recommender systems (SRS) could capture dynamic user preferences by modeling historical behaviors ordered in time. Despite effectiveness, focusing only on the collaborative signals from behaviors does not fully grasp user interests. It is also significant to model the semantic relatedness reflected in content features, e.g., images and text. Towards that end, in this p…

    Submitted 12 March, 2024; originally announced March 2024.

  12. arXiv:2402.19118  [pdf, other]

    cs.CV

    Continuous Sign Language Recognition Based on Motor attention mechanism and frame-level Self-distillation

    Authors: Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

    Abstract: Changes in facial expression, head movement, body movement and gesture movement are remarkable cues in sign language recognition, and most of the current continuous sign language recognition(CSLR) research methods mainly focus on static images in video sequences at the frame-level feature extraction stage, while ignoring the dynamic changes in the images. In this paper, we propose a novel motor at…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 10 pages, 7 figures

  13. arXiv:2402.18070  [pdf, other]

    cs.AR eess.SP

    A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

    Authors: Limin Jiang, Yi Shi, Haiqin Hu, Qingyu Deng, Siyi Xu, Yintao Liu, Feng Yuan, Si Wang, Yihao Shen, Fangfang Ye, Shan Cao, Zhiyuan Jiang

    Abstract: Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphic processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures, conference

  14. arXiv:2402.02801  [pdf, other]

    cs.CL cs.AI cs.LG

    KS-Lottery: Finding Certified Lottery Tickets for Multilingual Language Models

    Authors: Fei Yuan, Chang Ma, Shuai Yuan, Qiushi Sun, Lei Li

    Abstract: The lottery ticket hypothesis posits the existence of "winning tickets" within a randomly initialized neural network. Do winning tickets exist for LLMs in fine-tuning scenarios? How can we find such winning tickets? In this paper, we propose KS-Lottery, a method to identify a small subset of LLM parameters highly effective in multilingual fine-tuning. Our key idea is to use Kolmogorov-Smirnov Te…

    Submitted 3 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  15. arXiv:2402.02070  [pdf, other]

    cs.DB

    HotRAP: Hot Record Retention and Promotion for LSM-trees with tiered storage

    Authors: Jiansheng Qiu, Fangzhou Yuan, Huanchen Zhang

    Abstract: The multi-level design of Log-Structured Merge-trees (LSM-trees) naturally fits the tiered storage architecture: the upper levels (recently inserted/updated records) are kept in fast storage to guarantee performance while the lower levels (the majority of records) are placed in slower but cheaper storage to reduce cost. However, frequently accessed records may have been compacted and reside in slo…

    Submitted 3 February, 2024; originally announced February 2024.

  16. arXiv:2401.07817  [pdf, other]

    cs.CL

    Question Translation Training for Better Multilingual Reasoning

    Authors: Wenhao Zhu, Shujian Huang, Fei Yuan, Shuaijie She, Jiajun Chen, Alexandra Birch

    Abstract: Large language models show compelling performance on reasoning tasks but they tend to perform much worse in languages other than English. This is unsurprising given that their training data largely consists of English text and instructions. A typical solution is to translate instruction data into all languages of interest, and then train on the resulting multilingual data, which is called translat…

    Submitted 29 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted to Findings of ACL 2024

  17. arXiv:2401.05391  [pdf]

    cs.AR cs.AI

    Efficient LLM inference solution on Intel GPU

    Authors: Hui Wu, Yi Gan, Feng Yuan, Jing Ma, Wei Zhu, Yutao Xu, Hong Zhu, Yuhua Zhu, Xiaoli Liu, Jinghui Gu, Peng Zhao

    Abstract: Transformer based Large Language Models (LLMs) have been widely used in many fields, and the efficiency of LLM inference becomes hot topic in real applications. However, LLMs are usually complicatedly designed in model structure with massive operations and perform inference in the auto-regressive mode, making it a challenging task to design a system with high efficiency. In this paper, we propos…

    Submitted 23 June, 2024; v1 submitted 19 December, 2023; originally announced January 2024.

  18. arXiv:2312.09602  [pdf, other]

    cs.IR

    Multi-Modality is All You Need for Transferable Recommender Systems

    Authors: Youhua Li, Hanwen Du, Yongxin Ni, Pengpeng Zhao, Qi Guo, Fajie Yuan, Xiaofang Zhou

    Abstract: ID-based Recommender Systems (RecSys), where each item is assigned a unique identifier and subsequently converted into an embedding vector, have dominated the designing of RecSys. Though prevalent, such ID-based paradigm is not suitable for developing transferable RecSys and is also susceptible to the cold-start issue. In this paper, we unleash the boundaries of the ID-based paradigm and propose a…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: ICDE'24 Accepted

  19. arXiv:2312.00336  [pdf, other]

    cs.LG cs.IR

    Hypergraph Node Representation Learning with One-Stage Message Passing

    Authors: Shilin Qu, Weiqing Wang, Yuan-Fang Li, Xin Zhou, Fajie Yuan

    Abstract: Hypergraphs as an expressive and general structure have attracted considerable attention from various research domains. Most existing hypergraph node representation learning techniques are based on graph neural networks, and thus adopt the two-stage message passing paradigm (i.e. node -> hyperedge -> node). This paradigm only focuses on local information propagation and does not effectively take i…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: 11 pages

  20. arXiv:2311.09278  [pdf, other]

    cs.CL cs.AI

    Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models

    Authors: Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan, Qika Lin, Yu Qiao, Jun Liu

    Abstract: Although Large Language Models (LLMs) demonstrate remarkable ability in processing and generating human-like text, they do have limitations when it comes to comprehending and expressing world knowledge that extends beyond the boundaries of natural language(e.g., chemical molecular formula). Injecting a collection of symbolic data directly into the training of LLMs can be problematic, as it disrega…

    Submitted 18 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 23 pages, 13 figures

  21. arXiv:2311.09071  [pdf, other]

    cs.CL cs.AI

    How Vocabulary Sharing Facilitates Multilingualism in LLaMA?

    Authors: Fei Yuan, Shuai Yuan, Zhiyong Wu, Lei Li

    Abstract: Large Language Models (LLMs), often show strong performance on English tasks, while exhibiting limitations on other languages. What is an LLM's multilingual capability when it is trained only on certain languages? The underlying mechanism remains unclear. This study endeavors to examine the multilingual capability of LLMs from the vocabulary sharing perspective by conducting an exhaustive analysis…

    Submitted 3 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACL-2024 Findings

  22. arXiv:2310.17373  [pdf, other]

    cs.IR

    FMMRec: Fairness-aware Multimodal Recommendation

    Authors: Weixin Chen, Li Chen, Yongxin Ni, Yuhan Zhao, Fajie Yuan, Yongfeng Zhang

    Abstract: Recently, multimodal recommendations have gained increasing attention for effectively addressing the data sparsity problem by incorporating modality-based representations. Although multimodal recommendations excel in accuracy, the introduction of different modalities (e.g., images, text, and audio) may expose more users' sensitive information (e.g., gender and age) to recommender systems, resultin…

    Submitted 26 October, 2023; originally announced October 2023.

  23. CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants

    Authors: Shaoan Wang, Mingzhu Zhu, Yaoqing Hu, Dongyue Li, Fusong Yuan, Junzhi Yu

    Abstract: High-precision pose estimation based on visual markers has been a thriving research topic in the field of computer vision. However, the suitability of traditional flat markers on curved objects is limited due to the diverse shapes of curved surfaces, which hinders the development of high-precision pose estimation for curved objects. Therefore, this paper proposes a novel visual marker called Cylin…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 15 pages, 22 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2024

  24. arXiv:2309.15379  [pdf, other]

    cs.IR

    A Content-Driven Micro-Video Recommendation Dataset at Scale

    Authors: Yongxin Ni, Yu Cheng, Xiangyan Liu, Junchen Fu, Youhua Li, Xiangnan He, Yongfeng Zhang, Fajie Yuan

    Abstract: Micro-videos have recently gained immense popularity, sparking critical research in micro-video recommendation with significant implications for the entertainment, advertising, and e-commerce industries. However, the lack of large-scale public micro-video datasets poses a major challenge for developing effective recommender systems. To address this challenge, we introduce a very large micro-video…

    Submitted 26 September, 2023; originally announced September 2023.

  25. arXiv:2309.07705  [pdf, other]

    cs.IR

    NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation

    Authors: Jiaqi Zhang, Yu Cheng, Yongxin Ni, Yunzhu Pan, Zheng Yuan, Junchen Fu, Youhua Li, Jie Wang, Fajie Yuan

    Abstract: Large foundational models, through upstream pre-training and downstream fine-tuning, have achieved immense success in the broad AI community due to improved model performance and significant reductions in repetitive engineering. By contrast, the transferable one-for-all models in the recommender system field, referred to as TransRec, have made limited progress. The development of TransRec has enco…

    Submitted 17 March, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  26. arXiv:2309.06789  [pdf, other]

    cs.IR

    An Image Dataset for Benchmarking Recommender Systems with Raw Pixels

    Authors: Yu Cheng, Yunzhu Pan, Jiaqi Zhang, Yongxin Ni, Aixin Sun, Fajie Yuan

    Abstract: Recommender systems (RS) have achieved significant success by leveraging explicit identification (ID) features. However, the full potential of content features, especially the pure image pixel features, remains relatively unexplored. The limited availability of large, diverse, and content-driven image recommendation datasets has hindered the use of raw images as item representations. In this regar…

    Submitted 17 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

  27. arXiv:2308.04948  [pdf, other]

    cs.CL

    Extrapolating Large Language Models to Non-English by Aligning Languages

    Authors: Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen, Lei Li

    Abstract: Existing large language models show disparate capability across different languages, due to the imbalance in the training data. Their performances on English tasks are often stronger than on tasks of other languages. In this paper, we empower pre-trained LLMs on non-English languages by building semantic alignment across languages. We start from targeting individual languages by performing cross-l…

    Submitted 9 October, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

  28. arXiv:2306.13678  [pdf]

    cs.CE

    Rigid3D: a hybrid multi-sphere DEM framework for simulation of non-spherical particles in multi-phase flow

    Authors: Fei-Liang Yuan, Martin Sommerfeld, Pradeep Muramulla, Srikanth Gopireddy, Lars Pasternak, Nora Urbanetz, Thomas Profitlich

    Abstract: This article presents the development and validation of a hybrid multi-sphere discrete element framework - Rigid3D, for the simulation of granular systems with arbitrarily shaped particles in 3D space. In this DEM framework, a non-spherical particle is approximated by three different geometric models: (1) multi-sphere model with overlapping spheres (MS model), (2) particle surface with triangle me…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: Manuscript for submission to Springer Journal - Computational Particle Mechanics

  29. Exploring Adapter-based Transfer Learning for Recommender Systems: Empirical Studies and Practical Insights

    Authors: Junchen Fu, Fajie Yuan, Yu Song, Zheng Yuan, Mingyue Cheng, Shenghui Cheng, Jiaqi Zhang, Jie Wang, Yunzhu Pan

    Abstract: Adapters, a plug-in neural network module with some tunable parameters, have emerged as a parameter-efficient transfer learning technique for adapting pre-trained models to downstream tasks, especially for natural language processing (NLP) and computer vision (CV) fields. Meanwhile, learning recommendation models directly from raw item modality features -- e.g., texts of NLP and images of CV -- ca…

    Submitted 8 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by WSDM2024

  30. arXiv:2305.14859  [pdf, other]

    cs.LG cs.CL cs.NE

    Utility-Probability Duality of Neural Networks

    Authors: Huang Bojun, Fei Yuan

    Abstract: It is typically understood that the training of modern neural networks is a process of fitting the probability distribution of desired output. However, recent paradoxical observations in a number of language generation tasks let one wonder if this canonical probability-based explanation can really account for the empirical success of deep learning. To resolve this issue, we propose an alternative…

    Submitted 25 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  31. arXiv:2305.13140  [pdf, other]

    cs.CL

    Extrapolating Multilingual Understanding Models as Multilingual Generators

    Authors: Bohong Wu, Fei Yuan, Hai Zhao, Lei Li, Jingjing Xu

    Abstract: Multilingual understanding models (or encoder-based), pre-trained via masked language modeling, have achieved promising results on many language understanding tasks (e.g., mBERT). However, these non-autoregressive (NAR) models still struggle to generate high-quality texts compared with autoregressive (AR) models. Considering that encoder-based models have the advantage of efficient generation and…

    Submitted 22 May, 2023; originally announced May 2023.

  32. arXiv:2305.11700  [pdf, other]

    cs.IR

    Exploring the Upper Limits of Text-Based Collaborative Filtering Using Large Language Models: Discoveries and Insights

    Authors: Ruyu Li, Wenhao Deng, Yu Cheng, Zheng Yuan, Jiaqi Zhang, Fajie Yuan

    Abstract: Text-based collaborative filtering (TCF) has become the mainstream approach for text and news recommendation, utilizing text encoders, also known as language models (LMs), to represent items. However, existing TCF models primarily focus on using small or medium-sized LMs. It remains uncertain what impact replacing the item encoder with one of the largest and most powerful LMs, such as the 175-bill…

    Submitted 19 May, 2023; originally announced May 2023.

  33. arXiv:2303.13835  [pdf, other]

    cs.IR

    Where to Go Next for Recommender Systems? ID- vs. Modality-based Recommender Models Revisited

    Authors: Zheng Yuan, Fajie Yuan, Yu Song, Youhua Li, Junchen Fu, Fei Yang, Yunzhu Pan, Yongxin Ni

    Abstract: Recommendation models that utilize unique identities (IDs) to represent distinct users and items have been state-of-the-art (SOTA) and dominated the recommender systems (RS) literature for over a decade. Meanwhile, the pre-trained modality encoders, such as BERT and ViT, have become increasingly powerful in modeling the raw modality features of an item, such as text and images. Given this, a natur…

    Submitted 2 May, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

  34. arXiv:2303.06820  [pdf, other]

    cs.CV

    Continuous sign language recognition based on cross-resolution knowledge distillation

    Authors: Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

    Abstract: The goal of continuous sign language recognition(CSLR) research is to apply CSLR models as a communication tool in real life, and the real-time requirement of the models is important. In this paper, we address the model real-time problem through cross-resolution knowledge distillation. In our study, we found that keeping the frame-level feature scales consistent between the output of the student n…

    Submitted 12 March, 2023; originally announced March 2023.

    Comments: 11 pages, 7 figures

  35. arXiv:2212.10551  [pdf, other]

    cs.CL cs.AI

    Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

    Authors: Fei Yuan, Yinquan Lu, WenHao Zhu, Lingpeng Kong, Lei Li, Yu Qiao, Jingjing Xu

    Abstract: Multilingual neural machine translation (MNMT) aims to build a unified model for many language directions. Existing monolithic models for MNMT encounter two challenges: parameter interference among languages and inefficient inference for large models. In this paper, we revisit the classic multi-way structures and develop a detachable model by assigning each language (or group of languages) to an i…

    Submitted 19 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023 Findings

  36. arXiv:2211.11100  [pdf]

    cs.SI

    Data-driven Tracking of the Bounce-back Path after Disasters: Critical Milestones of Population Activity Recovery and Their Spatial Inequality

    Authors: Yuqin Jiang, Faxi Yuan, Hamed Farahmand, Kushal Acharya, Jingdi Zhang, Ali Mostafavi

    Abstract: The ability to measure and track the speed and trajectory of a community's post-disaster recovery is essential to inform resource allocation and prioritization. The current survey-based approaches to examining community recovery, however, have significant lags and put the burden of data collection on affected people. Also, the existing literature lacks quantitative measures for important milestone…

    Submitted 22 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

  37. arXiv:2211.03387  [pdf, other]

    cs.CV

    Temporal superimposed crossover module for effective continuous sign language

    Authors: Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

    Abstract: The ultimate goal of continuous sign language recognition(CSLR) is to facilitate the communication between special people and normal people, which requires a certain degree of real-time and deploy-ability of the model. However, in the previous research on CSLR, little attention has been paid to the real-time and deploy-ability. In order to improve the real-time and deploy-ability of the model, thi…

    Submitted 1 April, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 10 pages, 7 figures

  38. arXiv:2210.10629  [pdf, other]

    cs.IR

    Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems

    Authors: Guanghu Yuan, Fajie Yuan, Yudong Li, Beibei Kong, Shujie Li, Lei Chen, Min Yang, Chenyun Yu, Bo Hu, Zang Li, Yu Xu, Xiaohu Qie

    Abstract: Existing benchmark datasets for recommender systems (RS) either are created at a small scale or involve very limited forms of user feedback. RS models evaluated on such datasets often lack practical values for large-scale real-world applications. In this paper, we describe Tenrec, a novel and publicly available data collection for RS that records various user feedback from four different recommend…

    Submitted 4 June, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  39. arXiv:2207.00928  [pdf, other]

    cs.CV

    Continuous Sign Language Recognition via Temporal Super-Resolution Network

    Authors: Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

    Abstract: Aiming at the problem that the spatial-temporal hierarchical continuous sign language recognition model based on deep learning has a large amount of computation, which limits the real-time application of the model, this paper proposes a temporal super-resolution network(TSRNet). The data is reconstructed into a dense feature sequence to reduce the overall model computation while keeping the final…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: 13 pages, 11 figures

  40. arXiv:2206.13238  [pdf]

    cs.CE

    SR-DEM: an efficient discrete element method for particles with surface of revolution

    Authors: Fei-Liang Yuan

    Abstract: In this paper, the surface of revolution discrete element method (SR-DEM) is introduced to simulate systems of particles with closed surfaces of revolution. Due to the cylindrical symmetry of a surface of revolution, the geometry of any cross-section about the axis of rotation remains the same. Taking advantage of this geometric feature, a node-to-cross-section contact algorithm is proposed for ef…

    Submitted 7 January, 2024; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: Draft for Powder Tech

  41. arXiv:2206.06583  [pdf, other]

    q-bio.QM cs.AI

    Exploring evolution-aware & -free protein language models as protein function predictors

    Authors: Mingyang Hu, Fajie Yuan, Kevin K. Yang, Fusong Ju, Jin Su, Hui Wang, Fei Yang, Qiuyang Ding

    Abstract: Large-scale Protein Language Models (PLMs) have improved performance in protein prediction tasks, ranging from 3D structure prediction to various function predictions. In particular, AlphaFold, a ground-breaking AI system, could potentially reshape structural biology. However, the utility of the PLM module in AlphaFold, Evoformer, has not been explored beyond structure prediction. In this paper, w…

    Submitted 16 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  42. arXiv:2206.06190  [pdf, other]

    cs.IR

    TransRec: Learning Transferable Recommendation from Mixture-of-Modality Feedback

    Authors: Jie Wang, Fajie Yuan, Mingyue Cheng, Joemon M. Jose, Chenyun Yu, Beibei Kong, Xiangnan He, Zhijin Wang, Bo Hu, Zang Li

    Abstract: Learning large-scale pre-trained models on broad-ranging data and then transfer to a wide range of target tasks has become the de facto paradigm in many machine learning (ML) communities. Such big models are not only strong performers in practice but also offer a promising way to break out of the task-specific modeling restrictions, thereby enabling task-agnostic and unified ML systems. However, s…

    Submitted 3 November, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  43. arXiv:2204.03864  [pdf, other]

    cs.CV

    Multi-scale temporal network for continuous sign language recognition

    Authors: Qidan Zhu, Jing Li, Fei Yuan, Quan Gan

    Abstract: Continuous Sign Language Recognition (CSLR) is a challenging research task due to the lack of accurate annotation on the temporal sequence of sign language data. A recently popular approach to CSLR is a hybrid model based on "CNN + RNN". However, when extracting temporal features in these works, most of the methods use a fixed temporal receptive field and cannot extract the temporal features well…

    Submitted 16 August, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: 10 pages, 7 figures

  44. arXiv:2111.00429  [pdf, other]

    cs.IR

    Enhancing Top-N Item Recommendations by Peer Collaboration

    Authors: Yang Sun, Fajie Yuan, Min Yang, Alexandros Karatzoglou, Shen Li, Xiaoyan Zhao

    Abstract: Deep neural networks (DNN) have achieved great success in the recommender systems (RS) domain. However, to achieve remarkable performance, DNN-based recommender models often require numerous parameters, which inevitably bring redundant neurons and weights, a phenomenon referred to as over-parameterization. In this paper, we plan to exploit such redundancy phenomena to improve the performance of RS…

    Submitted 1 December, 2021; v1 submitted 31 October, 2021; originally announced November 2021.

    Comments: 9 pages, 6 figures

  45. arXiv:2109.02194  [pdf, other]

    cs.RO cs.AI

    Learning-Based Strategy Design for Robot-Assisted Reminiscence Therapy Based on a Developed Model for People with Dementia

    Authors: Fengpei Yuan, Ran Zhang, Dania Bilal, Xiaopeng Zhao

    Abstract: In this paper, robot-assisted Reminiscence Therapy (RT) is studied as a psychosocial intervention for persons with dementia (PwDs). We aim to learn a conversation strategy for the robot by reinforcement learning to stimulate the PwD to talk. Specifically, to characterize the stochastic reactions of a PwD to the robot's actions, a simulation model of a PwD is developed which features the transition pr…

    Submitted 5 September, 2021; originally announced September 2021.

    Comments: 10 pages, conference, 2 figures

  46. arXiv:2108.13265  [pdf, other]

    physics.soc-ph cs.LG

    Predicting Road Flooding Risk with Machine Learning Approaches Using Crowdsourced Reports and Fine-grained Traffic Data

    Authors: Faxi Yuan, William Mobley, Hamed Farahmand, Yuanchang Xu, Russell Blessing, Shangjia Dong, Ali Mostafavi, Samuel D. Brody

    Abstract: The objective of this study is to predict road flooding risks based on topographic, hydrologic, and temporal precipitation features using machine learning models. Predictive flood monitoring of road network flooding status plays an essential role in community hazard mitigation, preparedness, and response activities. Existing studies related to the estimation of road inundations either lack observe…

    Submitted 14 September, 2021; v1 submitted 30 August, 2021; originally announced August 2021.

    Comments: 17 pages, 7 figures

  47. arXiv:2107.08173  [pdf, other]

    cs.CL

    Continual Learning for Task-oriented Dialogue System with Iterative Network Pruning, Expanding and Masking

    Authors: Binzong Geng, Fajie Yuan, Qiancheng Xu, Ying Shen, Ruifeng Xu, Min Yang

    Abstract: The ability to learn consecutive tasks without forgetting how to perform previously trained problems is essential for developing an online dialogue system. This paper proposes an effective continual learning method for the task-oriented dialogue system with iterative network pruning, expanding and masking (TPEM), which preserves performance on previously encountered tasks while accelerating learning pro…

    Submitted 16 July, 2021; originally announced July 2021.

    Comments: Accepted by The Annual Meeting of the Association for Computational Linguistics (ACL), 2021

  48. arXiv:2107.07173  [pdf, other]

    cs.IR cs.AI

    Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search

    Authors: Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li

    Abstract: Sequential recommender systems (SRS) have become a research hotspot due to their power in modeling user dynamic interests and sequential behavioral patterns. To maximize model expressive ability, a default choice is to apply a larger and deeper network architecture, which, however, often brings high network latency when generating online recommendations. Naturally, we argue that compressing the heav…

    Submitted 10 April, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

  49. Iterative Network Pruning with Uncertainty Regularization for Lifelong Sentiment Classification

    Authors: Binzong Geng, Min Yang, Fajie Yuan, Shupeng Wang, Xiang Ao, Ruifeng Xu

    Abstract: Lifelong learning capabilities are crucial for sentiment classifiers to process continuous streams of opinionated information on the Web. However, performing lifelong learning is non-trivial for deep neural networks, as continual training on incrementally available information inevitably results in catastrophic forgetting or interference. In this paper, we propose a novel iterative network pruning…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted by the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021

  50. arXiv:2106.08370  [pdf, other]

    cs.SI physics.soc-ph

    Unraveling the Temporal Importance of Community-scale Human Activity Features for Rapid Assessment of Flood Impacts

    Authors: Faxi Yuan, Yang Yang, Qingchun Li, Ali Mostafavi

    Abstract: The objective of this research is to explore the temporal importance of community-scale human activity features for rapid assessment of flood impacts. Ultimate flood impact data, such as flood inundation maps and insurance claims, become available only weeks or months after the floods have receded. Crisis response managers, however, need near-real-time data to prioritize emergency response. This…

    Submitted 21 June, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: 26 pages and 15 figures