Showing 1–50 of 2,424 results for author: Lin, J

  1. arXiv:2407.12963  [pdf, other]

    eess.IV

    Edge Projection-Based Adaptive View Selection for Cone-Beam CT

    Authors: Jingsong Lin, Singanallur Venkatakrishnan, Gregery Buzzard, Amir Koushyar Ziabari, Charles Bouman

    Abstract: Industrial cone-beam X-ray computed tomography (CT) scans of additively manufactured components produce a 3D reconstruction from projection measurements acquired at multiple predetermined rotation angles of the component about a single axis. Typically, a large number of projections are required to achieve a high-quality reconstruction, a process that can span several hours or days depending on the…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Submitted to 2024 Asilomar Conference on Signals, Systems, and Computers

  2. arXiv:2407.12579  [pdf, other]

    cs.CV cs.AI

    The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation

    Authors: Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin, Hongxia Xie, Terence Lin, Yi-Ning Huang, Hong-Han Shuai, Wen-Huang Cheng

    Abstract: In spite of recent advancements in text-to-image generation, limitations persist in handling complex and imaginative prompts due to the restricted diversity and complexity of training data. This work explores how diffusion models can generate images from prompts requiring artistic creativity or specialized knowledge. We introduce the Realistic-Fantasy Benchmark (RFBench), a novel evaluation framew…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  3. arXiv:2407.11578  [pdf, other]

    cs.CV eess.IV

    UP-Diff: Latent Diffusion Model for Remote Sensing Urban Prediction

    Authors: Zeyu Wang, Zecheng Hao, Jingyu Lin, Yuchao Feng, Yufei Guo

    Abstract: This study introduces a novel Remote Sensing (RS) Urban Prediction (UP) task focused on future urban planning, which aims to forecast urban layouts by utilizing information from existing urban layouts and planned change maps. To address the proposed RS UP task, we propose UP-Diff, which leverages a Latent Diffusion Model (LDM) to capture position-aware embeddings of pre-change urban layouts and pla…

    Submitted 16 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures

  4. arXiv:2407.11477  [pdf, other]

    cs.LG cs.AI

    XTraffic: A Dataset Where Traffic Meets Incidents with Explainability and More

    Authors: Xiaochuan Gou, Ziyue Li, Tian Lan, Junpeng Lin, Zhishuai Li, Bingyu Zhao, Chen Zhang, Di Wang, Xiangliang Zhang

    Abstract: Long-separated research has been conducted on two highly correlated tracks: traffic and incidents. Traffic track witnesses complicating deep learning models, e.g., to push the prediction a few percent more accurate, and the incident track only studies the incidents alone, e.g., to infer the incident risk. We, for the first time, spatiotemporally aligned the two tracks in a large-scale region (16,9…

    Submitted 16 July, 2024; originally announced July 2024.

  5. arXiv:2407.11007  [pdf, other]

    cs.CL cs.AI

    Panacea: A foundation model for clinical trial search, summarization, design, and recruitment

    Authors: Jiacheng Lin, Hanwen Xu, Zifeng Wang, Sheng Wang, Jimeng Sun

    Abstract: Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challen…

    Submitted 25 June, 2024; originally announced July 2024.

  6. arXiv:2407.10916  [pdf, other]

    cs.LG cs.SI

    When Heterophily Meets Heterogeneity: New Graph Benchmarks and Effective Methods

    Authors: Junhong Lin, Xiaojie Guo, Shuaicheng Zhang, Dawei Zhou, Yada Zhu, Julian Shun

    Abstract: Many real-world graphs frequently present challenges for graph learning due to the presence of both heterophily and heterogeneity. However, existing benchmarks for graph learning often focus on heterogeneous graphs with homophily or homogeneous graphs with heterophily, leaving a gap in understanding how methods perform on graphs that are both heterogeneous and heterophilic. To bridge this gap, we…

    Submitted 15 July, 2024; originally announced July 2024.

  7. arXiv:2407.10759  [pdf, other]

    eess.AS cs.CL cs.LG

    Qwen2-Audio Technical Report

    Authors: Yunfei Chu, Jin Xu, Qian Yang, Haojie Wei, Xipin Wei, Zhifang Guo, Yichong Leng, Yuanjun Lv, Jinzheng He, Junyang Lin, Chang Zhou, Jingren Zhou

    Abstract: We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. In contrast to complex hierarchical tags, we have simplified the pre-training process by utilizing natural language prompts for different data an…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: https://github.com/QwenLM/Qwen2-Audio. Checkpoints, code, and scripts will be open-sourced soon

  8. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  9. arXiv:2407.10199  [pdf, other]

    nucl-ex nucl-th

    Charge radii of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O determined from their charge-changing cross-sections and the mirror-difference charge radii

    Authors: J. W. Zhao, B. -H. Sun, I. Tanihata, J. Y. Xu, K. Y. Zhang, A. Prochazka, L. H. Zhu, S. Terashima, J. Meng, L. C. He, C. Y. Liu, G. S. Li, C. G. Lu, W. J. Lin, W. P. Lin, Z. Liu, P. P Ren, Z. Y. Sun, F. Wang, J. Wang, M. Wang, S. T. Wang, X. L. Wei, X. D. Xu, J. C. Zhang, et al. (2 additional authors not shown)

    Abstract: Charge-changing cross-sections of $^{11-16}$C, $^{13-17}$N and $^{15-18}$O on a carbon target have been determined at energies around 300 MeV/nucleon. A nucleon separation energy dependent correction factor has been introduced to the Glauber model calculation for extracting the nuclear charge radii from the experimental CCCSs. The charge radii of $^{11}$C, $^{13,16}$N and $^{15}$O thus were determ…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 3 figures, submitted to Physics Letters B

  10. arXiv:2407.09652  [pdf, other]

    cs.CL

    How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs

    Authors: Andrea W Wen-Yi, Unso Eun Seo Jo, Lu Jia Lin, David Mimno

    Abstract: Contemporary language models are increasingly multilingual, but Chinese LLM developers must navigate complex political and business considerations of language diversity. Language policy in China aims at influencing the public discourse and governing a multi-ethnic society, and has gradually transitioned from a pluralist to a more assimilationist approach since 1949. We explore the impact of these…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Wen-Yi and Jo contributed equally to this work

  11. ecVoice: Audio Text Extraction and Optimization of Video Based on Idioms Similarity Replacement

    Authors: Jinwei Lin

    Abstract: The Text Extraction of the Audio from the Video plays an important role in multimedia editing and processing. As a popular open source toolkit, Whisper performs fast in human voice recognition. However, the recognition performance is dependent on the computing resource, which makes the low computing memory running Whisper become difficult. Our paper presents an available solution to extract the hu…

    Submitted 20 May, 2024; originally announced July 2024.

    Comments: APSIPA ASC 2023

  12. arXiv:2407.09484  [pdf]

    cs.HC cs.CY

    GPTutor: Great Personalized Tutor with Large Language Models for Personalized Learning Content Generation

    Authors: Eason Chen, Jia-En Lee, Jionghao Lin, Kenneth Koedinger

    Abstract: We developed GPTutor, a pioneering web application designed to revolutionize personalized learning by leveraging the capabilities of Generative AI at scale. GPTutor adapts educational content and practice exercises to align with individual students' interests and career goals, enhancing their engagement and understanding of critical academic concepts. The system uses a serverless architecture to d…

    Submitted 16 May, 2024; originally announced July 2024.

  13. Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR

    Authors: Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, Jimmy Lin

    Abstract: The zero-shot effectiveness of neural retrieval models is often evaluated on the BEIR benchmark -- a combination of different IR evaluation datasets. Interestingly, previous studies found that particularly on the BEIR subset Touché 2020, an argument retrieval task, neural retrieval models are considerably less effective than BM25. Still, so far, no further investigation has been conducted on what…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: SIGIR 2024 (Resource & Reproducibility Track)

  14. Topological Transitions in a Kerr Nonlinear Oscillator

    Authors: Juan Lin, Shou-Bang Yang, Fan Wu, Zhen-Biao Yang

    Abstract: A Kerr nonlinear oscillator (KNO) supports a pair of steady eigenstates, coherent states with opposite phases, that are good for the encoding of continuous variable qubit basis states. Arbitrary control of the KNO confined within the steady state subspace allows extraction of the Berry curvature through the linear response of the physical observable to the quench velocity of the system, providing…

    Submitted 15 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: 13 pages, 4 figures

    Journal ref: Intelligent Computing (A Science Partner Journal), 2024

  15. arXiv:2407.07397  [pdf, other]

    cs.SD eess.AS

    SimuSOE: A Simulated Snoring Dataset for Obstructive Sleep Apnea-Hypopnea Syndrome Evaluation during Wakefulness

    Authors: Jie Lin, Xiuping Yang, Li Xiao, Xinhong Li, Weiyan Yi, Yuhong Yang, Weiping Tu, Xiong Chen

    Abstract: Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS) is a prevalent chronic breathing disorder caused by upper airway obstruction. Previous studies advanced OSAHS evaluation through machine learning-based systems trained on sleep snoring or speech signal datasets. However, constructing datasets for training a precise and rapid OSAHS evaluation system poses a challenge, since 1) it is time-consuming t…

    Submitted 10 July, 2024; originally announced July 2024.

  16. arXiv:2407.06886  [pdf, other]

    cs.CV cs.AI cs.LG cs.MA cs.RO

    Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI

    Authors: Yang Liu, Weixing Chen, Yongjie Bai, Jingzhou Luo, Xinshuai Song, Kaixuan Jiang, Zhida Li, Ganlong Zhao, Junyi Lin, Guanbin Li, Wen Gao, Liang Lin

    Abstract: Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilit…

    Submitted 18 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: The first comprehensive review of Embodied AI in the era of MLMs, 37 pages. We also provide the paper list for Embodied AI: https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List

  17. arXiv:2407.05652  [pdf, other]

    cs.SE

    StmtTree: An Easy-to-Use yet Versatile Fortran Transformation Toolkit

    Authors: Jingbo Lin, Yi Yu, Zhang Yang, Yafan Zhao

    Abstract: The Fortran programming language continues to dominate the scientific computing community, with many production codes written in the outdated Fortran-77 dialect, yet with many non-standard extensions such as Cray pointers. This creates significant maintenance burden within the community, with tremendous efforts devoted to modernization. However, despite the modern age of advanced compiler framework…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 10 pages, 2 tables, 1 figure, submitted to ICSME 2024

  18. arXiv:2407.05117  [pdf, ps, other]

    hep-ex

    Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

    Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (349 additional authors not shown)

    Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures

    Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

  19. arXiv:2407.04960  [pdf, other]

    cs.IR

    MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models

    Authors: Yunjia Xi, Weiwen Liu, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: Conversational recommender systems (CRSs) aim to capture user preferences and provide personalized recommendations through multi-round natural language dialogues. However, most existing CRS models mainly focus on dialogue comprehension and preferences mining from the current dialogue session, overlooking user preferences in historical dialogue sessions. The preferences embedded in the user's histo…

    Submitted 6 July, 2024; originally announced July 2024.

  20. arXiv:2407.04925  [pdf, other]

    cs.IR cs.AI cs.HC

    RAMO: Retrieval-Augmented Generation for Enhancing MOOCs Recommendations

    Authors: Jiarui Rao, Jionghao Lin

    Abstract: Massive Open Online Courses (MOOCs) have significantly enhanced educational accessibility by offering a wide variety of courses and breaking down traditional barriers related to geography, finance, and time. However, students often face difficulties navigating the vast selection of courses, especially when exploring new fields of study. Driven by this challenge, researchers have been exploring cou…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 7 pages, this paper underwent a rigorous review process and was officially accepted on May 31, 2024, for presentation at the Educational Data Mining 2024 Workshop: Leveraging Large Language Models for Next Generation Educational Technologies

  21. arXiv:2407.03535  [pdf, other]

    cs.CV

    BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement

    Authors: Ruirui Lin, Nantheera Anantrasirichai, Guoxi Huang, Joanne Lin, Qi Sun, Alexandra Malyugina, David R Bull

    Abstract: Low-light videos often exhibit spatiotemporal incoherent noise, compromising visibility and performance in computer vision applications. One significant challenge in enhancing such content using deep learning is the scarcity of training data. This paper introduces a novel low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions, inco…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.01970

  22. arXiv:2407.02826  [pdf, other]

    eess.AS

    SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech

    Authors: Jingru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li

    Abstract: It was shown that pre-trained models with self-supervised learning (SSL) techniques are effective in various downstream speech tasks. However, most such models are trained on single-speaker speech data, limiting their effectiveness in mixture speech. This motivates us to explore pre-training on mixture speech. This work presents SA-WavLM, a novel pre-trained model for mixture speech. Specifically,…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: InterSpeech 2024

  23. arXiv:2407.02818  [pdf, other]

    cs.SE cs.ET cs.PL

    WizardMerge -- Save Us From Merging Without Any Clues

    Authors: Qingyu Zhang, Junzhe Li, Jiayi Lin, Jie Ding, Lanteng Lin, Chenxiong Qian

    Abstract: Modern software development necessitates efficient version-oriented collaboration among developers. While Git is the most popular version control system, it generates unsatisfactory version merging results due to textual-based workflow, leading to potentially unexpected results in the merged version of the project. Although numerous merging tools have been proposed for improving merge results, dev…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 22 pages

    ACM Class: D.2; D.3

  24. arXiv:2407.02068  [pdf, other]

    cs.CV

    LPViT: Low-Power Semi-structured Pruning for Vision Transformers

    Authors: Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin

    Abstract: Vision transformers have emerged as a promising alternative to convolutional neural networks for various image analysis tasks, offering comparable or superior performance. However, one significant drawback of ViTs is their resource-intensive nature, leading to increased memory footprint, computation complexity, and power consumption. To democratize this high-performance technology and make it more…

    Submitted 12 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  25. arXiv:2407.01915  [pdf, other]

    astro-ph.SR physics.plasm-ph physics.space-ph

    Unraveling the Trigger Mechanism of Explosive Reconnection in Partially Ionized Solar Plasma

    Authors: Abdullah Zafar, Lei Ni, Jun Lin, Ahmad Ali

    Abstract: Plasmoid instability is usually accounted for the onset of fast reconnection events observed in astrophysical plasmas. However, the measured reconnection rate from observations can be one order of magnitude higher than that derived from MHD simulations. In this study, we present the results of magnetic reconnection in the partially ionized low solar atmosphere based on 2.5D magnetohydrodynamics (M…

    Submitted 1 July, 2024; originally announced July 2024.

  26. arXiv:2407.01536  [pdf, other]

    eess.SY

    Beyond Profit: A Multi-Objective Framework for Electric Vehicle Charging Station Operations

    Authors: Shuoyao Wang, Jiawei Lin

    Abstract: This paper explores the pricing and scheduling strategies of the electric vehicle charging stations in response to the rising demand for cleaner transportation. Most of the existing methods focus on maximizing the energy efficiency or the charging station profit, however, the reputation of EVs is also a key factor for the long-term charging station operations. To address these gaps, we propose a n…

    Submitted 12 March, 2024; originally announced July 2024.

    Comments: Accepted by VTC24-Spring

  27. arXiv:2407.01245  [pdf, other]

    cs.AI cs.CY

    SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model

    Authors: Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu

    Abstract: Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question, which is a crucial task in intelligent tutoring systems (ITS). In educational KT scenarios, transductive ID-based methods often face severe data sparsity and cold start problems, where interactions between individual students and questions are sparse, and new questions and concepts consistently a…

    Submitted 1 July, 2024; originally announced July 2024.

  28. arXiv:2407.00965  [pdf, other]

    hep-ex

    Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

    Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (382 additional authors not shown)

    Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, diga…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 12 pages, 3 figures

    Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16

  29. arXiv:2407.00421  [pdf]

    physics.optics

    Multi-wavelength switchable single-frequency hyper Raman microlasers

    Authors: Chuntao Li, Ni Yao, Jintian Lin, Renhong Gao, Jianglin Guan, Guanghui Zhao, Minghui Li, Min Wang, Lingling Qiao, Ya Cheng

    Abstract: Multi-wavelength switchable single-frequency microlasers in a broad spectral range are highly desirable for integrated photonic applications due to their dynamic switching functionality, narrow linewidth, and high side-mode-suppression-ratio (SMSR). Here, a strategy based on highly efficient successive excitation of different stimulated multi-photon hyper-Raman scattering (SMPHRS) processes is pro…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 17 pages, 5 figures, and 1 table

  30. arXiv:2407.00124  [pdf, other]

    physics.ao-ph

    Stable Machine-Learning Parameterization of Subgrid Processes with Real Geography and Full-physics Emulation

    Authors: Zeyuan Hu, Akshay Subramaniam, Zhiming Kuang, Jerry Lin, Sungduk Yu, Walter M. Hannah, Noah D. Brenowitz, Josh Romero, Michael S. Pritchard

    Abstract: Modern climate projections often suffer from inadequate spatial and temporal resolution due to computational limitations, resulting in inaccurate representations of sub-resolution processes. A promising technique to address this is the Multiscale Modeling Framework (MMF), which embeds a small-domain, kilometer-resolution cloud-resolving model within each atmospheric column of a host climate model…

    Submitted 16 July, 2024; v1 submitted 27 June, 2024; originally announced July 2024.

    Comments: 28 pages, 6 figures in the main text, 5 figures in appendix. This version is a minor editorial update from the initial version

  31. arXiv:2407.00118  [pdf, other]

    cs.LG cs.AI

    From Efficient Multimodal Models to World Models: A Survey

    Authors: Xinji Mai, Zeng Tao, Junxiong Lin, Haoran Wang, Yang Chang, Yanlan Kang, Yan Wang, Wenqiang Zhang

    Abstract: Multimodal Large Models (MLMs) are becoming a significant research focus, combining powerful large language models with multimodal learning to perform complex tasks across different data modalities. This review explores the latest developments and challenges in MLMs, emphasizing their potential in achieving artificial general intelligence and as a pathway to world models. We provide an overview of…

    Submitted 27 June, 2024; originally announced July 2024.

  32. arXiv:2407.00025  [pdf, other]

    cs.DC

    Anywhere: A Web Crawler Automation Management Interface

    Authors: Jinwei Lin

    Abstract: Web crawling projects or design is significant in the current information age. Using the web spider or crawler can automatically search and collect a huge amount of internet information. As one of the most popular web crawler frameworks, Scrapy is robust in abundant functions but weak in easy operation. In this paper, we provide a framework Anywhere, for optimising the usage feeling and improving…

    Submitted 10 May, 2024; originally announced July 2024.

    Comments: 8 pages

  33. arXiv:2406.19647  [pdf, other]

    cs.IR

    Doc2Token: Bridging Vocabulary Gap by Predicting Missing Tokens for E-commerce Search

    Authors: Kaihao Li, Juexin Lin, Tony Lee

    Abstract: Addressing the "vocabulary mismatch" issue in information retrieval is a central challenge for e-commerce search engines, because product pages often miss important keywords that customers search for. Doc2Query[1] is a popular document-expansion technique that predicts search queries for a document and includes the predicted queries with the document for retrieval. However, this approach can be in…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 9 pages, 1 figure, SIGIR 2024 Workshop on eCommerce

    ACM Class: H.3.3

  34. arXiv:2406.19394  [pdf, other]

    cs.CV

    HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection

    Authors: Liujuan Cao, Jianghang Lin, Zebo Hong, Yunhang Shen, Shaohui Lin, Chao Chen, Rongrong Ji

    Abstract: Most WSOD methods rely on traditional object proposals to generate candidate regions and are confronted with unstable training, which easily gets stuck in a poor local optimum. In this paper, we introduce a unified, high-capacity weakly supervised object detection (WSOD) network called HUWSOD, which utilizes a comprehensive self-training framework without needing external modules or additional sup…

    Submitted 27 June, 2024; originally announced June 2024.

  35. arXiv:2406.18825  [pdf, other]

    cs.IR

    ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation

    Authors: Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Large language models have been flourishing in the natural language processing (NLP) domain, and their potential for recommendation has been paid much attention to. Despite the intelligence shown by the recommendation-oriented finetuned models, LLMs struggle to fully understand the user behavior patterns due to their innate weakness in interpreting numerical features and the overhead for long cont…

    Submitted 26 June, 2024; originally announced June 2024.

  36. arXiv:2406.18762  [pdf, other]

    cs.CL

    Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism

    Authors: Shi Zong, Jimmy Lin

    Abstract: There have been a huge number of benchmarks proposed to evaluate how large language models (LLMs) behave for logic inference tasks. However, it remains an open question how to properly evaluate this ability. In this paper, we provide a systematic overview of prior works on the logical reasoning ability of LLMs for analyzing categorical syllogisms. We first investigate all the possible variations f…

    Submitted 26 June, 2024; originally announced June 2024.

  37. arXiv:2406.18362  [pdf, other]

    quant-ph

    Non-Markovian Quantum Exceptional Points

    Authors: Jhen-Dong Lin, Po-Chen Kuo, Neill Lambert, Adam Miranowicz, Franco Nori, Yueh-Nan Chen

    Abstract: Exceptional points (EPs) are singularities in the spectra of non-Hermitian operators, where eigenvalues and eigenvectors coalesce. Recently, open quantum systems have been increasingly explored as EP testbeds due to their natural non-Hermitian nature. However, existing works mostly focus on the Markovian limit, leaving a gap in understanding EPs in the non-Markovian regime. In this work, we addres…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 10+5 pages, 2 figures

  38. arXiv:2406.18197  [pdf, other]

    cs.CV

    Human-free Prompted Based Anomaly Detection: prompt optimization with Meta-guiding prompt scheme

    Authors: Pi-Wei Chen, Jerry Chun-Wei Lin, Jia Ji, Feng-Hao Yeh, Chao-Chun Chen

    Abstract: Pre-trained vision-language models (VLMs) are highly adaptable to various downstream tasks through few-shot learning, making prompt-based anomaly detection a promising approach. Traditional methods depend on human-crafted prompts that require prior knowledge of specific anomaly types. Our goal is to develop a human-free prompt-based anomaly detection framework that optimally learns prompts through…

    Submitted 26 June, 2024; originally announced June 2024.

  39. arXiv:2406.18019  [pdf, other]

    cs.RO

    Continuous Execution of High-Level Collaborative Tasks for Heterogeneous Robot Teams

    Authors: Amy Fang, Tenny Yin, Jiawei Lin, Hadas Kress-Gazit

    Abstract: We propose a control synthesis framework for a heterogeneous multi-robot system to satisfy collaborative tasks, where actions may take varying duration of time to complete. We encode tasks using the discrete logic LTL^ψ, which uses the concept of bindings to interleave robot actions and express information about relationship between specific task requirements and robot assignments. We present a sy…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Under review in IEEE Transactions on Robotics

  40. arXiv:2406.17413  [pdf, other]

    cs.CV

    Depth-Guided Semi-Supervised Instance Segmentation

    Authors: Xin Chen, Jie Hu, Xiawu Zheng, Jianghang Lin, Liujuan Cao, Rongrong Ji

    Abstract: Semi-Supervised Instance Segmentation (SSIS) aims to leverage an amount of unlabeled data during training. Previous frameworks primarily utilized the RGB information of unlabeled images to generate pseudo-labels. However, such a mechanism often introduces unstable noise, as a single instance can display multiple RGB values. To overcome this limitation, we introduce a Depth-Guided (DG) SSIS framewo…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, 4 tables

  41. arXiv:2406.16828  [pdf, other]

    cs.IR cs.AI cs.CL

    Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track

    Authors: Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell, Jimmy Lin

    Abstract: Did you try out the new Bing Search? Or maybe you fiddled around with Google AI Overviews? These might sound familiar because the modern-day search stack has recently evolved to include retrieval-augmented generation (RAG) systems. They allow searching and incorporating real-time data into large language models (LLMs) to provide a well-informed, attributed, concise summary in contrast to the tradi…

    Submitted 24 June, 2024; originally announced June 2024.

  42. arXiv:2406.16573  [pdf, other]

    q-fin.CP

    An Improved Algorithm to Identify More Arbitrage Opportunities on Decentralized Exchanges

    Authors: Yu Zhang, Tao Yan, Jianhong Lin, Benjamin Kraner, Claudio Tessone

    Abstract: In decentralized exchanges (DEXs), the arbitrage paths exist abundantly in the form of both arbitrage loops (e.g. the arbitrage path starts from token A and back to token A again in the end, A, B,..., A) and non-loops (e.g. the arbitrage path starts from token A and stops at a different token N, A, B,..., N). The Moore-Bellman-Ford algorithm, often coupled with the "walk to the root" technique, i…

    Submitted 24 June, 2024; originally announced June 2024.

  43. arXiv:2406.16520  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci cond-mat.supr-con

    Gigantic-oxidative atomically layered epitaxy for designed complex oxides

    Authors: Guangdi Zhou, Haoliang Huang, Fengzhe Wang, Heng Wang, Qishuo Yang, Zihao Nie, Wei Lv, Cui Ding, Yueying Li, Danfeng Li, Yujie Sun, Junhao Lin, Guang-Ming Zhang, Qi-Kun Xue, Zhuoyu Chen

    Abstract: In designing material functionality within the intricate realm of transition metal oxides, lattice structure and d-orbital occupancy are two principal determinants of the correlated physical properties, such as superconductivity. However, the modulation of these two factors is inherently limited by the need to balance thermodynamic stability, kinetic mobility, and synthesis precision, particularly…

    Submitted 24 June, 2024; originally announced June 2024.

  44. arXiv:2406.16473  [pdf, other]

    cs.CV cs.AI

    Seeking Certainty In Uncertainty: Dual-Stage Unified Framework Solving Uncertainty in Dynamic Facial Expression Recognition

    Authors: Haoran Wang, Xinji Mai, Zeng Tao, Xuan Tong, Junxiong Lin, Yan Wang, Jiawen Yu, Boyang Wang, Shaoqi Yan, Qing Zhao, Ziheng Zhou, Shuyong Gao, Wenqiang Zhang

    Abstract: The contemporary state-of-the-art of Dynamic Facial Expression Recognition (DFER) technology facilitates remarkable progress by deriving emotional mappings of facial expressions from video content, underpinned by training on voluminous datasets. Yet, the DFER datasets encompass a substantial volume of noise data. Noise arises from low-quality captures that defy logical labeling, and instances that…

    Submitted 24 June, 2024; originally announced June 2024.

  45. arXiv:2406.16459  [pdf, other]

    cs.CV

    Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution

    Authors: Junxiong Lin, Zeng Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, Shaoqi Yan, Shuyong Gao, Wenqiang Zhang

    Abstract: The problem of blind image super-resolution aims to recover high-resolution (HR) images from low-resolution (LR) images with unknown degradation modes. Most existing methods model the image degradation process using blur kernels. However, this explicit modeling approach struggles to cover the complex and varied degradation processes encountered in the real world, such as high-order combinations of…

    Submitted 24 June, 2024; originally announced June 2024.

  46. arXiv:2406.15396  [pdf, other]

    cs.CV cs.AI cs.LG

    Feature Purified Transformer With Cross-level Feature Guiding Decoder For Multi-class OOD and Anomaly Deteciton

    Authors: Jerry Chun-Wei Lin, Pi-Wei Chen, Chao-Chun Chen

    Abstract: Reconstruction networks are prevalently used in unsupervised anomaly and Out-of-Distribution (OOD) detection due to their independence from labeled anomaly data. However, in multi-class datasets, the effectiveness of anomaly detection is often compromised by the models' generalized reconstruction capabilities, which allow anomalies to blend within the expanded boundaries of normality resulting fro…

    Submitted 30 April, 2024; originally announced June 2024.

    Comments: 12 pages

  47. arXiv:2406.14869  [pdf, other]

    eess.SP

    Cost-Effective RF Fingerprinting Based on Hybrid CVNN-RF Classifier with Automated Multi-Dimensional Early-Exit Strategy

    Authors: Jiayan Gan, Zhixing Du, Qiang Li, Huaizong Shao, Jingran Lin, Ye Pan, Zhongyi Wen, Shafei Wang

    Abstract: While the Internet of Things (IoT) technology is booming and offers huge opportunities for information exchange, it also faces unprecedented security challenges. As an important complement to the physical layer security technologies for IoT, radio frequency fingerprinting (RFF) is of great interest due to its difficulty in counterfeiting. Recently, many machine learning (ML)-based RFF algorithms h…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  48. arXiv:2406.14548  [pdf, other]

    cs.LG cs.CV

    Consistency Models Made Easy

    Authors: Zhengyang Geng, Ashwini Pokle, William Luo, Justin Lin, J. Zico Kolter

    Abstract: Consistency models (CMs) are an emerging class of generative models that offer faster sampling than traditional diffusion models. CMs enforce that all points along a sampling trajectory are mapped to the same initial point. But this target leads to resource-intensive training: for example, as of 2024, training a SoTA CM on CIFAR-10 takes one week on 8 GPUs. In this work, we propose an alternative…

    Submitted 20 June, 2024; originally announced June 2024.

  49. arXiv:2406.13919  [pdf, other]

    cs.AI

    SPL: A Socratic Playground for Learning Powered by Large Language Model

    Authors: Liang Zhang, Jionghao Lin, Ziyi Kuang, Sheng Xu, Mohammed Yeasin, Xiangen Hu

    Abstract: Dialogue-based Intelligent Tutoring Systems (ITSs) have significantly advanced adaptive and personalized learning by automating sophisticated human tutoring strategies within interactive dialogues. However, replicating the nuanced patterns of expert human communication remains a challenge in Natural Language Processing (NLP). Recent advancements in NLP, particularly Large Language Models (LLMs) su…

    Submitted 20 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  50. arXiv:2406.12874  [pdf, other]

    physics.ins-det hep-ex

    The Design, Implementation, and Performance of the LZ Calibration Systems

    Authors: J. Aalbers, D. S. Akerib, A. K. Al Musalhi, F. Alder, C. S. Amarasinghe, A. Ames, T. J. Anderson, N. Angelides, H. M. Araújo, J. E. Armstrong, M. Arthurs, A. Baker, S. Balashov, J. Bang, E. E. Barillier, J. W. Bargemann, K. Beattie, T. Benson, A. Bhatti, A. Biekert, T. P. Biesiadzinski, H. J. Birch, E. Bishop, G. M. Blockinger, B. Boxer, et al. (179 additional authors not shown)

    Abstract: LUX-ZEPLIN (LZ) is a tonne-scale experiment searching for direct dark matter interactions and other rare events. It is located at the Sanford Underground Research Facility (SURF) in Lead, South Dakota, USA. The core of the LZ detector is a dual-phase xenon time projection chamber (TPC), designed with the primary goal of detecting Weakly Interacting Massive Particles (WIMPs) via their induced low e…

    Submitted 20 June, 2024; v1 submitted 2 May, 2024; originally announced June 2024.