Showing 1–50 of 83 results for author: Yuan, Q

  1. arXiv:2406.15829  [pdf, other]

    cs.CV

    MVOC: a training-free multiple video object composition method with diffusion models

    Authors: Wei Wang, Yaosen Chen, Yuegen Liu, Qi Yuan, Shubin Yang, Yanru Zhang

    Abstract: Video composition is the core task of video editing. Although image composition based on diffusion models has been highly successful, it is not straightforward to extend the achievement to video object composition tasks, which not only exhibit corresponding interaction effects but also ensure that the objects in the composited video maintain motion and identity consistency, which is necessary to c…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.06603  [pdf, other]

    cs.LG cs.AI

    FPN-fusion: Enhanced Linear Complexity Time Series Forecasting Model

    Authors: Chu Li, Pingjia Xiao, Qiping Yuan

    Abstract: This study presents a novel time series prediction model, FPN-fusion, designed with linear computational complexity, demonstrating superior predictive performance compared to DLinear without increasing parameter count or computational demands. Our model introduces two key innovations: first, a Feature Pyramid Network (FPN) is employed to effectively capture time series data characteristics, bypassi…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: FPN,time series,fusion. arXiv admin note: text overlap with arXiv:2401.03001 by other authors

  3. arXiv:2405.04964  [pdf, other]

    cs.CV

    Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

    Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin

    Abstract: Recent progress in remote sensing image (RSI) super-resolution (SR) has exhibited remarkable performance using deep neural networks, e.g., Convolutional Neural Networks and Transformers. However, existing SR methods often suffer from either a limited receptive field or quadratic computational overhead, resulting in sub-optimal global representation and unacceptable computational costs in large-sca…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

  4. arXiv:2405.02826  [pdf, other]

    cs.CR

    Nip in the Bud: Forecasting and Interpreting Post-exploitation Attacks in Real-time through Cyber Threat Intelligence Reports

    Authors: Tiantian Zhu, Jie Ying, Tieming Chen, Chunlin Xiong, Wenrui Cheng, Qixuan Yuan, Aohan Zheng, Mingqi Lv, Yan Chen

    Abstract: Advanced Persistent Threat (APT) attacks have caused significant damage worldwide. Various Endpoint Detection and Response (EDR) systems are deployed by enterprises to fight against potential threats. However, EDR suffers from high false positives. In order not to affect normal operations, analysts need to investigate and filter detection results before taking countermeasures, in which heavy manua…

    Submitted 5 May, 2024; originally announced May 2024.

  5. arXiv:2405.02629  [pdf, other]

    cs.CR

    SPARSE: Semantic Tracking and Path Analysis for Attack Investigation in Real-time

    Authors: Jie Ying, Tiantian Zhu, Wenrui Cheng, Qixuan Yuan, Mingjun Ma, Chunlin Xiong, Tieming Chen, Mingqi Lv, Yan Chen

    Abstract: As the complexity and destructiveness of Advanced Persistent Threat (APT) increase, there is a growing tendency to identify a series of actions undertaken to achieve the attacker's target, called attack investigation. Currently, analysts construct the provenance graph to perform causality analysis on Point-Of-Interest (POI) event for capturing critical events (related to the attack). However, due…

    Submitted 4 May, 2024; originally announced May 2024.

  6. arXiv:2404.16313  [pdf, ps, other]

    cs.IT

    Further Investigations on Nonlinear Complexity of Periodic Binary Sequences

    Authors: Qin Yuan, Chunlei Li, Xiangyong Zeng, Tor Helleseth, Debiao He

    Abstract: Nonlinear complexity is an important measure for assessing the randomness of sequences. In this paper we investigate how circular shifts affect the nonlinear complexities of finite-length binary sequences and then reveal a more explicit relation between nonlinear complexities of finite-length binary sequences and their corresponding periodic sequences. Based on the relation, we propose two algorit…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2404.09624  [pdf, other]

    cs.CV

    AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception

    Authors: Yipo Huang, Xiangfei Sheng, Zhichao Yang, Quan Yuan, Zhichao Duan, Pengfei Chen, Leida Li, Weisi Lin, Guangming Shi

    Abstract: The highly abstract nature of image aesthetics perception (IAP) poses a significant challenge for current multimodal large language models (MLLMs). The lack of human-annotated multi-modality aesthetic data further exacerbates this dilemma, resulting in MLLMs falling short of aesthetics perception capabilities. To address the above challenge, we first introduce a comprehensively annotated Aesthetic M…

    Submitted 18 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  8. arXiv:2403.17853  [pdf, other]

    cs.CL cs.LG

    Using Domain Knowledge to Guide Dialog Structure Induction via Neural Probabilistic Soft Logic

    Authors: Connor Pryor, Quan Yuan, Jeremiah Liu, Mehran Kazemi, Deepak Ramachandran, Tania Bedrax-Weiss, Lise Getoor

    Abstract: Dialog Structure Induction (DSI) is the task of inferring the latent dialog structure (i.e., a set of dialog states and their temporal transitions) of a given goal-oriented dialog. It is a critical component for modern dialog system design and discourse analysis. Existing DSI approaches are often purely data-driven, deploy models that infer latent states without access to domain knowledge, underpe…

    Submitted 26 March, 2024; originally announced March 2024.

  9. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  10. arXiv:2401.08276  [pdf, other]

    cs.CV cs.CL

    AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception

    Authors: Yipo Huang, Quan Yuan, Xiangfei Sheng, Zhichao Yang, Haoning Wu, Pengfei Chen, Yuzhe Yang, Leida Li, Weisi Lin

    Abstract: With collective endeavors, multimodal large language models (MLLMs) are undergoing a flourishing development. However, their performances on image aesthetics perception remain indeterminate, which is highly desired in real-world applications. An obvious obstacle lies in the absence of a specific benchmark to evaluate the effectiveness of MLLMs on aesthetic perception. This blind groping may impede…

    Submitted 16 January, 2024; originally announced January 2024.

  11. arXiv:2401.07139  [pdf, other]

    cs.CV cs.AI eess.IV

    Deep Blind Super-Resolution for Satellite Video

    Authors: Yi Xiao, Qiangqiang Yuan, Qiang Zhang, Liangpei Zhang

    Abstract: Recent efforts have witnessed remarkable progress in Satellite Video Super-Resolution (SVSR). However, most SVSR methods usually assume the degradation is fixed and known, e.g., bicubic downsampling, which makes them vulnerable in real-world scenes with multiple and unknown degradations. To alleviate this issue, blind SR has thus become a research hotspot. Nevertheless, existing approaches are mai…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: Published in IEEE TGRS

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1-16, 2023, Art no. 5516316

  12. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  13. arXiv:2311.13622  [pdf, other]

    cs.CV eess.IV

    TDiffDe: A Truncated Diffusion Model for Remote Sensing Hyperspectral Image Denoising

    Authors: Jiang He, Yajie Li, Jie L, Qiangqiang Yuan

    Abstract: Hyperspectral images play a crucial role in precision agriculture, environmental monitoring or ecological analysis. However, due to sensor equipment and the imaging environment, the observed hyperspectral images are often inevitably corrupted by various noise. In this study, we proposed a truncated diffusion model, called TDiffDe, to recover the useful information in hyperspectral images gradually…

    Submitted 22 November, 2023; originally announced November 2023.

  14. arXiv:2310.19288  [pdf, other]

    eess.IV cs.CV

    EDiffSR: An Efficient Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution

    Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Jiang He, Xianyu Jin, Liangpei Zhang

    Abstract: Recently, convolutional networks have achieved remarkable development in remote sensing image Super-Resolution (SR) by minimizing the regression objectives, e.g., MSE loss. However, despite achieving impressive performance, these methods often suffer from poor visual quality with over-smooth issues. Generative adversarial networks have the potential to infer intricate details, but they are easy to…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Submitted to IEEE TGRS

  15. arXiv:2309.16372  [pdf, other]

    cs.CV eess.IV

    Aperture Diffraction for Compact Snapshot Spectral Imaging

    Authors: Tao Lv, Hao Ye, Quan Yuan, Zhan Shi, Yibo Wang, Shuming Wang, Xun Cao

    Abstract: We demonstrate a compact, cost-effective snapshot spectral imaging system named Aperture Diffraction Imaging Spectrometer (ADIS), which consists only of an imaging lens with an ultra-thin orthogonal aperture mask and a mosaic filter sensor, requiring no additional physical footprint compared to common RGB cameras. Then we introduce a new optical design that each point in the object space is multip…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: accepted by International Conference on Computer Vision (ICCV) 2023

  16. arXiv:2308.15299  [pdf, other]

    cs.CL

    TaskLAMA: Probing the Complex Task Understanding of Language Models

    Authors: Quan Yuan, Mehran Kazemi, Xin Xu, Isaac Noble, Vaiva Imbrasaite, Deepak Ramachandran

    Abstract: Structured Complex Task Decomposition (SCTD) is the problem of breaking down a complex real-world task (such as planning a wedding) into a directed acyclic graph over individual steps that contribute to achieving the task, with edges specifying temporal dependencies between them. SCTD is an important component of assistive planning tools, and a challenge for commonsense reasoning systems. We probe…

    Submitted 29 August, 2023; originally announced August 2023.

  17. arXiv:2307.00729  [pdf, other]

    cs.SD cs.CL eess.AS

    An End-to-End Multi-Module Audio Deepfake Generation System for ADD Challenge 2023

    Authors: Sheng Zhao, Qilong Yuan, Yibo Duan, Zhuoyue Chen

    Abstract: The task of synthetic speech generation is to generate language content from a given text, then simulating fake human voice. The key factors that determine the effect of synthetic speech generation mainly include speed of generation, accuracy of word segmentation, naturalness of synthesized speech, etc. This paper builds an end-to-end multi-module synthetic speech generation model, including speake…

    Submitted 2 July, 2023; originally announced July 2023.

  18. arXiv:2306.09245  [pdf]

    cs.CR cs.CE cs.CV

    Image encryption for Offshore wind power based on 2D-LCLM and Zhou Yi Eight Trigrams

    Authors: Lei Kou, Jinbo Wu, Fangfang Zhang, Peng Ji, Wende Ke, Junhe Wan, Hailin Liu, Yang Li, Quande Yuan

    Abstract: Offshore wind power is an important part of the new power system, due to the complex and changing situation at ocean, its normal operation and maintenance cannot be done without information such as images, therefore, it is especially important to transmit the correct image in the process of information transmission. In this paper, we propose a new encryption algorithm for offshore wind power based…

    Submitted 27 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: accepted by Int. J. of Bio-Inspired Computation

    MSC Class: 68P25 ACM Class: E.3

    Journal ref: International Journal of Bio-Inspired Computation, vol. 22, no. 1, pp. 53-64 (2023)

  19. arXiv:2306.07934  [pdf, other]

    cs.CL cs.AI cs.LG

    BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information

    Authors: Mehran Kazemi, Quan Yuan, Deepti Bhatia, Najoung Kim, Xin Xu, Vaiva Imbrasaite, Deepak Ramachandran

    Abstract: Automated reasoning with unstructured natural text is a key requirement for many potential applications of NLP and for developing robust AI systems. Recently, Language Models (LMs) have demonstrated complex reasoning capacities even without any finetuning. However, existing evaluation for automated reasoning assumes access to a consistent and coherent set of information over which models reason. W…

    Submitted 13 June, 2023; originally announced June 2023.

  20. arXiv:2305.13918  [pdf]

    cs.CV cs.RO eess.IV

    Development and Whole-Body Validation of Personalizable Female and Male Pedestrian SAFER Human Body Models

    Authors: Natalia Lindgren, Qiantailang Yuan, Bengt Pipkorn, Svein Kleiven, Xiaogai Li

    Abstract: Vulnerable road users are overrepresented in the worldwide number of road-traffic injury victims. Developing biofidelic male and female pedestrian HBMs representing a range of anthropometries is imperative to follow through with the efforts to increase road safety and propose intervention strategies. In this study, a 50th percentile male and female pedestrian of the SAFER HBM was developed via a n…

    Submitted 11 May, 2023; originally announced May 2023.

  21. Local-Global Temporal Difference Learning for Satellite Video Super-Resolution

    Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Xianyu Jin, Jiang He, Liangpei Zhang, Chia-wen Lin

    Abstract: Optical-flow-based and kernel-based approaches have been extensively explored for temporal compensation in satellite Video Super-Resolution (VSR). However, these techniques are less generalized in large-scale or complex scenarios, especially in satellite videos. In this paper, we propose to exploit the well-defined temporal difference for efficient and effective temporal compensation. To fully uti…

    Submitted 30 October, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted by IEEE TCSVT

    Journal ref: IEEE Transactions on Circuits and Systems for Video Technology, 2023

  22. arXiv:2304.02401  [pdf, other]

    cs.CR

    PrivGraph: Differentially Private Graph Data Publication by Exploiting Community Information

    Authors: Quan Yuan, Zhikun Zhang, Linkang Du, Min Chen, Peng Cheng, Mingyang Sun

    Abstract: Graph data is used in a wide range of applications, while analyzing graph data without protection is prone to privacy breach risks. To mitigate the privacy risks, we resort to the standard technique of differential privacy to publish a synthetic graph. However, existing differentially private graph synthesis approaches either introduce excessive noise by directly perturbing the adjacency matrix, o…

    Submitted 13 October, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

    Comments: The extended version of the USENIX Security '23 paper

  23. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  24. arXiv:2302.05807  [pdf, other]

    cs.LG stat.ML

    Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play

    Authors: Jeremiah Zhe Liu, Krishnamurthy Dj Dvijotham, Jihyeon Lee, Quan Yuan, Martin Strobel, Balaji Lakshminarayanan, Deepak Ramachandran

    Abstract: Standard empirical risk minimization (ERM) training can produce deep neural network (DNN) models that are accurate on average but under-perform in under-represented population subgroups, especially when there are imbalanced group distributions in the long-tailed training data. Therefore, approaches that improve the accuracy-group robustness trade-off frontier of a DNN model (i.e. improving worst-g…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted to ICLR 2023. Included additional contribution from Martin Strobel

  25. arXiv:2302.03916  [pdf, other]

    cs.LG

    QS-ADN: Quasi-Supervised Artifact Disentanglement Network for Low-Dose CT Image Denoising by Local Similarity Among Unpaired Data

    Authors: Yuhui Ruan, Qiao Yuan, Chuang Niu, Chen Li, Yudong Yao, Ge Wang, Yueyang Teng

    Abstract: Deep learning has been successfully applied to low-dose CT (LDCT) image denoising for reducing potential radiation risk. However, the widely reported supervised LDCT denoising networks require a training set of paired images, which is expensive to obtain and cannot be perfectly simulated. Unsupervised learning utilizes unpaired data and is highly desirable for LDCT denoising. As an example, an art…

    Submitted 8 February, 2023; originally announced February 2023.

  26. arXiv:2301.12230  [pdf, other]

    cs.LG cs.AI

    Continual Graph Learning: A Survey

    Authors: Qiao Yuan, Sheng-Uei Guan, Pin Ni, Tianlun Luo, Ka Lok Man, Prudence Wong, Victor Chang

    Abstract: Research on continual learning (CL) mainly focuses on data represented in the Euclidean space, while research on graph-structured data is scarce. Furthermore, most graph learning models are tailored for static graphs. However, graphs usually evolve continually in the real world. Catastrophic forgetting also emerges in graph learning models when being trained incrementally. This leads to the need t…

    Submitted 28 January, 2023; originally announced January 2023.

    Comments: 38 pages, 7 figures

  27. arXiv:2212.05891  [pdf]

    cs.IR cs.CL cs.LG

    Text Mining-Based Patent Analysis for Automated Rule Checking in AEC

    Authors: Zhe Zheng, Bo-Rui Kang, Qi-Tian Yuan, Yu-Cheng Zhou, Xin-Zheng Lu, Jia-Rui Lin

    Abstract: Automated rule checking (ARC), which is expected to promote the efficiency of the compliance checking process in the architecture, engineering, and construction (AEC) industry, is gaining increasing attention. Throwing light on the ARC application hotspots and forecasting its trends are useful to the related research and drive innovations. Therefore, this study takes the patents from the database…

    Submitted 12 December, 2022; originally announced December 2022.

  28. Quantitative Method for Security Situation of the Power Information Network Based on the Evolutionary Neural Network

    Authors: Quande Yuan, Yuzhen Pi, Lei Kou, Fangfang Zhang, Bo Ye

    Abstract: Cybersecurity is the security cornerstone of digital transformation of the power grid and construction of new power systems. The traditional network security situation quantification method only analyzes from the perspective of network performance, ignoring the impact of various power application services on the security situation, so the quantification results cannot fully reflect the power infor…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Frontiers in Energy Research

    MSC Class: 68T99 ACM Class: I.2

  29. A Random Forest and Current Fault Texture Feature-Based Method for Current Sensor Fault Diagnosis in Three-Phase PWM VSR

    Authors: Lei Kou, Xiao-dong Gong, Yi Zheng, Xiu-hui Ni, Yang Li, Quan-de Yuan, Ya-nan Dong

    Abstract: Three-phase PWM voltage-source rectifier (VSR) systems have been widely used in various energy conversion systems, where current sensors are the key component for state monitoring and system control. The current sensor faults may bring hidden danger or damage to the whole system; therefore, this paper proposed a random forest (RF) and current fault texture feature-based method for current sensor f…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: Frontiers in Energy Research

    MSC Class: 68Q04 ACM Class: I.2

  30. Data-driven design of fault diagnosis for three-phase PWM rectifier using random forests technique with transient synthetic features

    Authors: Lei Kou, Chuang Liu, Guo-wei Cai, Jia-ning Zhou, Quan-de Yuan

    Abstract: A three-phase pulse-width modulation (PWM) rectifier can usually maintain operation when open-circuit faults occur in insulated-gate bipolar transistors (IGBTs), which will lead the system to be unstable and unsafe. Aiming at this problem, based on random forests with transient synthetic features, a data-driven online fault diagnosis method is proposed to locate the open-circuit faults of IGBTs ti…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: IET Power Electronics

    MSC Class: 68T99 ACM Class: I.2

  31. arXiv:2211.00221  [pdf]

    cs.AI eess.SY

    Review on Monitoring, Operation and Maintenance of Smart Offshore Wind Farms

    Authors: Lei Kou, Yang Li, Fangfang Zhang, Xiaodong Gong, Yinghong Hu, Quande Yuan, Wende Ke

    Abstract: In recent years, with the development of wind energy, the number and scale of wind farms are developing rapidly. Since offshore wind farm has the advantages of stable wind speed, clean, renewable, non-polluting and no occupation of cultivated land, which has gradually become a new trend of wind power industry all over the world. The operation and maintenance mode of offshore wind power is developi…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: accepted by Sensors

    MSC Class: 90B25 ACM Class: I.2

    Journal ref: Sensors 2022, 22, 2822

  32. Fault diagnosis for open-circuit faults in NPC inverter based on knowledge-driven and data-driven approaches

    Authors: Lei Kou, Chuang Liu, Guo-wei Cai, Jia-ning Zhou, Quan-de Yuan, Si-miao Pang

    Abstract: In this study, the open-circuit faults diagnosis and location issue of the neutral-point-clamped (NPC) inverters are analysed. A novel fault diagnosis approach based on knowledge driven and data driven was presented for the open-circuit faults in insulated-gate bipolar transistors (IGBTs) of NPC inverter, and Concordia transform (knowledge driven) and random forests (RFs) technique (data driven) a…

    Submitted 31 October, 2022; originally announced October 2022.

    Comments: IET Power Electronics

    MSC Class: 68T05 ACM Class: I.2

  33. arXiv:2208.07059  [pdf, other]

    cs.CV

    UPST-NeRF: Universal Photorealistic Style Transfer of Neural Radiance Fields for 3D Scene

    Authors: Yaosen Chen, Qi Yuan, Zhiqiang Li, Yuegen Liu, Wei Wang, Chaoping Xie, Xuming Wen, Qien Yu

    Abstract: 3D scenes photorealistic stylization aims to generate photorealistic images from arbitrary novel views according to a given style image while ensuring consistency when rendering from different viewpoints. Some existing stylization methods with neural radiance fields can effectively predict stylized scenes by combining the features of the style image with multi-view images to train 3D scenes. Howev…

    Submitted 21 August, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2205.12183 by other authors

  34. arXiv:2203.11383  [pdf, other]

    cs.IR cs.CY cs.LG

    DIANES: A DEI Audit Toolkit for News Sources

    Authors: Xiaoxiao Shang, Zhiyuan Peng, Qiming Yuan, Sabiq Khan, Lauren Xie, Yi Fang, Subramaniam Vincent

    Abstract: Professional news media organizations have always touted the importance that they give to multiple perspectives. However, in practice the traditional approach to all-sides has favored people in the dominant culture. Hence it has come under ethical critique under the new norms of diversity, equity, and inclusion (DEI). When DEI is applied to journalism, it goes beyond conventional notions of impart…

    Submitted 28 April, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  35. arXiv:2202.03632  [pdf, other]

    cs.LG cs.AI q-bio.QM

    ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning

    Authors: Zhenkun Shi, Qianqian Yuan, Ruoyu Wang, Hoaran Li, Xiaoping Liao, Hongwu Ma

    Abstract: Enzyme Commission (EC) numbers, which associate a protein sequence with the biochemical reactions it catalyzes, are essential for the accurate understanding of enzyme functions and cellular metabolism. Many ab-initio computational approaches were proposed to predict EC numbers for given input sequences directly. However, the prediction performance (accuracy, recall, precision), usability, and effi…

    Submitted 7 February, 2022; originally announced February 2022.

    Comments: 16 pages, 14 figures

    Report number: research.0153 MSC Class: I.2.6

    Journal ref: Research. 2023:6;0153

  36. arXiv:2201.10005  [pdf, other]

    cs.CL cs.LG

    Text and Code Embeddings by Contrastive Pre-Training

    Authors: Arvind Neelakantan, Tao Xu, Raul Puri, Alec Radford, Jesse Michael Han, Jerry Tworek, Qiming Yuan, Nikolas Tezak, Jong Wook Kim, Chris Hallacy, Johannes Heidecke, Pranav Shyam, Boris Power, Tyna Eloundou Nekoul, Girish Sastry, Gretchen Krueger, David Schnurr, Felipe Petroski Such, Kenny Hsu, Madeleine Thompson, Tabarak Khan, Toki Sherbakov, Joanne Jang, Peter Welinder, Lilian Weng

    Abstract: Text embeddings are useful features in many applications such as semantic search and computing text similarity. Previous work typically trains models customized for different use cases, varying in dataset choice, training objective and model architecture. In this work, we show that contrastive pre-training on unsupervised data at scale leads to high quality vector representations of text and code.…

    Submitted 24 January, 2022; originally announced January 2022.

  37. arXiv:2112.04263  [pdf, other]

    cs.NI

    Artificial Intelligence Powered Mobile Networks: From Cognition to Decision

    Authors: Guiyang Luo, Quan Yuan, Jinglin Li, Shangguang Wang, Fangchun Yang

    Abstract: Mobile networks (MN) are anticipated to provide unprecedented opportunities to enable a new world of connected experiences and radically shift the way people interact with everything. MN are becoming more and more complex, driven by ever-increasingly complicated configuration issues and blossoming new service requirements. This complexity poses significant challenges in deployment, management, ope…

    Submitted 8 December, 2021; originally announced December 2021.

    Journal ref: IEEE Network 2021

  38. arXiv:2110.08702  [pdf, other]

    cs.CV

    SIN:Superpixel Interpolation Network

    Authors: Qing Yuan, Songfeng Lu, Yan Huang, Wuxin Sha

    Abstract: Superpixels have been widely used in computer vision tasks due to their representational and computational efficiency. Meanwhile, deep learning and end-to-end framework have made great progress in various fields including computer vision. However, existing superpixel algorithms cannot be integrated into subsequent tasks in an end-to-end way. Traditional algorithms and deep learning-based algorithm…

    Submitted 16 October, 2021; originally announced October 2021.

    Comments: 15 pages, 8 figures, to be published in PRICAI-2021

  39. arXiv:2108.07200  [pdf, other]

    eess.IV cs.CV

    Continuous-Time Spatiotemporal Calibration of a Rolling Shutter Camera-IMU System

    Authors: Jianzhu Huai, Yuan Zhuang, Qicheng Yuan, Yukai Lin

    Abstract: The rolling shutter (RS) mechanism is widely used by consumer-grade cameras, which are essential parts in smartphones and autonomous vehicles. The RS effect leads to image distortion upon relative motion between a camera and the scene. This effect needs to be considered in video stabilization, structure from motion, and vision-aided odometry, for which recent studies have improved earlier global s…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: 11 pages, 9 figures

  40. Coupling Model-Driven and Data-Driven Methods for Remote Sensing Image Restoration and Fusion

    Authors: Huanfeng Shen, Menghui Jiang, Jie Li, Chenxia Zhou, Qiangqiang Yuan, Liangpei Zhang

    Abstract: In the fields of image restoration and image fusion, model-driven methods and data-driven methods are the two representative frameworks. However, both approaches have their respective advantages and disadvantages. The model-driven methods consider the imaging mechanism, which is deterministic and theoretically reasonable; however, they cannot easily model complicated nonlinear problems. The data-d…

    Submitted 13 August, 2021; originally announced August 2021.

    Journal ref: IEEE Geoscience and Remote Sensing Magazine, vol. 10, no. 2, pp. 231-249, June 2022

  41. arXiv:2107.13848  [pdf, other]

    cs.OS

    Revisiting Swapping in User-space with Lightweight Threading

    Authors: Kan Zhong, Wenlin Cui, Youyou Lu, Quanzhang Liu, Xiaodan Yan, Qizhao Yuan, Siwei Luo, Keji Huang

    Abstract: Memory-intensive applications, such as in-memory databases, caching systems and key-value stores, are increasingly demanding larger main memory to fit their working sets. Conventional swapping can enlarge the memory capacity by paging out inactive pages to disks. However, the heavy I/O stack makes the traditional kernel-based swapping suffer from several critical performance issues. In this pap…

    Submitted 29 July, 2021; originally announced July 2021.

  42. arXiv:2107.08355  [pdf]

    eess.IV cs.CV

    Fully Polarimetric SAR and Single-Polarization SAR Image Fusion Network

    Authors: Liupeng Lin, Jie Li, Huanfeng Shen, Lingli Zhao, Qiangqiang Yuan, Xinghua Li

    Abstract: The data fusion technology aims to aggregate the characteristics of different data and obtain products with multiple data advantages. To solve the problem of reduced resolution of PolSAR images due to system limitations, we propose a fusion network for fully polarimetric synthetic aperture radar (PolSAR) images and single-polarization synthetic aperture radar (SinSAR) images to generate high-reso…

    Submitted 17 July, 2021; originally announced July 2021.

  43. arXiv:2107.03374  [pdf, other]

    cs.LG

    Evaluating Large Language Models Trained on Code

    Authors: Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, et al. (33 additional authors not shown)

    Abstract: We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J sol…

    Submitted 14 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: corrected typos, added references, added authors, added acknowledgements

  44. arXiv:2102.07911  [pdf, other]

    cs.CV

    MITNet: GAN Enhanced Magnetic Induction Tomography Based on Complex CNN

    Authors: Zuohui Chen, Qing Yuan, Xujie Song, Cheng Chen, Dan Zhang, Yun Xiang, Ruigang Liu, Qi Xuan

    Abstract: Magnetic induction tomography (MIT) is an efficient solution for long-term brain disease monitoring, which focuses on reconstructing bio-impedance distribution inside the human brain using non-intrusive electromagnetic fields. However, high-quality brain image reconstruction remains challenging since reconstructing images from the measured weak signals is a highly non-linear and ill-conditioned pr…

    Submitted 15 February, 2021; originally announced February 2021.

  45. arXiv:2101.04882  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Asymmetric self-play for automatic goal discovery in robotic manipulation

    Authors: OpenAI OpenAI, Matthias Plappert, Raul Sampedro, Tao Xu, Ilge Akkaya, Vineet Kosaraju, Peter Welinder, Ruben D'Sa, Arthur Petron, Henrique P. d. O. Pinto, Alex Paino, Hyeonwoo Noh, Lilian Weng, Qiming Yuan, Casey Chu, Wojciech Zaremba

    Abstract: We train a single, goal-conditioned policy that can solve many robotic manipulation tasks, including tasks with previously unseen goals and objects. We rely on asymmetric self-play for goal discovery, where two agents, Alice and Bob, play a game. Alice is asked to propose challenging goals and Bob aims to solve them. We show that this method can discover highly diverse and complex goals without an…

    Submitted 13 January, 2021; originally announced January 2021.

    Comments: Videos are shown at https://robotics-self-play.github.io

  46. Modeling and Understanding Ethereum Transaction Records via a Complex Network Approach

    Authors: Dan Lin, Jiajing Wu, Qi Yuan, Zibin Zheng

    Abstract: As the largest public blockchain-based platform supporting smart contracts, Ethereum has accumulated a large number of user transaction records since its debut in 2014. Analysis of Ethereum transaction records, however, remains relatively unexplored. Modeling the transaction records as a static simple graph, existing methods are unable to accurately characterize the temporal and multiple…

    Submitted 31 December, 2020; originally announced December 2020.

    Comments: 5 pages, 6 figures. arXiv admin note: substantial text overlap with arXiv:1905.08038

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 11, pp. 2737 - 2741, November 2020

  47. arXiv:2012.13169  [pdf, other]

    cs.LG

    SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II

    Authors: Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan

    Abstract: AlphaStar, the AI that reaches GrandMaster level in StarCraft II, is a remarkable milestone demonstrating what deep reinforcement learning can achieve in complex Real-Time Strategy (RTS) games. However, the complexities of the game, algorithms and systems, and especially the tremendous amount of computation needed are big obstacles for the community to conduct further research in this direction. W…

    Submitted 9 June, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: ICML 2021 camera ready

  48. arXiv:2011.09701  [pdf]

    eess.IV cs.CV

    Spectral Response Function Guided Deep Optimization-driven Network for Spectral Super-resolution

    Authors: Jiang He, Jie Li, Qiangqiang Yuan, Huanfeng Shen, Liangpei Zhang

    Abstract: Hyperspectral images are crucial for many research works. Spectral super-resolution (SSR) is a method used to obtain high spatial resolution (HR) hyperspectral images from HR multispectral images. Traditional SSR methods include model-driven algorithms and deep learning. By unfolding a variational method, this paper proposes an optimization-driven convolutional neural network (CNN) with a deep spa…

    Submitted 8 December, 2020; v1 submitted 19 November, 2020; originally announced November 2020.

  49. arXiv:2011.08968  [pdf, other]

    cs.LG stat.ML

    Contrastive Weight Regularization for Large Minibatch SGD

    Authors: Qiwei Yuan, Weizhe Hua, Yi Zhou, Cunxi Yu

    Abstract: The minibatch stochastic gradient descent method (SGD) is widely applied in deep learning due to its efficiency and scalability that enable training deep networks with a large volume of data. Particularly in the distributed setting, SGD is usually applied with large batch size. However, as opposed to small-batch SGD, neural network models trained with large-batch SGD can hardly generalize well, i.…

    Submitted 17 November, 2020; originally announced November 2020.

  50. arXiv:2010.05525  [pdf, ps, other]

    cs.IR

    Large Scale Product Graph Construction for Recommendation in E-commerce

    Authors: Xiaoyong Yang, Yadong Zhu, Yi Zhang, Xiaobo Wang, Quan Yuan

    Abstract: Building a recommendation system that serves billions of users on a daily basis is a challenging problem, as the system needs to make an astronomical number of predictions per second based on real-time user behaviors with O(1) time complexity. Such large-scale recommendation systems usually rely heavily on a pre-built index of products to speed up the recommendation service so that online user wai…

    Submitted 12 October, 2020; originally announced October 2020.