Showing 1–50 of 73 results for author: Long, Z

  1. Sign Language Recognition Based On Facial Expression and Hand Skeleton

    Authors: Zhiyu Long, Xingyou Liu, Jiaqi Qiao, Zhi Li

    Abstract: Sign language is a visual language used by the deaf and dumb community to communicate. However, for most recognition methods based on monocular cameras, the recognition accuracy is low and the robustness is poor. Even if the effect is good on some data, it may perform poorly in other data with different interference due to the inability to extract effective features. To solve these problems, we pr… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 2023 38th Youth Academic Annual Conference of Chinese Association of Automation (YAC)

  2. arXiv:2406.16619  [pdf, other]

    cs.LG cs.NE

    No More Sliding-Windows: Dynamic Functional Connectivity Based On Random Convolutions Without Learning

    Authors: Yongjie Duan, Zhiying Long

    Abstract: In the field of dynamic functional connectivity, the sliding-window method is widely used and its stability is generally recognized. However, the sliding-window method's data processing within the window is overly simplistic, which to some extent limits its effectiveness. This study proposes a feature expansion method based on random convolution, which achieves better and more noise-resistant resu… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.14859  [pdf, other]

    cs.CL cs.AI

    From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking

    Authors: Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei

    Abstract: The rapid development of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has exposed vulnerabilities to various adversarial attacks. This paper provides a comprehensive overview of jailbreaking research targeting both LLMs and MLLMs, highlighting recent advancements in evaluation benchmarks, attack techniques and defense strategies. Compared to the more advanced state of… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2405.10329   

    stat.AP cs.AI

    Causal inference approach to appraise long-term effects of maintenance policy on functional performance of asphalt pavements

    Authors: Lingyun You, Nanning Guo, Zhengwu Long, Fusong Wang, Chundi Si, Aboelkasim Diab

    Abstract: Asphalt pavements as the most prevalent transportation infrastructure, are prone to serious traffic safety problems due to functional or structural damage caused by stresses or strains imposed through repeated traffic loads and continuous climatic cycles. The good quality or high serviceability of infrastructure networks is vital to the urbanization and industrial development of nations. In order… ▽ More

    Submitted 2 July, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: The arXiv version needs to be withdrawn since the model needs to be validated and updated with advanced machine learning technologies to enhance the accuracy of the model, and there are some crucial definition errors of symbols in the arXiv version

  5. arXiv:2405.07759  [pdf, other]

    cs.MM cs.AI cs.NI eess.IV

    MADRL-Based Rate Adaptation for 360° Video Streaming with Multi-Viewpoint Prediction

    Authors: Haopeng Wang, Zijian Long, Haiwei Dong, Abdulmotaleb El Saddik

    Abstract: Over the last few years, 360° video traffic on the network has grown significantly. A key challenge of 360° video playback is ensuring a high quality of experience (QoE) with limited network bandwidth. Currently, most studies focus on tile-based adaptive bitrate (ABR) streaming based on single viewport prediction to reduce bandwidth consumption. However, the performance of models for single-viewpo… ▽ More

    Submitted 17 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  6. arXiv:2404.06107  [pdf, other]

    cs.CL

    Exploring the Necessity of Visual Modality in Multimodal Machine Translation using Authentic Datasets

    Authors: Zi Long, Zhenhao Tang, Xianghua Fu, Jian Chen, Shilong Hou, Jinze Lyu

    Abstract: Recent research in the field of multimodal machine translation (MMT) has indicated that the visual modality is either dispensable or offers only marginal advantages. However, most of these conclusions are drawn from the analysis of experimental results based on a limited set of bilingual sentence-image pairs, such as Multi30k. In these kinds of datasets, the content of one bilingual parallel sente… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: bucc 2024 accepted

  7. arXiv:2403.09107  [pdf, other]

    cs.LG cs.CV

    S^2MVTC: a Simple yet Efficient Scalable Multi-View Tensor Clustering

    Authors: Zhen Long, Qiyuan Wang, Yazhou Ren, Yipeng Liu, Ce Zhu

    Abstract: Anchor-based large-scale multi-view clustering has attracted considerable attention for its effectiveness in handling massive datasets. However, current methods mainly seek the consensus embedding feature for clustering by exploring global correlations between anchor graphs or projection matrices. In this paper, we propose a simple yet efficient scalable multi-view tensor clustering (S^2MVTC) appro…

    Submitted 11 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  8. arXiv:2403.09096  [pdf, other]

    eess.IV cs.CV

    Deep unfolding Network for Hyperspectral Image Super-Resolution with Automatic Exposure Correction

    Authors: Yuan Fang, Yipeng Liu, Jie Chen, Zhen Long, Ao Li, Chong-Yung Chi, Ce Zhu

    Abstract: In recent years, the fusion of high spatial resolution multispectral image (HR-MSI) and low spatial resolution hyperspectral image (LR-HSI) has been recognized as an effective method for HSI super-resolution (HSI-SR). However, both HSI and MSI may be acquired under extreme conditions such as night or poorly illuminating scenarios, which may cause different exposure levels, thereby seriously downgr… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

  9. arXiv:2403.06289  [pdf, other]

    cs.CV cs.AI cs.LG

    Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning

    Authors: Zijun Long, Lipeng Zhuang, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson

    Abstract: Human-annotated vision datasets inevitably contain a fraction of human mislabelled examples. While the detrimental effects of such mislabelling on supervised learning are well-researched, their influence on Supervised Contrastive Learning (SCL) remains largely unexplored. In this paper, we show that human-labelling errors not only differ significantly from synthetic label errors, but also pose uni… ▽ More

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.16481

  10. arXiv:2403.05388  [pdf, other]

    cs.CV

    Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation

    Authors: Yu Han, Ziwei Long, Yanting Zhang, Jin Wu, Zhijun Fang, Rui Fan

    Abstract: Correspondence matching plays a crucial role in numerous robotics applications. In comparison to conventional hand-crafted methods and recent data-driven approaches, there is significant interest in plug-and-play algorithms that make full use of pre-trained backbone networks for multi-scale feature extraction and leverage hierarchical refinement strategies to generate matched correspondences. The… ▽ More

    Submitted 8 March, 2024; originally announced March 2024.

  11. arXiv:2403.04782  [pdf, other]

    cs.CL cs.AI

    A Survey on Temporal Knowledge Graph: Representation Learning and Applications

    Authors: Li Cai, Xin Mao, Yuhao Zhou, Zhaoguang Long, Changxu Wu, Man Lan

    Abstract: Knowledge graphs have garnered significant research attention and are widely used to enhance downstream applications. However, most current studies mainly focus on static knowledge graphs, whose facts do not change with time, and disregard their dynamic evolution over time. As a result, temporal knowledge graphs have attracted more attention because a large amount of structured knowledge exists on… ▽ More

    Submitted 2 March, 2024; originally announced March 2024.

  12. arXiv:2402.15276  [pdf, other]

    cs.IR cs.AI cs.CV

    CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora

    Authors: Zijun Long, Xuri Ge, Richard Mccreadie, Joemon Jose

    Abstract: Text-to-image retrieval aims to find the relevant images based on a text query, which is important in various use-cases, such as digital libraries, e-commerce, and multimedia databases. Although Multimodal Large Language Models (MLLMs) demonstrate state-of-the-art performance, they exhibit limitations in handling large-scale, diverse, and ambiguous real-world needs of retrieval, due to the computa… ▽ More

    Submitted 2 April, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  13. arXiv:2402.14551  [pdf, other]

    cs.CV cs.AI cs.LG

    CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion

    Authors: Zijun Long, George Killick, Lipeng Zhuang, Gerardo Aragon-Camarasa, Zaiqiao Meng, Richard Mccreadie

    Abstract: State-of-the-art pre-trained image models predominantly adopt a two-stage approach: initial unsupervised pre-training on large-scale datasets followed by task-specific fine-tuning using Cross-Entropy loss (CE). However, it has been demonstrated that CE can compromise model generalization and stability. While recent works employing contrastive learning address some of these limitations by enhancing…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.14893

  14. arXiv:2402.11443  [pdf, other]

    cs.CL

    Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

    Authors: Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei, Xuanjing Huang

    Abstract: This paper presents a benchmark self-evolving framework to dynamically evaluate rapidly advancing Large Language Models (LLMs), aiming for a more accurate assessment of their capabilities and limitations. We utilize a multi-agent system to manipulate the context or question of original instances, reframing new evolving instances with high confidence that dynamically extend existing benchmarks. Tow… ▽ More

    Submitted 17 February, 2024; originally announced February 2024.

  15. arXiv:2402.02503  [pdf]

    cs.CV cs.CL

    GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering

    Authors: Ziyu Ma, Shutao Li, Bin Sun, Jianfei Cai, Zuxiang Long, Fuyan Ma

    Abstract: Knowledge-based visual question answering (VQA) requires world knowledge beyond the image for accurate answer. Recently, instead of extra knowledge bases, a large language model (LLM) like GPT-3 is activated as an implicit knowledge engine to jointly acquire and reason the necessary knowledge for answering by converting images into textual information (e.g., captions and answer candidates). Howeve… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 17 pages

  16. arXiv:2401.02982  [pdf, other]

    cs.CL cs.AI

    FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models

    Authors: Shu Liu, Shangqing Zhao, Chenghao Jia, Xinlin Zhuang, Zhaoguang Long, Jie Zhou, Aimin Zhou, Man Lan, Qingquan Wu, Chong Yang

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of financial data analysis, particularly focusing on data-driven thinking, remain uncertain. To bridge this gap, we introduce FinDABench, a comprehensive benchmark designed to evaluate the financial data analysis capabili…

    Submitted 14 June, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  17. arXiv:2401.02838  [pdf, ps, other]

    cs.CV cs.AI cs.MM cs.SI

    CrisisViT: A Robust Vision Transformer for Crisis Image Classification

    Authors: Zijun Long, Richard McCreadie, Muhammad Imran

    Abstract: In times of emergency, crisis response agencies need to quickly and accurately assess the situation on the ground in order to deploy relevant services and resources. However, authorities often have to make decisions based on limited information, as data on affected regions can be scarce until local response services can provide first-hand reports. Fortunately, the widespread availability of smartp… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the 20th International ISCRAM Conference 2023, pp. 309--319

  18. Human-Centric Resource Allocation for the Metaverse With Multiaccess Edge Computing

    Authors: Zijian Long, Haiwei Dong, Abdulmotaleb El Saddik

    Abstract: Multi-access edge computing (MEC) is a promising solution to the computation-intensive, low-latency rendering tasks of the metaverse. However, how to optimally allocate limited communication and computation resources at the edge to a large number of users in the metaverse is quite challenging. In this paper, we propose an adaptive edge resource allocation method based on multi-agent soft actor-cri… ▽ More

    Submitted 23 December, 2023; originally announced December 2023.

    Journal ref: IEEE Internet of Things Journal, vol. 10, no. 22, pp. 19993-20005, 2023

  19. arXiv:2312.06718  [pdf, other]

    cs.AI

    Large Scale Foundation Models for Intelligent Manufacturing Applications: A Survey

    Authors: Haotian Zhang, Semujju Stuart Dereck, Zhicheng Wang, Xianwei Lv, Kang Xu, Liang Wu, Ye Jia, Jing Wu, Zhuo Long, Wensheng Liang, X. G. Ma, Ruiyan Zhuang

    Abstract: Although the applications of artificial intelligence especially deep learning had greatly improved various aspects of intelligent manufacturing, they still face challenges for wide employment due to the poor generalization ability, difficulties to establish high-quality training datasets, and unsatisfactory performance of deep learning methods. The emergence of large scale foundational models (LSFM…

    Submitted 22 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

  20. arXiv:2311.16481  [pdf, other]

    cs.CV

    Elucidating and Overcoming the Challenges of Label Noise in Supervised Contrastive Learning

    Authors: Zijun Long, George Killick, Lipeng Zhuang, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson

    Abstract: Image classification datasets exhibit a non-negligible fraction of mislabeled examples, often due to human error when one class superficially resembles another. This issue poses challenges in supervised contrastive learning (SCL), where the goal is to cluster together data points of the same class in the embedding space while distancing those of disparate classes. While such methods outperform tho… ▽ More

    Submitted 25 November, 2023; originally announced November 2023.

  21. arXiv:2310.20343  [pdf, other]

    cs.IR cs.MM

    Large Multi-modal Encoders for Recommendation

    Authors: Zixuan Yi, Zijun Long, Iadh Ounis, Craig Macdonald, Richard Mccreadie

    Abstract: In recent years, the rapid growth of online multimedia services, such as e-commerce platforms, has necessitated the development of personalised recommendation approaches that can encode diverse content about each item. Indeed, modern multi-modal recommender systems exploit diverse features obtained from raw images and item descriptions to enhance the recommendation performance. However, the existi… ▽ More

    Submitted 3 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

  22. arXiv:2310.15205  [pdf, other]

    cs.CL

    DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning

    Authors: Wei Chen, Qiushi Wang, Zefei Long, Xianyin Zhang, Zhongtian Lu, Bingxuan Li, Siyuan Wang, Jiarong Xu, Xiang Bai, Xuanjing Huang, Zhongyu Wei

    Abstract: We propose Multiple Experts Fine-tuning Framework to build a financial large language model (LLM), DISC-FinLLM. Our methodology improves general LLMs by endowing them with multi-turn question answering abilities, domain text processing capabilities, mathematical computation skills, and retrieval-enhanced generation capabilities. We build a financial instruction-tuning dataset named DISC-FIN-SFT, i… ▽ More

    Submitted 25 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: 18 pages, 13 figures, 7 tables

  23. arXiv:2310.10221  [pdf, other]

    cs.RO cs.CV

    RoboLLM: Robotic Vision Tasks Grounded on Multimodal Large Language Models

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa

    Abstract: Robotic vision applications often necessitate a wide range of visual perception tasks, such as object detection, segmentation, and identification. While there have been substantial advances in these individual tasks, integrating specialized models into a unified vision pipeline presents significant engineering challenges and costs. Recently, Multimodal Large Language Models (MLLMs) have emerged as… ▽ More

    Submitted 23 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  24. arXiv:2309.01516  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    MultiWay-Adapater: Adapting large-scale multi-modal models for scalable image-text retrieval

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa

    Abstract: As Multimodal Large Language Models (MLLMs) grow in size, adapting them to specialized tasks becomes increasingly challenging due to high computational and memory demands. Indeed, traditional fine-tuning methods are costly, due to the need for extensive, task-specific training. While efficient adaptation methods exist that aim to reduce these costs, in practice they suffer from shallow inter-modal… ▽ More

    Submitted 5 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

  25. arXiv:2308.14893  [pdf, other]

    cs.CV cs.AI cs.LG

    When hard negative sampling meets supervised contrastive learning

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Zaiqiao Meng

    Abstract: State-of-the-art image models predominantly follow a two-stage strategy: pre-training on large datasets and fine-tuning with cross-entropy loss. Many studies have shown that using cross-entropy can result in sub-optimal generalisation and stability. While the supervised contrastive loss addresses some limitations of cross-entropy loss by focusing on intra-class similarities and inter-class differe… ▽ More

    Submitted 28 August, 2023; originally announced August 2023.

  26. arXiv:2308.11419  [pdf]

    stat.ML cs.AI

    Tensor Regression

    Authors: Jiani Liu, Ce Zhu, Zhen Long, Yipeng Liu

    Abstract: Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies such as neuroimaging, computer vision, climatology and social networks, has brought challenges to traditional data representation methods. Tensors, as high dimen… ▽ More

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 187 pages, 32 figures, 10 tables

    Journal ref: Foundations and Trends in Machine Learning: Vol. 14: No. 4, pp 379-565 (2021)

  27. arXiv:2305.09095  [pdf, other]

    cs.CV

    Multi-view MERA Subspace Clustering

    Authors: Zhen Long, Ce Zhu, Jie Chen, Zihan Li, Yazhou Ren, Yipeng Liu

    Abstract: Tensor-based multi-view subspace clustering (MSC) can capture high-order correlation in the self-representation tensor. Current tensor decompositions for MSC suffer from highly unbalanced unfolding matrices or rotation sensitivity, failing to fully explore inter/intra-view information. Using the advanced tensor network, namely, multi-scale entanglement renormalization ansatz (MERA), we propose a l… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

  28. arXiv:2305.00716  [pdf, other]

    cs.CV

    Adaptively Topological Tensor Network for Multi-view Subspace Clustering

    Authors: Yipeng Liu, Yingcong Lu, Weiting Ou, Zhen Long, Ce Zhu

    Abstract: Multi-view subspace clustering methods have employed learned self-representation tensors from different tensor decompositions to exploit low rank information. However, the data structures embedded with self-representation tensors may vary in different multi-view datasets. Therefore, a pre-defined tensor decomposition may not fully exploit low rank information for a certain dataset, resulting in su… ▽ More

    Submitted 1 May, 2023; originally announced May 2023.

  29. arXiv:2303.18013  [pdf, other]

    cs.CV cs.AI

    LaCViT: A Label-aware Contrastive Fine-tuning Framework for Vision Transformers

    Authors: Zijun Long, Zaiqiao Meng, Gerardo Aragon Camarasa, Richard McCreadie

    Abstract: Vision Transformers (ViTs) have emerged as popular models in computer vision, demonstrating state-of-the-art performance across various tasks. This success typically follows a two-stage strategy involving pre-training on large-scale datasets using self-supervised signals, such as masked random patches, followed by fine-tuning on task-specific labeled datasets with cross-entropy loss. However, this… ▽ More

    Submitted 5 February, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

  30. arXiv:2303.06380  [pdf, other]

    cs.CV

    Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination

    Authors: Zimeng Zhao, Binghui Zuo, Zhiyu Long, Yangang Wang

    Abstract: Enormous hand images with reliable annotations are collected through marker-based MoCap. Unfortunately, degradations caused by markers limit their application in hand appearance reconstruction. A clear appearance recovery insight is an image-to-image translation trained with unpaired data. However, most frameworks fail because there exists structure inconsistency from a degraded hand to a bare one… ▽ More

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR2023

  31. arXiv:2303.02668  [pdf, other]

    cs.LG cs.AI cs.DC

    Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT

    Authors: Jiaqi Wang, Shenglai Zeng, Zewei Long, Yaqing Wang, Houping Xiao, Fenglong Ma

    Abstract: Federated learning (FL) enables multiple clients to train models collaboratively without sharing local data, which has achieved promising results in different areas, including the Internet of Things (IoT). However, end IoT devices do not have abilities to automatically annotate their collected data, which leads to the label shortage issue at the client side. To collaboratively train an FL model, w… ▽ More

    Submitted 5 March, 2023; originally announced March 2023.

    Comments: This paper is accepted by SDM-2023. Jiaqi Wang and Shenglai Zeng are of equal contribution

  32. arXiv:2212.10295  [pdf, other]

    cs.MM cs.HC cs.NI

    Interacting with New York City Data by HoloLens through Remote Rendering

    Authors: Zijian Long, Haiwei Dong, Abdulmotaleb El Saddik

    Abstract: In the digital era, Extended Reality (XR) is considered the next frontier. However, XR systems are computationally intensive, and they must be implemented within strict latency constraints. Thus, XR devices with finite computing resources are limited in terms of quality of experience (QoE) they can offer, particularly in cases of big 3D data. This problem can be effectively addressed by offloading… ▽ More

    Submitted 20 December, 2022; originally announced December 2022.

    Journal ref: IEEE Consumer Electronics Magazine, vol. 11, no. 5, pp. 64-72, 2022

  33. arXiv:2210.12638  [pdf, other]

    cs.LG

    Tucker-O-Minus Decomposition for Multi-view Tensor Subspace Clustering

    Authors: Yingcong Lu, Yipeng Liu, Zhen Long, Zhangxin Chen, Ce Zhu

    Abstract: With powerful ability to exploit latent structure of self-representation information, different tensor decompositions have been employed into low rank multi-view clustering (LRMVC) models for achieving significant performance. However, current approaches suffer from a series of problems related to those tensor decomposition, such as the unbalanced matricization scheme, rotation sensitivity, defici… ▽ More

    Submitted 23 October, 2022; originally announced October 2022.

  34. arXiv:2208.00767  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR

    Multimodal Neural Machine Translation with Search Engine Based Image Retrieval

    Authors: ZhenHao Tang, XiaoBing Zhang, Zi Long, XiangHua Fu

    Abstract: Recently, a number of works have shown that the performance of neural machine translation (NMT) can be improved to a certain extent by using visual information. However, most of these conclusions are drawn from the analysis of experimental results based on a limited set of bilingual sentence-image pairs, such as Multi30K. In these kinds of datasets, the content of one bilingual parallel sentence pair…

    Submitted 3 September, 2022; v1 submitted 26 July, 2022; originally announced August 2022.

    Comments: 9 pages, 5 figures

  35. arXiv:2203.16037  [pdf, other]

    cs.SD cs.LG eess.AS

    Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE

    Authors: Ziang Long, Yunling Zheng, Meng Yu, Jack Xin

    Abstract: Variational auto-encoder (VAE) is an effective neural network architecture to disentangle a speech utterance into speaker identity and linguistic content latent embeddings, then generate an utterance for a target speaker from that of a source speaker. This is possible by concatenating the identity embedding of the target speaker and the content embedding of the source speaker uttering a desired se… ▽ More

    Submitted 22 August, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  36. arXiv:2201.06174  [pdf, other]

    cs.CV eess.IV

    A novel attention model for salient structure detection in seismic volumes

    Authors: Muhammad Amir Shafiq, Zhiling Long, Haibin Di, Ghassan AlRegib

    Abstract: A new approach to seismic interpretation is proposed to leverage visual perception and human visual system modeling. Specifically, a saliency detection algorithm based on a novel attention model is proposed for identifying subsurface structures within seismic data volumes. The algorithm employs 3D-FFT and a multi-dimensional spectral projection, which decomposes local spectra into three distinct c… ▽ More

    Submitted 16 January, 2022; originally announced January 2022.

    Comments: Published in Applied Computing and Intelligence, Nov. 2021

    Journal ref: Applied Computing and Intelligence, vol. 1, no. 1, pp. 31-45, Nov. 2021

  37. arXiv:2111.10840  [pdf]

    cs.SI physics.soc-ph

    WEM: A Node Importance Algorithm in Weighted Networks

    Authors: Linjie Chen, Na Zhao, Jie Li, Zhen Long, Ming Jing, Jian Wang

    Abstract: To assess node importance in weighted networks, this paper proposes the weighted expected method (WEM), which takes advantage of an uncertain graph algorithm. First, a weight processing method is proposed based on the relationship between the weight of edges and the intensity of contact between nodes, and the calculation method of the contribution of the weight of edges to the node import…

    Submitted 21 November, 2021; originally announced November 2021.

  38. arXiv:2111.00739  [pdf, other]

    cs.IR cs.AI

    URIR: Recommendation algorithm of user RNN encoder and item encoder based on knowledge graph

    Authors: Na Zhao, Zhen Long, Zhi-Dan Zhao, Jian Wang

    Abstract: Due to a large amount of information, it is difficult for users to find what they are interested in among the many choices. In order to improve users' experience, recommendation systems have been widely used in music recommendations, movie recommendations, online shopping, and other scenarios. Recently, Knowledge Graph (KG) has been proven to be an effective tool to improve the performance of reco… ▽ More

    Submitted 1 November, 2021; originally announced November 2021.

  39. arXiv:2109.05612  [pdf, other]

    cs.LG

    FedTriNet: A Pseudo Labeling Method with Three Players for Federated Semi-supervised Learning

    Authors: Liwei Che, Zewei Long, Jiaqi Wang, Yaqing Wang, Houping Xiao, Fenglong Ma

    Abstract: Federated Learning has shown great potentials for the distributed data utilization and privacy protection. Most existing federated learning approaches focus on the supervised setting, which means all the data stored in each client has labels. However, in real-world applications, the client data are impossible to be fully labeled. Thus, how to exploit the unlabeled data should be a new challenge fo… ▽ More

    Submitted 11 December, 2021; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: Accepted by BigData 2021

  40. arXiv:2109.04533  [pdf, other]

    cs.LG

    FedCon: A Contrastive Framework for Federated Semi-Supervised Learning

    Authors: Zewei Long, Jiaqi Wang, Yaqing Wang, Houping Xiao, Fenglong Ma

    Abstract: Federated Semi-Supervised Learning (FedSSL) has gained rising attention from both academic and industrial researchers, due to its unique characteristics of co-training machine learning models with isolated yet unlabeled data. Most existing FedSSL methods focus on the classical scenario, i.e, the labeled and unlabeled data are stored at the client side. However, in real world applications, client u… ▽ More

    Submitted 9 September, 2021; originally announced September 2021.

  41. arXiv:2107.01343  [pdf]

    cs.LG eess.SP

    Short-term probabilistic photovoltaic power forecast based on deep convolutional long short-term memory network and kernel density estimation

    Authors: Mingliang Bai, Xinyu Zhao, Zhenhua Long, Jinfu Liu, Daren Yu

    Abstract: Solar energy is a clean and renewable energy. Photovoltaic (PV) power is an important way to utilize solar energy. Accurate PV power forecast is crucial to the large-scale application of PV power and the stability of electricity grid. This paper proposes a novel method for short-term photovoltaic power forecast using deep convolutional long short-term memory (ConvLSTM) network and kernel density e… ▽ More

    Submitted 3 July, 2021; originally announced July 2021.

  42. arXiv:2105.08826  [pdf, other]

    eess.IV cs.CV cs.LG

    Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report

    Authors: Andrey Ignatov, Andres Romero, Heewon Kim, Radu Timofte, Chiu Man Ho, Zibo Meng, Kyoung Mu Lee, Yuxiang Chen, Yutong Wang, Zeyu Long, Chenhao Wang, Yifei Chen, Boshen Xu, Shuhang Gu, Lixin Duan, Wen Li, Wang Bofei, Zhang Diankai, Zheng Chengjian, Liu Shaoli, Gao Si, Zhang Xiaofeng, Lu Kaidi, Xu Tianyu, Zheng Hui, et al. (6 additional authors not shown)

    Abstract: Video super-resolution has recently become one of the most important mobile-related problems due to the rise of video communication and streaming services. While many solutions have been proposed for this task, the majority of them are too computationally expensive to run on portable devices with limited hardware resources. To address this problem, we introduce the first Mobile AI challenge, where… ▽ More

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Mobile AI 2021 Workshop and Challenges: https://ai-benchmark.com/workshops/mai/2021/. arXiv admin note: substantial text overlap with arXiv:2105.07825, arXiv:2105.08629, arXiv:2105.07809, arXiv:2105.08630

  43. arXiv:2012.05529  [pdf, other]

    cs.LG

    Recurrence of Optimum for Training Weight and Activation Quantized Networks

    Authors: Ziang Long, Penghang Yin, Jack Xin

    Abstract: Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms. However, training deep learning models with low-precision weights and activations involves a demanding optimization task, which calls for minimizing a stage-wise loss function subject to a discrete set-constraint. While numerous training methods have been proposed, existing studies for full quantiz…

    Submitted 21 May, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  44. arXiv:2012.03292  [pdf, other]

    cs.LG

    FedSiam: Towards Adaptive Federated Semi-Supervised Learning

    Authors: Zewei Long, Liwei Che, Yaqing Wang, Muchao Ye, Junyu Luo, Jinze Wu, Houping Xiao, Fenglong Ma

    Abstract: Federated learning (FL) has emerged as an effective technique to co-train machine learning models without actually sharing data and leaking privacy. However, most existing FL methods focus on the supervised setting and ignore the utilization of unlabeled data. Although there are a few existing studies trying to incorporate unlabeled data into FL, they all fail to maintain performance guarantees…

    Submitted 5 July, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

  45. arXiv:2011.11256  [pdf, other]

    cs.LG

    Learning Quantized Neural Nets by Coarse Gradient Method for Non-linear Classification

    Authors: Ziang Long, Penghang Yin, Jack Xin

    Abstract: Quantized or low-bit neural networks are attractive due to their inference efficiency. However, training deep neural networks with quantized activations involves minimizing a discontinuous and piecewise constant loss function. Such a loss function has zero gradients almost everywhere (a.e.), which makes the conventional gradient-based algorithms inapplicable. To this end, we study a novel class of…

    Submitted 13 June, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

  46. arXiv:2007.01056  [pdf, other]

    eess.IV cs.LG

    Hyperspectral Image Denoising with Partially Orthogonal Matrix Vector Tensor Factorization

    Authors: Zhen Long, Yipeng Liu, Sixing Zeng, Jiani Liu, Fei Wen, Ce Zhu

    Abstract: Hyperspectral image (HSI) has some advantages over natural image for various applications due to the extra spectral information. During the acquisition, it is often contaminated by severe noises including Gaussian noise, impulse noise, deadlines, and stripes. The image quality degeneration would badly affect some applications. In this paper, we present an HSI restoration method named smooth and rob…

    Submitted 28 June, 2020; originally announced July 2020.

  47. arXiv:2007.01055  [pdf, other]

    stat.ML cs.LG cs.MM

    Bayesian Low Rank Tensor Ring Model for Image Completion

    Authors: Zhen Long, Ce Zhu, Jiani Liu, Yipeng Liu

    Abstract: Low rank tensor ring model is powerful for image completion which recovers missing entries in data acquisition and transformation. The recently proposed tensor ring (TR) based completion algorithms generally solve the low rank optimization problem by alternating least squares method with predefined ranks, which may easily lead to overfitting when the unknown ranks are set too large and only a few…

    Submitted 28 June, 2020; originally announced July 2020.

  48. arXiv:2006.02377  [pdf, other]

    math.NA cs.LG physics.comp-ph stat.ML

    RODE-Net: Learning Ordinary Differential Equations with Randomness from Data

    Authors: Junyu Liu, Zichao Long, Ranran Wang, Jie Sun, Bin Dong

    Abstract: Random ordinary differential equations (RODEs), i.e. ODEs with random parameters, are often used to model complex dynamics. Most existing methods to identify unknown governing RODEs from observed data often rely on strong prior knowledge. Extracting the governing equations from data with less prior knowledge remains a great challenge. In this paper, we propose a deep neural network, called RODE-Ne…

    Submitted 3 June, 2020; originally announced June 2020.

    Comments: 11 pages, 6 figures, 3 tables

  49. arXiv:2003.07725  [pdf, other]

    eess.IV cs.CV

    Fabric Surface Characterization: Assessment of Deep Learning-based Texture Representations Using a Challenging Dataset

    Authors: Yuting Hu, Zhiling Long, Anirudha Sundaresan, Motaz Alfarraj, Ghassan AlRegib, Sungmee Park, Sundaresan Jayaraman

    Abstract: Tactile sensing or fabric hand plays a critical role in an individual's decision to buy a certain fabric from the range of available fabrics for a desired application. Therefore, textile and clothing manufacturers have long been in search of an objective method for assessing fabric hand, which can then be used to engineer fabrics with a desired hand. Recognizing textures and materials in real-worl…

    Submitted 16 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: text overlap with arXiv:1905.09907

  50. arXiv:2002.12563  [pdf, other]

    cs.LG math.OC stat.ML

    Global Convergence and Geometric Characterization of Slow to Fast Weight Evolution in Neural Network Training for Classifying Linearly Non-Separable Data

    Authors: Ziang Long, Penghang Yin, Jack Xin

    Abstract: In this paper, we study the dynamics of gradient descent in learning neural networks for classification problems. Unlike in existing works, we consider the linearly non-separable case where the training data of different classes lie in orthogonal subspaces. We show that when the network has a sufficient (but not exceedingly large) number of neurons, (1) the corresponding minimization problem has a d…

    Submitted 10 December, 2020; v1 submitted 28 February, 2020; originally announced February 2020.