A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions

Authors: Fei Wang, Weibo Gao, Qi Liu, Jiatong Li, Guanhao Zhao, Zheng Zhang, Zhenya Huang, Mengxiao Zhu, Shijin Wang, Wei Tong, Enhong Chen

Abstract: Cognitive diagnosis has been developed for decades as an effective measurement tool to evaluate human cognitive status such as ability level and knowledge mastery. It has been applied to a wide range of fields including education, sport, psychological diagnosis, etc. By providing better awareness of cognitive status, it can serve as the basis for personalized services such as well-designed medical treatment, teaching strategy and vocational training. This paper aims to provide a survey of current models for cognitive diagnosis, with more attention on new developments using machine learning-based methods. By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models. Further, we discuss future directions that are worthy of exploration. In addition, we release two Python libraries: EduData for easy access to some relevant public datasets we have collected, and EduCDM that implements popular CDMs to facilitate both applications and research purposes.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.03581 [pdf, ps, other]

Topologically nontrivial $1/3$-magnetization plateau state in a spin-1/2 trimer chain

Authors: Y. Y. Han, B. C. Yu, Z. Du, L. S. Ling, L. Zhang, W. Tong, C. Y. Xi, J. L. Zhang, T. Shang, Li Pi, Long Ma

Abstract: A topologically nontrivial Haldane phase is theoretically proposed to be realized in the 1/3-magnetization ($M$) plateau of spin-1/2 trimer systems. However, the spin excitation gap, a typical characteristic of the Haldane phase, has not yet been experimentally verified. Here, we report nuclear magnetic resonance investigations into the low-energy spin dynamics in the $S=1/2$ spin-trimer antiferromagnetic chain compound Na$_2$Cu$_3$Ge$_{4-x}$Si$_{x}$O$_{12}$ ($x=0, 0.1\sim1.5$). In the parent compound ($x=0$), the spin-lattice relaxation rate (1/$T_1$) shows significantly different temperature dependence when the external magnetic field is increased above the critical field of $μ_0$$H_{c}$ = 29 T. The spin excitation gap is evidenced from the thermally activated behavior of $1/T_1(T)$ in the 1/3-$M$ plateau state. By substituting Ge$^{4+}$ with Si$^{4+}$, the critical field for the 1/3-$M$ plateau significantly decreases, e.g. $μ_0H_{c}=17$ T in $x=1.0$ samples, which results from the suppressed inter-trimer coupling $J_2$. The gapped spin excitation is confirmed again above 17 T, whose size shows temperature-dependent behavior for $μ_0H\geq25.72$ T. These observations provide further insights into the Haldane physics.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 6 pages, 4 figures

arXiv:2407.03555 [pdf, other]

Adaptive Perturbation Enhanced SCL Decoder for Polar Codes

Authors: Xianbin Wang, Huazi Zhang, Jiajie Tong, Jun Wang, Wen Tong

Abstract: For polar codes, the successive cancellation list (SCL) decoding algorithm significantly improves finite-length performance compared to SC decoding. SCL-flip decoding can further enhance the performance, but the gain diminishes as code length increases, due to the difficulty in locating the first error bit position. In this work, we introduce an SCL-perturbation decoding algorithm to address this issue. A basic version of the algorithm introduces small random perturbations to the received symbols before each SCL decoding attempt, and exhibits non-diminishing gain at large block lengths. Its enhanced version adaptively performs random perturbations or directional perturbation on each received symbol according to previous decoding results, and manages to correct more errors with fewer decoding attempts. Extensive simulation results demonstrate stable gains across various code rates, lengths and list sizes. To the best of our knowledge, this is the first SCL enhancement with non-diminishing gains as code length increases, and it achieves unprecedented efficiency. With only one additional SCL-$L$ decoding attempt (in total two), the proposed algorithm achieves SCL-$2L$-equivalent performance. Since the gain is obtained without increasing list size, the algorithm is best suited for hardware implementation.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.19466 [pdf, other]

doi 10.1145/3658644.3670298

Data Poisoning Attacks to Locally Differentially Private Frequent Itemset Mining Protocols

Authors: Wei Tong, Haoyu Chen, Jiacheng Niu, Sheng Zhong

Abstract: Local differential privacy (LDP) provides a way for an untrusted data collector to aggregate users' data without violating their privacy. Various privacy-preserving data analysis tasks have been studied under the protection of LDP, such as frequency estimation, frequent itemset mining, and machine learning. Despite its privacy-preserving properties, recent research has demonstrated the vulnerability of certain LDP protocols to data poisoning attacks. However, existing data poisoning attacks are focused on basic statistics under LDP, such as frequency estimation and mean/variance estimation. As an important data analysis task, the security of LDP frequent itemset mining has yet to be thoroughly examined. In this paper, we aim to address this issue by presenting novel and practical data poisoning attacks against LDP frequent itemset mining protocols. By introducing a unified attack framework with composable attack operations, our data poisoning attack can successfully manipulate the state-of-the-art LDP frequent itemset mining protocols and has the potential to be adapted to other protocols with similar structures. We conduct extensive experiments on three datasets to compare the proposed attack with four baseline attacks. The results demonstrate the severity of the threat and the effectiveness of the proposed attack.

Submitted 27 June, 2024; originally announced June 2024.

Comments: To appear in ACM Conference on Computer and Communications Security (ACM CCS 2024)

arXiv:2406.18008 [pdf, other]

Rate-Distortion-Perception Tradeoff for Gaussian Vector Sources

Authors: Jingjing Qian, Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu, Wuxian Shi, Yiqun Ge, Wen Tong

Abstract: This paper studies the rate-distortion-perception (RDP) tradeoff for a Gaussian vector source coding problem where the goal is to compress the multi-component source subject to distortion and perception constraints. The purpose of imposing a perception constraint is to ensure visually pleasing reconstructions. This paper studies this RDP setting with either the Kullback-Leibler (KL) divergence or Wasserstein-2 metric as the perception loss function, and shows that for Gaussian vector sources, jointly Gaussian reconstructions are optimal. We further demonstrate that the optimal tradeoff can be expressed as an optimization problem, which can be explicitly solved. An interesting property of the optimal solution is as follows. Without the perception constraint, the traditional reverse water-filling solution for characterizing the rate-distortion (RD) tradeoff of a Gaussian vector source states that the optimal rate allocated to each component depends on a constant, called the water-level. If the variance of a specific component is below the water-level, it is assigned a zero compression rate. However, with active distortion and perception constraints, we show that the optimal rates allocated to the different components are always positive. Moreover, the water-levels that determine the optimal rate allocation for different components are unequal. We further treat the special case of perceptually perfect reconstruction and study its RDP function in the high-distortion and low-distortion regimes to obtain insight into the structure of the optimal solution.

Submitted 25 June, 2024; originally announced June 2024.
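
Illustrative note: the abstract above contrasts its result with the classical reverse water-filling solution. The sketch below implements only that classical baseline (no perception constraint), assuming independent Gaussian components and a total mean-squared-error budget. The function name, the bisection search, and the example numbers are assumptions made for illustration and are not taken from the paper.

```python
import math


def reverse_water_filling(variances, total_distortion, iters=200):
    """Classical reverse water-filling for independent Gaussian components.

    Finds the water-level theta with sum(min(theta, var_i)) == total_distortion,
    then assigns each component the rate 0.5 * log2(var_i / D_i) bits, where
    D_i = min(theta, var_i). Hypothetical helper, not the paper's RDP solution.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")
    lo, hi = 0.0, max(variances)
    for _ in range(iters):  # bisection on the water-level
        theta = (lo + hi) / 2
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    theta = (lo + hi) / 2
    distortions = [min(theta, v) for v in variances]
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, distortions)]
    return theta, distortions, rates


# Example with assumed variances: the weakest component falls below the
# water-level and therefore receives a zero rate.
theta, distortions, rates = reverse_water_filling([4.0, 1.0, 0.25], 1.5)
```

In this baseline, any component whose variance is below the water-level gets a zero rate, which is the behavior the abstract says no longer holds once both distortion and perception constraints are active.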

arXiv:2406.14318 [pdf, other]

The Fire Thief Is Also the Keeper: Balancing Usability and Privacy in Prompts

Authors: Zhili Shen, Zihang Xi, Ying He, Wei Tong, Jingyu Hua, Sheng Zhong

Abstract: The rapid adoption of online chatbots represents a significant advancement in artificial intelligence. However, this convenience brings considerable privacy concerns, as prompts can inadvertently contain sensitive information exposed to large language models (LLMs). Limited by high computational costs, reduced task usability, and excessive system modifications, previous works based on local deployment, embedding perturbation, and homomorphic encryption are inapplicable to online prompt-based LLM applications. To address these issues, this paper introduces Prompt Privacy Sanitizer (i.e., ProSan), an end-to-end prompt privacy protection framework that can produce anonymized prompts with contextual privacy removed while maintaining task usability and human readability. It can also be seamlessly integrated into the online LLM service pipeline. To achieve high usability and dynamic anonymity, ProSan flexibly adjusts its protection targets and strength based on the importance of the words and the privacy leakage risk of the prompts. Additionally, ProSan is capable of adapting to diverse computational resource conditions, ensuring privacy protection even for mobile devices with limited computing power. Our experiments demonstrate that ProSan effectively removes private information across various tasks, including question answering, text summarization, and code generation, with minimal reduction in task performance.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2405.15618 [pdf, other]

MLPs Learn In-Context

Authors: William L. Tong, Cengiz Pehlevan

Abstract: In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, has commonly been assumed to be a unique hallmark of Transformer models. In this study, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, we find that MLPs, and the closely related MLP-Mixer models, learn in-context competitively with Transformers given the same compute budget. We further show that MLPs outperform Transformers on a subset of ICL tasks designed to test relational reasoning. These results suggest that in-context learning is not exclusive to Transformers and highlight the potential of exploring this phenomenon beyond attention-based architectures. In addition, MLPs' surprising success on relational tasks challenges prior assumptions about simple connectionist models. Altogether, our results endorse the broad trend that "less inductive bias is better" and contribute to the growing interest in all-MLP alternatives to task-specific architectures.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 29 pages, 9 figures, code available at https://github.com/wtong98/mlp-icl

arXiv:2404.16821 [pdf, other]

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Authors: Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, et al. (10 additional authors not shown)

Abstract: In this report, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements: (1) Strong Vision Encoder: we explored a continuous learning strategy for the large-scale vision foundation model -- InternViT-6B, boosting its visual understanding capabilities and allowing it to be transferred and reused in different LLMs. (2) Dynamic High-Resolution: we divide images into tiles ranging from 1 to 40 of 448$\times$448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-Quality Bilingual Dataset: we carefully collected a high-quality bilingual dataset that covers common scenes and document images, and annotated them with English and Chinese question-answer pairs, significantly enhancing performance in OCR- and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks. Code has been released at https://github.com/OpenGVLab/InternVL.

Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: Technical report

arXiv:2403.13277 [pdf]

Quantum valley Hall states in low-buckled counterparts of graphene bilayer

Authors: Yu-Hao Shen, Jun-Ding Zheng, Wen-Yi Tong, Chun-Gang Duan

Abstract: With a low-buckled structure for each layer in a graphene bilayer system, inversion symmetry (P-symmetry) is broken for the stacking in which both A and B sublattices in the top layer are aligned with those in the bottom layer. When spin-orbit coupling (SOC) is taken into account, a nontrivial topological gap opens in each monolayer system, giving rise to the quantum spin Hall effect (QSHE). As long as time-reversal symmetry (T-symmetry) is preserved, the gapless edge states are robust in each individual layer, even for a bilayer lacking PT symmetry. Based on this platform and through tight-binding (TB) model calculations, we find that it becomes a typical system that can exhibit the quantum valley Hall effect (QVHE) when a layer-resolved Rashba SOC is introduced that leads to band inversion at each K valley in the hexagonal Brillouin zone (BZ). The topological transition arises because the valley Chern number Cv = CK - CK' switches from 0 to 2, which characterizes the nontrivial QVHE phase reached from two coupled Z2 topological insulators. We also point out that the layer-resolved Rashba SOC can be introduced equivalently by twisting two layers in van der Waals contact. Through TB calculations, it is shown that the K bands invert in the corresponding mini BZ when the two layers are twisted by a small angle. Our findings advance potential applications in device design for topological valleytronics and twistronics.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12475 [pdf]

Dielectric response in twisted MoS2 bilayer facilitated by spin-orbit coupling effect

Authors: Yu-Hao Shen, Jun-Ding Zheng, Wen-Yi Tong, Zhi-Qiang Bao, Xian-Gang Wan, Chun-Gang Duan

Abstract: Twisted van der Waals bilayers offer ideal two-dimensional (2D) platforms for exploring the intricate interplay between the spin and charge degrees of freedom of electrons. By investigating twisted MoS2 bilayer, featuring two distinct stackings but with identical commensurate supercell sizes, we reveal an unusual dielectric response behavior inherent to this system. Our first-principles calculations demonstrate that the application of an out-of-plane electric field gives different responses in electronic polarization. Upon further analysis, it becomes apparent that this dielectric response comes from the planar charge redistribution associated with the spin-orbit coupling (SOC) effect. The underlying mechanism lies in the fact that the external electric field tends to modify the internal pseudo-spin texture σ, subsequently generating an out-of-plane (pseudo-) spin current $j_s \propto σ \times B_R$ in response to an in-plane pseudomagnetic field $B_R$ through Rashba SOC. It is found that the generated $j_s$ is opposite for the two distinct stackings, resulting in opposite in-plane electric susceptibility. As a consequence, through magnetoelectric coupling within such a nonmagnetic system, opposite tendencies to redistribute charge arise, ultimately leading to an amplified or suppressed dielectric response.

Submitted 11 July, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2402.13533 [pdf, other]

FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing

Authors: Xiao-Yang Liu, Jie Zhang, Guoxuan Wang, Weiqing Tong, Anwar Walid

Abstract: Large language models (LLMs) are computationally intensive. The computation workload and the memory footprint grow quadratically with the dimension (layer width). Most of LLMs' parameters come from the linear layers of the transformer structure and are highly redundant. These linear layers contribute more than 80% of the computation workload and 99% of the model size. To pretrain and finetune LLMs efficiently, there are three major challenges to address: 1) reducing redundancy of the linear layers; 2) reducing GPU memory footprint; 3) improving GPU utilization when using distributed training. Prior methods, such as LoRA and QLoRA, utilized low-rank matrices and quantization to reduce the number of trainable parameters and model size, respectively. However, the resulting model still consumes a large amount of GPU memory. In this paper, we present high-performance GPU-based methods that exploit low-rank structures to pretrain and finetune LLMs for financial applications. We replace one conventional linear layer of the transformer structure with two narrower linear layers, which allows us to reduce the number of parameters by several orders of magnitude. By quantizing the parameters into low precision (8-bit and 4-bit), the memory consumption of the resulting model is further reduced. Compared with existing LLMs, our methods achieve a speedup of 1.3X and a model compression ratio of 2.64X for pretraining without accuracy drop. For finetuning, our methods achieve an average accuracy increase of 6.3% and 24.0% in general tasks and financial tasks, respectively, and GPU memory consumption ratio of 6.3X. The sizes of our models are smaller than 0.59 GB, allowing inference on a smartphone.

Submitted 21 February, 2024; originally announced February 2024.
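
Illustrative note: the abstract above describes replacing one linear layer with two narrower ones. A minimal parameter-count sketch of that idea follows, assuming a bias-free dense layer of shape d_in x d_out factored through an inner width `rank`. The function names and the example sizes (4096 and 64) are hypothetical and are not the paper's configuration.

```python
def dense_params(d_in, d_out):
    """Weights in one conventional bias-free linear layer."""
    return d_in * d_out


def low_rank_params(d_in, d_out, rank):
    """Weights in two narrower layers: (d_in x rank) followed by (rank x d_out)."""
    return rank * (d_in + d_out)


def compression_ratio(d_in, d_out, rank):
    """How many times fewer weights the factored pair holds."""
    return dense_params(d_in, d_out) / low_rank_params(d_in, d_out, rank)


# Assumed sizes for illustration only.
ratio = compression_ratio(4096, 4096, 64)
```

The saving grows as `rank` shrinks relative to the layer width. How small the rank can go without hurting accuracy is an empirical question that this sketch does not address.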

arXiv:2402.09457 [pdf]

Self-Healing Effects in OAM Beams Observed on a 28 GHz Experimental Link

Authors: Marek Klemes, Lan Hu, Greg Bowles, Mohammad Akbari, Soulideth Thirakoune, Michael Schwartzman, Kevin Zhang, Tan Huy Ho, David Wessel, Wen Tong

Abstract: In this paper we document for the first time some of the effects of self-healing, a property of orbital-angular-momentum (OAM) or vortex beams, as observed on a millimeter-wave experimental communications link in an outdoors line-of-sight (LOS) scenario. The OAM beams have a helical phase and polarization structure and have conical amplitude shape in the far field. The Poynting vectors of the OAM beams also possess helical structures, orthogonal to the corresponding helical phase-fronts. Due to such non-planar structure in the direction orthogonal to the beam axis, OAM beams are a subset of structured light beams. Such structured beams are known to possess self-healing properties when partially obstructed along their propagation axis, especially in their near fields, resulting in partial reconstruction of their structures at larger distances along their beam axis. Various theoretical rationales have been proposed to explain, model and experimentally verify the self-healing physical effects in structured optical beams, using various types of obstructions and experimental techniques. Based on these models, we hypothesize that any self-healing observed will be greater as the OAM order increases. Here we observe the self-healing effects for the first time in structured OAM radio beams, in terms of communication signals and channel parameters rather than beam structures. We capture the effects of partial near-field obstructions of OAM beams of different orders on the communications signals and provide a physical rationale to substantiate that the self-healing effect was observed to increase with the order of OAM, agreeing with our hypothesis.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 9 pages, 10 figures, pending submission to IEEE Access journal

arXiv:2402.04991 [pdf, other]

Exploring the Opportunity of Augmented Reality (AR) in Supporting Older Adults Explore and Learn Smartphone Applications

Authors: Xiaofu Jin, Wai Tong, Xiaoying Wei, Xian Wang, Emily Kuang, Xiaoyu Mo, Huamin Qu, Mingming Fan

Abstract: The global aging trend compels older adults to navigate the evolving digital landscape, presenting a substantial challenge in mastering smartphone applications. While Augmented Reality (AR) holds promise for enhancing learning and user experience, its role in aiding older adults' smartphone app exploration remains insufficiently explored. Therefore, we conducted a two-phase study: (1) a workshop with 18 older adults to identify app exploration challenges and potential AR interventions, and (2) tech-probe participatory design sessions with 15 participants to co-create AR support tools. Our research highlights AR's effectiveness in reducing physical and cognitive strain among older adults during app exploration, especially during multi-app usage and the trial-and-error learning process. We also examined their interactional experiences with AR, yielding design considerations on tailoring AR tools for smartphone app exploration. Ultimately, our study unveils the prospective landscape of AR in supporting the older demographic, both presently and in future scenarios.

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2401.13276 [pdf, other]

SCNet: Sparse Compression Network for Music Source Separation

Authors: Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng

Abstract: Deep learning-based methods have made significant achievements in music source separation. However, obtaining good results while maintaining a low model complexity remains challenging in super wide-band music source separation. Previous works either overlook the differences in subbands or inadequately address the problem of information loss when generating subband features. In this paper, we propose SCNet, a novel frequency-domain network to explicitly split the spectrogram of the mixture into several subbands and introduce a sparsity-based encoder to model different frequency bands. We use a higher compression ratio on subbands with less information to improve the information density and focus on modeling subbands with more information. In this way, the separation performance can be significantly improved using lower computational consumption. Experiment results show that the proposed model achieves a signal to distortion ratio (SDR) of 9.0 dB on the MUSDB18-HQ dataset without using extra data, which outperforms state-of-the-art methods. Specifically, SCNet's CPU inference time is only 48% of HT Demucs, one of the previous state-of-the-art models.

Submitted 24 January, 2024; originally announced January 2024.

Comments: Accepted by ICASSP 2024

arXiv:2401.11091 [pdf]

A family of rare-earth Quasi-One-Dimensional spin-chain compounds K2RENb5O15 (RE=Ce,Pr,Nd,Sm,Gd-Ho) with large interchain distance

Authors: Qingyuan Zeng, Han Ge, Maofeng Wu, Shaoheng Ruan, Tiantian Li, Zhaosheng Wang, Jingxin Li, Langsheng Ling, Wei Tong, Shuai Huang, Andi Liu, Jin Zhou, Zhengcai Xia, Jieming Sheng, Liusuo Wu, Zhaoming Tian

Abstract: One-dimensional spin chain systems have received special attention to discover the novel magnetic ground states and emergent phenomena, while the magnetic studies on rare-earth (RE)-based 1D spin chain materials are still rare. Here, we report the synthesis, structure and magnetic behaviors of a family of tetragonal tungsten-bronze structure K2RENb5O15 (RE = Ce, Pr, Nd, Sm, Gd-Ho) compounds, which consist of 1D linear spin-chain structure built by RE3+ ions along the c-axis and well spatially separated by the nonmagnetic K/Nb-O polyhedrons with large interchain distances of ~ 8.80-8.88 Å in the ab-plane. The low temperature magnetic measurements reveal the absence of long-range magnetic order down to 1.8 K for all serial K2RENb5O15 compounds and the dominant ferromagnetic interactions for RE=Ce,Dy and antiferromagnetic interactions for other members. Among them, K2GdNb5O15 with spin only magnetic moment S=7/2, exhibits a long-range magnetic order with TN~0.31 K and strong spin fluctuations at low temperatures due to its low-dimension characteristics. Moreover, a large magnetocaloric effect under low field change of 0-2 T is realized at temperatures below 1 K for K2GdNb5O15, making it an ideal candidate for adiabatic magnetic refrigeration applications at sub-kelvin temperatures. The K2RENb5O15 compounds form a rare family of insulating RE-based magnets to explore the novel 1D spin chain physics beyond the 3d TM-based counterparts, in terms of its combination of low dimension, strong spin-orbital coupling and the rich diversity of RE ions.

Submitted 19 January, 2024; originally announced January 2024.

Comments: 27 pages, 11 figures

arXiv:2401.00646 [pdf, ps, other]

doi 10.1103/PhysRevB.108.214108

High magnetic field phase diagram and weak FM breaking in (Ni0.93Co0.07)3V2O8

Authors: Jiating Wu, Minjie Zhang, Ke Shi, Huxin Yin, Yuyan Han, Lansheng Ling, Wei Tong, Chuanying Xi, Li Pi, Zhaosheng Wang

Abstract: We present magnetostriction and thermal expansion measurements on multiferroic (Ni0.93Co0.07)3V2O8. The high field phase diagrams up to 33 T along the a, b and c directions are built. For H//a, as the magnetic field increases, two intermediate phases appear between the incommensurate phase and the paramagnetic phase at about 7 K, and then a magnetically induced phase appears above the paramagnetic phase. For H//b, thermal expansion measurement indicates a mutation in the spin lattice coupling of the high field phases. The interlaced phase boundary suggests a mixed state in the optical high field phase. For H//c, an intermediate phase between the commensurate phase and the incommensurate phase is detected. A nonlinear boundary between the intermediate phase and the low temperature incommensurate phase, and a clear boundary between the commensurate phase and the paramagnetic phase are found. These results indicate that doping Co2+ breaks the weak ferromagnetic moment of the commensurate phase, which exists in the parent compound Ni3V2O8 and (Ni0.9Co0.1)3V2O8. This nonlinear influence reflects complicated spin modulation in Ni3V2O8 by doping Co2+.

Submitted 31 December, 2023; originally announced January 2024.

Comments: 7 pages, 4 figures

Journal ref: Phys. Rev. B 108, 214108 (2023)

arXiv:2312.12381 [pdf, other]

Blockchain-Based Identity Authentication Oriented to Multi-Cluster UAV Networking

Authors: Zesong Dong, Wei Tong, Zhiwei Zhang, Jian Li, Weidong Yang, Yulong Shen

Abstract: Unmanned Aerial Vehicle (UAV) networking is increasingly used in field environments such as power inspection, agricultural plant protection, and emergency rescue. To guarantee UAV networking security, UAV identity authentication attracts wide attention, especially in field environments without reliable infrastructure. Some blockchain-based UAV identity authentication solutions have been proposed to establish decentralized and trusted authentication systems without relying on infrastructure. However, these solutions do not support reconnection of a disconnected UAV, or even disband a cluster directly after its head UAV disconnects, which compromises cluster robustness and task result integrity. In this paper, we propose a blockchain-based identity authentication solution oriented to multi-cluster UAV networking with a UAV disconnection mechanism and a task result backup mechanism. Specifically, we build a blockchain maintained by the head UAVs of all clusters, managing identity information to guarantee the security of decentralized identity management. The UAV disconnection mechanism permits a verified distributed UAV reconnection to ensure the robustness of the UAV cluster, and on this basis, the task result backup mechanism ensures the integrity of the task results stored in a cluster even after any UAV disconnects. Finally, extensive experimental results demonstrate the superiority of our solution in terms of robustness, integrity, delay, and energy consumption.

Submitted 14 November, 2023; originally announced December 2023.

arXiv:2312.09245 [pdf, other]

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Authors: Wenhai Wang, Jiangwei Xie, ChuanYang Hu, Haoming Zou, Jianan Fan, Wenwen Tong, Yang Wen, Silei Wu, Hanming Deng, Zhiqi Li, Hao Tian, Lewei Lu, Xizhou Zhu, Xiaogang Wang, Yu Qiao, Jifeng Dai

Abstract: Large language models (LLMs) have opened up new possibilities for intelligent agents, endowing them with human-like thinking and cognitive abilities. In this work, we delve into the potential of LLMs in autonomous driving (AD). We introduce DriveMLM, an LLM-based AD framework that can perform closed-loop autonomous driving in realistic simulators. To this end, (1) we bridge the gap between the language decisions and the vehicle control commands by standardizing the decision states according to the off-the-shelf motion planning module. (2) We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system, which takes driving rules, user commands, and inputs from various sensors (e.g., camera, lidar) as input, makes driving decisions, and provides explanations. This model can plug-and-play in existing AD systems such as Apollo for closed-loop driving. (3) We design an effective data engine to collect a dataset that includes decision state and corresponding explanation annotation for model training and evaluation. We conduct extensive experiments and show that our model achieves a driving score of 76.1 on the CARLA Town05 Long benchmark and surpasses the Apollo baseline by 4.7 points under the same settings, demonstrating the effectiveness of our model. We hope this work can serve as a baseline for autonomous driving with LLMs. Code and models shall be released at https://github.com/OpenGVLab/DriveMLM.

Submitted 25 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: Technical Report

arXiv:2312.06200 [pdf, ps, other]

Achieving the Fundamental Limit of Lossless Analog Compression via Polarization

Authors: Shuai Yuan, Liuquan Yao, Yuan Li, Huazi Zhang, Jun Wang, Wen Tong, Zhiming Ma

Abstract: In this paper, we study lossless analog compression for i.i.d. nonsingular signals via the polarization-based framework. We prove that for a nonsingular source, the error probability of maximum a posteriori (MAP) estimation polarizes under the Hadamard transform, which extends the polarization phenomenon to the analog domain. Building on this insight, we propose partial Hadamard compression and develop the corresponding analog successive cancellation (SC) decoder. The proposed scheme consists of deterministic measurement matrices and a non-iterative reconstruction algorithm, providing benefits in both space and computational complexity. Using the polarization of error probability, we prove that our approach achieves the information-theoretical limit for lossless analog compression developed by Wu and Verdu.

Submitted 19 January, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: 48 pages, 5 figures. This work was presented in part at the 2023 IEEE Global Communications Conference

arXiv:2311.13106 [pdf, other]

Ten issues of NetGPT

Authors: Wen Tong, Chenghui Peng, Tingting Yang, Fei Wang, Juan Deng, Rongpeng Li, Lu Yang, Honggang Zhang, Dong Wang, Ming Ai, Li Yang, Guangyi Liu, Yang Yang, Yao Xiao, Liexiang Yue, Wanfei Sun, Zexu Li, Wenwen Sun

Abstract: With the rapid development and application of foundation models (FMs), it is foreseeable that FMs will play an important role in future wireless communications. As current Artificial Intelligence (AI) algorithms applied in wireless networks are dedicated models that aim for different neural network architectures and objectives, drawbacks in aspects of generality, performance gain, management, collaboration, etc. need to be overcome. In this paper, we define NetGPT (Network Generative Pre-trained Transformer), the foundation models for wireless communications, and summarize ten issues regarding the design and application of NetGPT.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.08937 [pdf]

Ba6RE2Ti4O17 (RE= Nd, Sm,Gd, Dy-Yb): A family of Rare-earth based layered triangular lattice magnets

Authors: Fangyuan Song, Andi Liu, Qiao Chen, Jin Zhou, Jingxin Li, Wei Tong, Shun Wang, Yanhong Wang, Hongcheng Lu, Songliu Yuan, Hanjie Guo, Zhaoming Tian

Abstract: Rare-earth-based triangular-lattice magnets provide fertile ground for exploring exotic quantum magnetic states. Herein, we report a new family of RE-based triangular-lattice magnets Ba6RE2Ti4O17 (RE = rare earth ions) that crystallize in the hexagonal structure with space group P63/mmc, where magnetic rare earth ions form an ideal triangular lattice within the ab-plane and stack in an AA-type fashion along the c-axis. The low-temperature magnetic susceptibility results reveal that all compounds in the series have dominant antiferromagnetic interactions and an absence of magnetic ordering down to 1.8 K. The magnetization and electron spin resonance results indicate distinct magnetic anisotropy for the compounds with different RE ions. Moreover, a Ba6Nd2Ti4O17 single crystal is successfully grown and exhibits strong Ising-like anisotropy with the magnetic easy axis perpendicular to the triangular-lattice plane, making it a candidate for exploring the quantum spin liquid state with dominant Ising-type interaction.

Submitted 8 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: 20 pages, 8 figures

arXiv:2311.04320 [pdf, other]

Proprioceptive Invariant Robot State Estimation

Authors: Tzu-Yuan Lin, Tingjun Li, Wenzhe Tong, Maani Ghaffari

Abstract: This paper reports on developing a real-time invariant proprioceptive robot state estimation framework called DRIFT. A didactic introduction to invariant Kalman filtering is provided to make this cutting-edge symmetry-preserving approach accessible to a broader range of robotics applications. Furthermore, this work dives into the development of a proprioceptive state estimation framework for dead reckoning that only consumes data from an onboard inertial measurement unit and kinematics of the robot, with two optional modules, a contact estimator and a gyro filter for low-cost robots, enabling a significant capability on a variety of robotics platforms to track the robot's state over long trajectories in the absence of perceptual data. Extensive real-world experiments using a legged robot, an indoor wheeled robot, a field robot, and a full-size vehicle, as well as simulation results with a marine robot, are provided to understand the limits of DRIFT.

Submitted 20 February, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

arXiv:2310.04826 [pdf, other]

doi 10.1145/3313831.3376436

Augmenting Static Visualizations with PapARVis Designer

Authors: Chen Zhu-Tian, Wai Tong, Qianwen Wang, Benjamin Bach, Huamin Qu

Abstract: This paper presents an authoring environment for augmenting static visualizations with virtual content in augmented reality. Augmenting static visualizations can leverage the best of both physical and digital worlds, but its creation currently involves different tools and devices, without any means to explicitly design and debug both static and virtual content simultaneously. To address these issues, we design an environment that seamlessly integrates all steps of a design and deployment workflow through its main features: i) an extension to Vega, ii) a preview, and iii) debug hints that facilitate valid combinations of static and augmented content. We inform our design through a design space with four ways to augment static visualizations. We demonstrate the expressiveness of our tool through examples, including books, posters, projections, and wall-sized visualizations. A user study shows high user satisfaction with our environment and confirms that participants can create augmented visualizations in an average of 4.63 minutes.

Submitted 10 May, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

arXiv:2309.02459 [pdf, other]

doi 10.21437/Interspeech.2023-1378

Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation

Authors: Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen Meng

Abstract: Mapping two modalities, speech and text, into a shared representation space is a research direction that uses text-only data to improve end-to-end automatic speech recognition (ASR) performance in new domains. However, the lengths of the speech representation and the text representation are inconsistent. Although the previous method up-samples the text representation to align with the acoustic modality, it may not match the expected actual duration. In this paper, we propose a novel representation matching strategy that down-samples the acoustic representation to align with the text modality. By introducing a continuous integrate-and-fire (CIF) module that generates acoustic representations consistent with token length, our ASR model can better learn unified representations from both modalities, allowing for domain adaptation using text-only data of the target domain. Experimental results on new domain data demonstrate the effectiveness of the proposed method.

Submitted 7 October, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

Comments: Proceedings of Interspeech. arXiv admin note: text overlap with arXiv:2309.01437

arXiv:2308.13789 [pdf]

Sensiverse: A dataset for ISAC study

Authors: Jiajin Luo, Baojian Zhou, Yang Yu, Ping Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong

Abstract: In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research. In this paper, we present the method of generating Sensiverse, including the acquisition and formatting of the 3D scene models, the generation of the channel data and associations with Tx/Rx deployment. The file structure and usage of the dataset are also described, and finally the use of the dataset is illustrated with examples through the evaluation of use cases such as 3D environment reconstruction and moving targets.

Submitted 26 August, 2023; originally announced August 2023.

arXiv:2308.10492 [pdf, ps, other]

doi 10.1063/5.0166209

Huge magnetostriction in superconducting single-crystalline BaFe$_{1.908}$Ni$_{0.092}$As$_{2}$

Authors: Minjie Zhang, Jiating Wu, Ke Shi, Langsheng Ling, Wei Tong, Chuanying Xi, Li Pi, J. Wosnitza, Huiqian Luo, Zhaosheng Wang

Abstract: The performance of iron-based superconductors in high magnetic fields plays an important role for their practical application. In this work, we measured the magnetostriction and magnetization of BaFe$_{1.908}$Ni$_{0.092}$As$_{2}$ single crystals using pulsed magnetic fields up to 60 T and static magnetic fields up to 33 T, respectively. A huge longitudinal magnetostriction (of the order of $10^{-4}$) was observed in the direction of the twin boundaries. The magnetization measurements evidence a high critical-current density due to strong bulk pinning. By using magnetization data with an exponential flux-pinning model, we can reproduce the magnetostriction curves qualitatively. This result shows that the magnetostriction of BaFe$_{1.908}$Ni$_{0.092}$As$_{2}$ can be well explained by a flux-pinning-induced mechanism.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 4 pages, 3 figures

Journal ref: Appl. Phys. Lett. 123, 072602 (2023)

arXiv:2307.06515 [pdf]

Nanotube ferroelectric tunnel junctions with giant tunneling electroresistance ratio

Authors: Jiu-Long Wang, Yi-Feng Zhao, Wen Xu, Jun-Ding Zheng, Ya-Ping Shao, Wen-Yi Tong, Chun-Gang Duan

Abstract: Low-dimensional ferroelectric tunnel junctions are appealing for the realization of nanoscale nonvolatile memory devices due to their inherent advantage of device miniaturization. Those based on current mechanisms still have restrictions including low tunneling electroresistance (TER) effects and complex heterostructures. Here, we introduce an entirely new TER mechanism to construct the nanotube ferroelectric tunnel junction with ferroelectric nanotubes as the tunneling region. When a ferroelectric monolayer is rolled into a nanotube, the coexistence of its intrinsic ferroelectric polarization with the flexoelectric polarization induced by bending gives rise to a metal-insulator transition that depends on the radiative polarization states. Because the out-of-plane polarization of the pristine monolayer is tunable by an in-plane electric field, the conducting states of the ferroelectric nanotube can be tuned between metallic and insulating by axial electric means. Using α-In2Se3 as an example, our first-principles density functional theory calculations and nonequilibrium Green's function formalism confirm the feasibility of the TER mechanism and indicate an ultrahigh TER ratio exceeding 9.9*10^10% for the proposed nanotube ferroelectric tunnel junctions. Our findings provide a promising approach based on simple homogeneous structures for high density ferroelectric microelectronic devices with excellent ON/OFF performance.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 15 pages, 5 figures

arXiv:2306.14082 [pdf, ps, other]

doi 10.1103/PhysRevB.107.245134

High-field NMR study of the spin correlations in the spin-cluster mineral Na$_2$Cu$_3$O(SO$_4$)$_3$

Authors: Long Ma, J. X. Li, L. S. Ling, Y. Y. Han, L. Zhang, L. Hu, W. Tong, C. Y. Xi, Li Pi

Abstract: We report an NMR study of the spin correlations in the spin-cluster based mineral Na$_2$Cu$_3$O(SO$_4$)$_3$ with magnetic fields ranging from 1 T to 33 T. The long-range magnetic order is observed from both the sudden spectral broadening at $T_N$ and the critical slowing down behavior in the temperature dependence of spin-lattice relaxation rates ($1/T_1(T)$). The hump behavior of $1/T_1(T)$ persists to $μ_0H=7.25$ T, above which a spin excitation gap is observed from the thermally activated temperature dependence of $1/T_1$. The gap size shows a linear field dependence, whose slope and intercept respectively yield an effective magnetic moment of 2.54 $μ_B$ and a 0.94 meV spin excitation gap under zero magnetic field. These results indicate the existence of short-range order and prominent easy-plane spin anisotropy, which are important for understanding the spin excitation spectrum in A$_2$Cu$_3$O(SO$_4$)$_3$.

Submitted 24 June, 2023; originally announced June 2023.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. B 107, 245134 (2023)

arXiv:2306.09695 [pdf, other]

Bose-Einstein condensation of a two-magnon bound state in a spin-one triangular lattice

Authors: Jieming Sheng, Jia-Wei Mei, Le Wang, Wenrui Jiang, Lei Xu, Han Ge, Nan Zhao, Tiantian Li, Andrea Candini, Bin Xi, Jize Zhao, Ying Fu, Jiong Yang, Yuanzhu Zhang, Giorgio Biasiol, Shanmin Wang, Jinlong Zhu, Ping Miao, Xin Tong, Dapeng Yu, Richard Mole, Long Ma, Zhitao Zhang, Zhongwen Ouyang, Wei Tong, et al. (6 additional authors not shown)

Abstract: Interactions of collective excitations often lead to rich emergent phenomena in many-particle quantum systems. In ordered magnets, the elementary excitations are spin waves (magnons), which obey Bose-Einstein statistics. Similar to the Cooper pairs in superconductors, magnons can be paired into bound states under attractive interactions. Even more interestingly, the Zeeman coupling to a magnetic field acts as a chemical potential that can tune the particle density through a quantum critical point (QCP), beyond which a "hidden order" is predicted to exist. However, experimental confirmation of this QCP and the associated new state of matter remains elusive. Here we report direct observation of the Bose-Einstein condensation (BEC) of the two-magnon bound state in Na$_2$BaNi(PO$_4$)$_2$. Comprehensive thermodynamic measurements confirmed the existence of a two-dimensional BEC-QCP at the saturation field. Inelastic neutron scattering experiments were performed to accurately establish the magnetic exchange model. An exact solution of the model found stable two-magnon bound states that were further confirmed by an electron spin resonance (ESR) experiment, demonstrating that the QCP is due to the pair condensation and the phase below the saturation field is the long-sought-after spin nematic (SN) phase.

Submitted 16 June, 2023; originally announced June 2023.

Comments: 53 pages, 28 figures

arXiv:2306.02851 [pdf, other]

Scene as Occupancy

Authors: Chonghao Sima, Wenwen Tong, Tai Wang, Li Chen, Silei Wu, Hanming Deng, Yi Gu, Lewei Lu, Ping Luo, Dahua Lin, Hongyang Li

Abstract: A human driver can easily describe a complex traffic scene through the visual system. Such an ability of precise perception is essential for the driver's planning. To achieve this, a geometry-aware representation that quantizes the physical 3D scene into a structured grid map with semantic labels per cell, termed 3D Occupancy, would be desirable. Compared to the bounding box form, a key insight behind occupancy is that it can capture the fine-grained details of critical obstacles in the scene and thereby facilitate subsequent tasks. Prior or concurrent literature mainly concentrates on a single scene completion task, whereas we argue that this occupancy representation might possess broader impact. In this paper, we propose OccNet, a multi-view vision-centric pipeline with a cascade and temporal voxel decoder to reconstruct 3D occupancy. At the core of OccNet is a general occupancy embedding to represent the 3D physical world. Such a descriptor could be applied to a wide span of driving tasks, including detection, segmentation and planning. To validate the effectiveness of this new representation and our proposed algorithm, we propose OpenOcc, the first dense high-quality 3D occupancy benchmark built on top of nuScenes. Empirical experiments show an evident performance gain across multiple tasks, e.g., motion planning could witness a collision rate reduction of 15%-58%, demonstrating the superiority of our method.

Submitted 26 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: Project link: https://github.com/OpenDriveLab/OccNet

arXiv:2305.12214 [pdf]

Ba9RE2(SiO4)6 (RE=Ho-Yb): A New Family of Rare-earth based Honeycomb Lattice Magnets

Authors: Andi Liu, Fangyuan Song, Zhaohu Li, Malik Ashtar, Yuqi Qin, Dingjun Liu, Zhengcai Xia, Jingxin Li, Zhitao Zhang, Wei Tong, Hanjie Guo, Zhaoming Tian

Abstract: Rare-earth (RE) based honeycomb-lattice materials with strong spin-orbit coupled Jeff=1/2 moments have attracted great interest as a platform to realize Kitaev quantum spin liquid (QSL) state. Herein, we report the discovery of a new family of RE based honeycomb-lattice magnets Ba9RE2(SiO4)6(RE=Ho-Yb), which crystallize into the rhombohedral structure with space group R-3. In these serial compounds, magnetic RE3+ ions are arranged on a perfect honeycomb lattice within the ab-plane and stacked in the ABCABC-type fashion along the c-axis. All Ba9RE2(SiO4)6(RE=Ho-Yb) polycrystals exhibit the dominant antiferromagnetic interactions and absence of magnetic order down to 2 K. In combination with the magnetization and electron spin resonance (ESR) results, distinct anisotropic magnetic behaviors are proposed for compounds with different RE ions. Moreover, the synthesized Ba9Yb2Si6O24 single crystals show large magnetic frustration and no long-range magnetic ordering down to 0.15 K, being a possible QSL candidate state. These serial compounds are attractive for exploring the exotic magnetic phases of Kitaev materials with 4f electrons.

Submitted 20 May, 2023; originally announced May 2023.

Comments: 19 pages, 8 figures

arXiv:2305.05861 [pdf]

Template-based eukaryotic genome editing directed by SviCas3

Authors: Wang-Yu Tong, Yong Li, Shou-Dong Ye, An-Jing Wang, Yan-Yan Tang, Mei-Li Li, Zhong-Fan Yu, Ting-Ting Xia, Qing-Yang Liu, Si-Qi Zhu

Abstract: RNA-guided gene editing based on the CRISPR-Cas system is currently the most effective genome editing technique. Here, we report that the SviCas3 from the subtype I-B-Svi Cas system in Streptomyces virginiae IBL14 is an RNA-guided and DNA-guided DNA endonuclease suitable for the HDR-directed gene and/or base editing of eukaryotic cell genomes. The genome editing efficiency of SviCas3 guided by DNA is no less than that of SviCas3 guided by RNA. In particular, t-DNA, as a template and a guide, does not require a proto-spacer-adjacent motif, demonstrating that CRISPR, as the basis for crRNA design, is not required for the SviCas3-mediated gene and base editing. This discovery will broaden our understanding of enzyme diversity in CRISPR-Cas systems, will provide important tools for the creation and modification of living things and the treatment of human genetic diseases, and will usher in a new era of DNA-guided gene editing and base editing.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 113 pages, 12 figures and 4 tables

arXiv:2305.05093 [pdf]

Prokaryotic genome editing based on the subtype I-B-Svi CRISPR-Cas system

Authors: Wang-Yu Tong, De-Xiang Yong, Xin Xu, Cai-Hua Qiu, Yan Zhang, Xing-Wang Yang, Ting-Ting Xia, Qing-Yang Liu, Su-Li Cao, Yan Sun, Xue Li

Abstract: Type I CRISPR-Cas systems are the most common among the six types of CRISPR-Cas systems; however, non-self-targeting genome editing based on a single Cas3 of type I CRISPR-Cas systems has not been reported. Here, we present the subtype I-B-Svi CRISPR-Cas system (with three confirmed CRISPRs and a cas gene cluster) and genome editing based on this system found in Streptomyces virginiae IBL14. Importantly, like the animal-derived bacterial protein SpCas9 (1368 amino-acids), the single, compact, non-animal-derived bacterial protein SviCas3 (771 amino-acids) can also direct template-based microbial genome editing through the target cell's own homology-directed repair system, which breaks the view that the genome editing based on type I CRISPR-Cas systems requires a full Cascade. Notably, no off-target changes or indel-formation were detected in the analysis of potential off-target sites. This discovery broadens our understanding of the diversity of type I CRISPR-Cas systems and will facilitate new developments in genome editing tools.

Submitted 8 May, 2023; originally announced May 2023.

Comments: 113 pages, 10 figures, and 6 tables

arXiv:2304.01485 [pdf]

Quasi-invariance of scattering properties of multicellular cyanobacterial aggregates

Authors: Chunyang Ma, Qian Lu, Yen Wah Tong

Abstract: The radiative/scattering properties of cyanobacterial aggregates are crucial for understanding microalgal cultivation. This study analyzed scattering matrix elements and cross-sections of cyanobacterial aggregates using the discrete dipole approximation (DDA) method. The stochastic random walk approach was adopted to generate a force-biased packing model for multicellular filamentous cyanobacterial aggregates. The effects of the shape and size of multicellular cyanobacterial aggregates on their scattering properties were investigated in this work. The possibility of invariance in the scattering properties was explored for cyanobacterial aggregates. The invariance interpretation intuitively represented the radiative property characteristics of the aggregates. The presented results show that the ratios of the matrix elements of cyanobacterial aggregates are nearly shape, size, and wavelength invariant. The extinction and absorption cross-sections (EACSs) per unit volume of cyanobacterial aggregates were shape invariant and approximately size invariant, respectively. The absorption cross-section of aggregates is not merely a volumetric phenomenon for aggregates that exceed a certain size. Furthermore, the absorption cross-sections per unit volume are independent of the volumetric distribution of the microalgae cells. The invariance interpretation presents crucial characteristics of the scattering properties of cyanobacterial aggregates. The existence of invariance greatly improves our understanding of the scattering properties of microalgal aggregates. The scattering properties of microalgal aggregates are the most critical aspects of light propagation in the design, optimization, and operation of photobioreactors.

Submitted 3 April, 2023; originally announced April 2023.

Comments: 30 pages, 11 figures

arXiv:2303.10340 [pdf, other]

3D Data Augmentation for Driving Scenes on Camera

Authors: Wenwen Tong, Jiangwei Xie, Tianyu Li, Hanming Deng, Xiangwei Geng, Ruoyi Zhou, Dingchen Yang, Bo Dai, Lewei Lu, Hongyang Li

Abstract: Driving scenes are so diverse and complicated that it is impossible to collect all cases with human effort alone. While data augmentation is an effective technique to enrich the training data, existing methods for camera data in autonomous driving applications are confined to the 2D image plane, which may not optimally increase data diversity in 3D real-world scenarios. To this end, we propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space. We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects. Then, augmented driving scenes can be obtained by placing the 3D objects with adapted location and orientation at the pre-defined valid region of backgrounds. As such, the training database could be effectively scaled up. However, the 3D object modeling is constrained by the image quality and the limited viewpoints. To overcome these problems, we modify the original NeRF by introducing a geometric rectified loss and a symmetric-aware training strategy. We evaluate our method for the camera-only monocular 3D detection task on the Waymo and nuScenes datasets. The proposed data augmentation approach contributes to a gain of 1.7% and 1.4% in terms of detection accuracy, on Waymo and nuScenes respectively. Furthermore, the constructed 3D models serve as digital driving assets and could be recycled for different detectors or other 3D perception tasks.

Submitted 18 March, 2023; originally announced March 2023.

arXiv:2302.14536 [pdf, other]

On the Road to 6G: Visions, Requirements, Key Technologies and Testbeds

Authors: Cheng-Xiang Wang, Xiaohu You, Xiqi Gao, Xiuming Zhu, Zixin Li, Chuan Zhang, Haiming Wang, Yongming Huang, Yunfei Chen, Harald Haas, John S. Thompson, Erik G. Larsson, Marco Di Renzo, Wen Tong, Peiying Zhu, Xuemin, Shen, H. Vincent Poor, Lajos Hanzo

Abstract: Fifth generation (5G) mobile communication systems have entered the stage of commercial development, providing users with new services and improved user experiences as well as offering a host of novel opportunities to various industries. However, 5G still faces many challenges. To address these challenges, international industrial, academic, and standards organizations have commenced research on sixth generation (6G) wireless communication systems. A series of white papers and survey papers have been published, which aim to define 6G in terms of requirements, application scenarios, key technologies, etc. Although ITU-R has been working on the 6G vision and it is expected to reach a consensus on what 6G will be by mid-2023, the related global discussions are still wide open and the existing literature has identified numerous open issues. This paper first provides a comprehensive portrayal of the 6G vision, technical requirements, and application scenarios, covering the current common understanding of 6G. Then, a critical appraisal of the 6G network architecture and key technologies is presented. Furthermore, existing testbeds and advanced 6G verification platforms are detailed for the first time. In addition, future research directions and open challenges are identified for stimulating the on-going global debate. Finally, lessons learned to date concerning 6G networks are discussed.

Submitted 28 February, 2023; originally announced February 2023.

arXiv:2302.13549 [pdf]

Random-Order Enumeration for Self-Reducible NP-Problems

Authors: Pengyu Chen, Dongjing Miao, Weitian Tong, Zizheng Guo, Jianzhong Li, Zhipeng Cai

Abstract: In plenty of data analysis tasks, a basic and time-consuming process is to produce a large number of solutions and feed them into downstream processing. Various enumeration algorithms have been developed for this purpose. An enumeration algorithm produces all solutions of a problem instance without repetition. To be a statistically meaningful representation of the solution space, solutions are required to be enumerated in uniformly random order. This paper studies a set of self-reducible NP-problems in three hierarchies, where the problems are polynomially countable ($Sr_{NP}^{FP}$), admit FPTAS ($Sr_{NP}^{FPTAS}$), and admit FPRAS ($Sr_{NP}^{FPRAS}$), respectively. The trivial algorithm based on an (almost) uniform generator is in fact inefficient. We provide a new insight that the (almost) uniform generator is not the end of the story. More efficient algorithmic frameworks are proposed to enumerate solutions in uniformly random order for problems in these three hierarchies. (1) For problems in $Sr_{NP}^{FP}$, we show a random-order enumeration algorithm with polynomial delay (PDREnum); (2) For problems in $Sr_{NP}^{FPTAS}$, we show a Las Vegas random-order enumeration algorithm with expected polynomial delay (PDLVREnum); (3) For problems in $Sr_{NP}^{FPRAS}$, we devise a fully polynomial delay Atlantic City random-order enumeration algorithm with expected delay polynomial in the input size and the given error probability $δ$ (FPACREnum), which has a probability of at least $1-δ$ of becoming a Las Vegas random-order enumeration algorithm.
Finally, to further improve the efficiency of the random-order enumeration algorithms, based on the master/slave paradigm, we present a parallelization with 1.5-optimal enumeration delay and running time, along with the theoretical analysis.

Submitted 27 February, 2023; originally announced February 2023.

arXiv:2302.09591 [pdf]

Insight into China's Economically Motivated Adulteration Risk in Online Raw Agricultural Product Sales

Authors: Hengyu Liu, Wen Tong

Abstract: Uncertainty in quality and the inspectors' imperfect testing capability leave raw agricultural products (e.g., fresh produce, seafood, livestock and poultry products) wide open to economically motivated adulteration (EMA), and the strong demand for online shopping of these products in China makes this situation even worse. In this paper, we develop a game-theoretic framework to investigate online raw agricultural product sellers' preemptive EMA behavior on an ecommerce platform (EP). Particularly, the sellers differ from each other in the original quality of their products. We characterize the sellers' equilibrium pricing and adulteration decisions and the EP's optimal take rate decision, and analyze how the sampling inspections and adulteration penalty jointly impact these decisions. Moreover, we investigate three managerial levers, such as claiming a higher-than-law-requires penalty, that the administrative departments or the EP can use to deter EMA. Finally, we use real-world data from Taobao.com to calibrate our model and derive more managerial insights from the analytical findings. We find that the heterogeneous sellers' adulteration decisions are symmetric and their ex-post pricing decisions lead them to evenly share the market on the EP. Interestingly, we show that the EP's higher take rate will inhibit the sellers' adulteration behavior. However, the profit-maximizing EP may indulge the sellers' adulteration behavior by intentionally decreasing this rate.
Our results highlight a penalty-inspection-centered approach as essential to combat EMA, and the three levers can play a role as supplements to this approach under certain conditions.

Submitted 19 February, 2023; originally announced February 2023.

arXiv:2302.08743

Multi-View Clustering from the Perspective of Mutual Information

Authors: Fu Lele, Zhang Lei, Wang Tong, Chen Chuan, Zhang Chuanfu, Zheng Zibin

Abstract: Exploring the complementary information of multi-view data to improve clustering effects is a crucial issue in multi-view clustering. In this paper, we propose a novel model based on information theory termed Informative Multi-View Clustering (IMVC), which extracts the common and view-specific information hidden in multi-view data and constructs a clustering-oriented comprehensive representation. More specifically, we concatenate multiple features into a unified feature representation, then pass it through an encoder to retrieve the common representation across views. Simultaneously, the features of each view are sent to a separate encoder to produce a compact view-specific representation. We then constrain the mutual information between the common representation and view-specific representations to be minimal for obtaining multi-level information. Further, the common representation and view-specific representation are spliced to model the refined representation of each view, which is fed into a decoder to reconstruct the initial data while maximizing their mutual information. In order to form a comprehensive representation, the common representation and all view-specific representations are concatenated. Furthermore, to better adapt the comprehensive representation to the clustering task, we maximize the mutual information between an instance and its k-nearest neighbors to enhance the intra-cluster aggregation, thus inducing good overall separation of different clusters.
Finally, we conduct extensive experiments on six benchmark datasets, and the experimental results indicate that the proposed IMVC outperforms other methods.

Submitted 29 May, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: We think the paper writing isn't good enough, so we would like to withdraw the paper and renew the writing manner

arXiv:2302.01966 [pdf, other]

Towards an Understanding of Distributed Asymmetric Collaborative Visualization on Problem-solving

Authors: Wai Tong, Meng Xia, Kam Kwai Wong, Doug A. Bowman, Ting-Chuen Pong, Huamin Qu, Yalong Yang

Abstract: This paper provided empirical knowledge of the user experience for using collaborative visualization in a distributed asymmetrical setting through controlled user studies. With the ability to access various computing devices, such as Virtual Reality (VR) head-mounted displays, scenarios emerge when collaborators have to or prefer to use different computing environments in different places. However, we still lack an understanding of using VR in an asymmetric setting for collaborative visualization. To get an initial understanding and better inform the designs for asymmetric systems, we first conducted a formative study with 12 pairs of participants. All participants collaborated in asymmetric (PC-VR) and symmetric settings (PC-PC and VR-VR). We then improved our asymmetric design based on the key findings and observations from the first study. Another ten pairs of participants collaborated with enhanced PC-VR and PC-PC conditions in a follow-up study. We found that a well-designed asymmetric collaboration system could be as effective as a symmetric system. Surprisingly, participants using PC perceived less mental demand and effort in the asymmetric setting (PC-VR) compared to the symmetric setting (PC-PC). We provided fine-grained discussions about the trade-offs between different collaboration settings.

Submitted 3 February, 2023; originally announced February 2023.

Comments: 11 pages, 12 figures, accepted at IEEE VR 2023

arXiv:2301.00816 [pdf]

Thermo-optic phase shifter based on hydrogen-doped indium oxide microheater

Authors: Weiyu Tong, Erqi Yang, Yu Pang, Haobo Yang, Xin Qian, Ronggui Yang, Bin Hu, Jianji Dong, Xinliang Zhang

Abstract: Thermo-optic (TO) phase shifters are fundamental units in large-scale active silicon photonic integrated circuits (PICs). However, because microheater materials impose a trade-off between heating efficiency and absorption loss, designs reported so far typically suffer from slow response time, high power consumption, low yields, and so on. Here, we demonstrate an energy-efficient, fast-response, and low-loss TO phase shifter by introducing hydrogen-doped indium oxide (IHO) films as the microheater. The optimized electron concentration with enhanced mobility endows the IHO with high conductivity as well as high near-infrared (NIR) transparency, which allow it to directly contact the silicon waveguide without any insulating layer for efficient tuning and fast response. The TO phase shifter achieves a sub-microsecond response time (970 ns/980 ns) with a π phase shift power consumption of 9.6 mW. The insertion loss introduced by the IHO microheater is ~ 0.5 dB. The proposed IHO-based microheaters with compatible processing technology illustrate the great potential of such material in the application of large-scale silicon PICs.

Submitted 2 January, 2023; originally announced January 2023.

Comments: 10 pages, 4 figures, journal

arXiv:2211.06769 [pdf, other]

Realistic Bokeh Effect Rendering on Mobile GPUs, Mobile AI & AIM 2022 challenge: Report

Authors: Andrey Ignatov, Radu Timofte, Jin Zhang, Feng Zhang, Gaocheng Yu, Zhe Ma, Hongbin Wang, Minsu Kwon, Haotian Qian, Wentao Tong, Pan Mu, Ziping Wang, Guangjing Yan, Brian Lee, Lei Fei, Huaijin Chen, Hyebin Cho, Byeongjun Kwon, Munchurl Kim, Mingyang Qian, Huixin Ma, Yanan Li, Xiaotao Wang, Lei Lei

Abstract: As mobile cameras with compact optics are unable to produce a strong bokeh effect, lots of interest is now devoted to deep learning-based solutions for this task. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based bokeh effect rendering approach that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale EBB! bokeh dataset consisting of 5K shallow / wide depth-of-field image pairs captured using the Canon 7D DSLR camera. The runtime of the resulting models was evaluated on the Kirin 9000's Mali GPU that provides excellent acceleration results for the majority of common deep learning ops. A detailed description of all models developed in this challenge is provided in this paper.

Submitted 7 November, 2022; originally announced November 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2211.03885; text overlap with arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.05256, arXiv:2211.05910

arXiv:2211.05440 [pdf, other]

Reliable Extraction of Semantic Information and Rate of Innovation Estimation for Graph Signals

Authors: Mert Kalfa, Sadik Yagiz Yetim, Arda Atalik, Mehmetcan Gok, Yiqun Ge, Rong Li, Wen Tong, Tolga Mete Duman, Orhan Arikan

Abstract: Semantic signal processing and communications are poised to play a central part in developing the next generation of sensor devices and networks. A crucial component of a semantic system is the extraction of semantic signals from the raw input signals, which has become increasingly tractable with the recent advances in machine learning (ML) and artificial intelligence (AI) techniques. The accurate extraction of semantic signals using the aforementioned ML and AI methods, and the detection of semantic innovation for scheduling transmission and/or storage events are critical tasks for reliable semantic signal processing and communications. In this work, we propose a reliable semantic information extraction framework based on our previous work on semantic signal representations in a hierarchical graph-based structure. The proposed framework includes a time integration method to increase fidelity of ML outputs in a class-aware manner, a graph-edit-distance based metric to detect innovation events at the graph-level and filter out sporadic errors, and a Hidden Markov Model (HMM) to produce smooth and reliable graph signals. The proposed methods within the framework are demonstrated individually and collectively through simulations and case studies based on real-world computer vision examples.

Submitted 10 November, 2022; originally announced November 2022.

arXiv:2210.14426 [pdf]

Liquid Metal Printed Ultrathin Oxides for Monolayer WS2 Top-Gate Transistors

Authors: Yiyu Zhang, Dasari Venkatakrishnarao, Michel Bosman, Wei Fu, Sarthak Das, Fabio Bussolotti, Rainer Lee, Siew Lang Teo, Ding Huang, Ivan Verzhbitskiy, Zhuojun Jiang, Zhuoling Jiang, Jian Wei Chai, Shi Wun Tong, Zi-En Ooi, Calvin Pei Yu Wong, Yee Sin Ang, Kuan Eng Johnson Goh, Chit Siong Lau

Abstract: Two-dimensional (2D) semiconductors are promising channel materials for continued downscaling of complementary metal-oxide-semiconductor (CMOS) logic circuits. However, their full potential continues to be limited by a lack of scalable high-k dielectrics that can achieve atomically smooth interfaces, small equivalent oxide thicknesses (EOT), excellent gate control, and low leakage currents. Here, we report liquid metal printed ultrathin and scalable Ga2O3 dielectric for 2D electronics and electro-optical devices. We directly visualize the atomically smooth Ga2O3/WS2 interfaces enabled by the conformal nature of liquid metal printing. We demonstrate atomic layer deposition compatibility with high-k Ga2O3/HfO2 top-gate dielectric stacks on chemical vapour deposition grown monolayer WS2, achieving EOTs of ~1 nm and subthreshold swings down to 84.9 mV/dec. Gate leakage currents are well within requirements for ultra-scaled low-power logic circuits. Our results show that liquid metal printed oxides can bridge a crucial gap in scalable dielectric integration of 2D materials for next-generation nano-electronics.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2209.15140 [pdf, other]

Fully Proprioceptive Slip-Velocity-Aware State Estimation for Mobile Robots via Invariant Kalman Filtering and Disturbance Observer

Authors: Xihang Yu, Sangli Teng, Theodor Chakhachiro, Wenzhe Tong, Tingjun Li, Tzu-Yuan Lin, Sarah Koehler, Manuel Ahumada, Jeffrey M. Walls, Maani Ghaffari

Abstract: This paper develops a novel slip estimator using the invariant observer design theory and Disturbance Observer (DOB). The proposed state estimator for mobile robots is fully proprioceptive and combines data from an inertial measurement unit and body velocity within a Right Invariant Extended Kalman Filter (RI-EKF). By embedding the slip velocity into the $\mathrm{SE}_3(3)$ matrix Lie group, the developed DOB-based RI-EKF provides real-time velocity and slip velocity estimates on different terrains. Experimental results using a Husky wheeled robot confirm the mathematical derivations and effectiveness of the proposed method in estimating the observable state variables. Open-source software is available for download and for reproducing the presented results.

Submitted 30 September, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

Comments: The work will be presented in IROS2023. github repository at https://github.com/UMich-CURLY/slip_detection_DOB. arXiv admin note: text overlap with arXiv:1805.10410 by other authors

arXiv:2208.10603 [pdf, other]

Exploring Interactions with Printed Data Visualizations in Augmented Reality

Authors: Wai Tong, Zhutian Chen, Meng Xia, Leo Yu-Ho Lo, Linping Yuan, Benjamin Bach, Huamin Qu

Abstract: This paper presents a design space of interaction techniques to engage with visualizations that are printed on paper and augmented through Augmented Reality. Paper sheets are widely used to deploy visualizations and provide a rich set of tangible affordances for interactions, such as touch, folding, tilting, or stacking. At the same time, augmented reality can dynamically update visualization content to provide commands such as pan, zoom, filter, or detail on demand. This paper is the first to provide a structured approach to mapping possible actions with the paper to interaction commands. This design space and the findings of a controlled user study have implications for future designs of augmented reality systems involving paper sheets and visualizations. Through workshops (N=20) and ideation, we identified 81 interactions that we classify in three dimensions: 1) commands that can be supported by an interaction, 2) the specific parameters provided by an (inter)action with paper, and 3) the number of paper sheets involved in an interaction. We tested user preference and viability of 11 of these interactions with a prototype implementation in a controlled study (N=12, HoloLens 2) and found that most of the interactions are intuitive and engaging to use. We summarized interactions (e.g., tilt to pan) that have strong affordance to complement "point" for data exploration, physical limitations and properties of paper as a medium, cases requiring redundancy and shortcuts, and other implications for design.

Submitted 22 August, 2022; originally announced August 2022.

Comments: 11 pages, 9 figures, 1 table, accepted at IEEE VIS 2022

arXiv:2207.11238 [pdf]

Improved lightweight identification of agricultural diseases based on MobileNetV3

Authors: Yuhang Jiang, Wenping Tong

Abstract: At present, models for identifying agricultural pests and diseases are not lightweight enough and are difficult to apply. Based on MobileNetV3, this paper introduces the Coordinate Attention block. The parameters of MobileNetV3-large are reduced by 22%, the model size is reduced by 19.7%, and the accuracy is improved by 0.92%. The parameters of MobileNetV3-small are reduced by 23.4%, the model size is reduced by 18.3%, and the accuracy is increased by 0.40%. In addition, the improved MobileNetV3-small was migrated to Jetson Nano for testing. The accuracy increased by 2.48% to 98.31%, and the inference speed increased by 7.5%. It provides a reference for deploying the agricultural pest identification model to embedded devices.

Submitted 19 July, 2022; originally announced July 2022.

Comments: Accepted by CAIBDA 2022

arXiv:2206.06897 [pdf, other]

On the Message Passing Efficiency of Polar and Low-Density Parity-Check Decoders

Authors: Dawei Yin, Yuan Li, Xianbin Wang, Jiajie Tong, Huazi Zhang, Jun Wang, Guanghui Wang, Jun Chen, Guiying Yan, Zhiming Ma, Wen Tong

Abstract: This study focuses on the efficiency of message-passing-based decoding algorithms for polar and low-density parity-check (LDPC) codes. Both successive cancellation (SC) and belief propagation (BP) decoding algorithms are studied in the message-passing framework. Counter-intuitively, SC decoding demonstrates the highest decoding efficiency, although it was considered a weak decoder in terms of error-correction performance. We analyze the complexity-performance tradeoff to dynamically track the decoding efficiency, where the complexity is measured by the number of messages passed (NMP), and the performance is measured by the statistical distance to the maximum a posteriori (MAP) estimate. This study offers a new insight into the contribution of each message passed in decoding, and compares various decoding algorithms on a message-by-message level. The analysis corroborates recent results on terabits-per-second polar SC decoders, and might shed light on better scheduling strategies.

Submitted 20 April, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

arXiv:2205.14407 [pdf, ps, other]

An efficient polynomial-time approximation scheme for parallel multi-stage open shops

Authors: Jianming Dong, Ruyan Jin, Guohui Lin, Bing Su, Weitian Tong, Yao Xu

Abstract: Various new scheduling problems have been arising from practical production processes and spawning new research areas in the scheduling field. We study the parallel multi-stage open shops problem, which generalizes the classic open shop scheduling and parallel machine scheduling problems. Given m identical k-stage open shops and a set of n jobs, we aim to process all jobs on these open shops with the minimum makespan, i.e., the completion time of the last job, under the constraint that job preemption is not allowed. We present an efficient polynomial-time approximation scheme (EPTAS) for the case when both m and k are constant. The main idea for our EPTAS is the combination of several categorization, scaling, and linear programming rounding techniques. Jobs and/or operations are first scaled and then categorized carefully into multiple types so that different types of jobs and/or operations are scheduled appropriately without increasing the makespan too much.

Submitted 28 May, 2022; originally announced May 2022.

arXiv:2205.06523 [pdf, ps, other]

Deterministic Identification over Channels without CSI

Authors: Yuan Li, Xianbin Wang, Huazi Zhang, Jun Wang, Wen Tong, Guiying Yan, Zhiming Ma

Abstract: Identification capacities of randomized and deterministic identification were proved to exceed channel capacity for Gaussian channels with channel side information (CSI). In this work, we extend deterministic identification to the block fading channels without CSI by applying identification codes for both channel estimation and user identification. We prove that identification capacity is asymptotically higher than transmission capacity even in the absence of CSI. We also analyze the finite-length performance theoretically and numerically. The simulation results verify the feasibility of the proposed blind deterministic identification in the finite blocklength regime.

Submitted 11 August, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

Showing 1–50 of 150 results for author: Tong, W