Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.

Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: Project website: https://robotics-transformer-x.github.io

arXiv:2310.08225 [pdf, other]

Fast Word Error Rate Estimation Using Self-Supervised Representations For Speech And Text

Authors: Chanho Park, Chengsong Lu, Mingjie Chen, Thomas Hain

Abstract: The quality of automatic speech recognition (ASR) is typically measured by word error rate (WER). WER estimation is a task aiming to predict the WER of an ASR system, given a speech utterance and a transcription. This task has gained increasing attention while advanced ASR systems are trained on large amounts of data. In this case, WER estimation becomes necessary in many scenarios, for example, selecting training data with unknown transcription quality or estimating the testing performance of an ASR system without ground truth transcriptions. Facing large amounts of data, the computation efficiency of a WER estimator becomes essential in practical applications. However, previous works usually did not consider it as a priority. In this paper, a Fast WER estimator (Fe-WER) using self-supervised learning representation (SSLR) is introduced. The estimator is built upon SSLR aggregated by average pooling. The results show that Fe-WER outperformed the e-WER3 baseline relatively by 19.69% and 7.16% on Ted-Lium3 in both evaluation metrics of root mean square error and Pearson correlation coefficient, respectively. Moreover, the estimation weighted by duration was 10.43% when the target was 10.88%. Lastly, the inference speed was about 4x in terms of a real-time factor.
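For readers unfamiliar with the metric, the following is a minimal sketch of how a reference WER and a duration-weighted average of per-utterance WERs can be computed. It is not the Fe-WER estimator from the paper, which predicts WER without a reference transcript. The function names `word_error_rate` and `duration_weighted_mean` are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def duration_weighted_mean(wers: list[float], durations: list[float]) -> float:
    """Average per-utterance WERs, weighting each by its duration."""
    if len(wers) != len(durations) or not wers:
        raise ValueError("wers and durations must be non-empty and equal in length")
    total = sum(durations)
    if total <= 0:
        raise ValueError("total duration must be positive")
    return sum(w * d for w, d in zip(wers, durations)) / total
```

Weighting by duration keeps long utterances from being under-represented when per-utterance estimates are aggregated into a corpus-level figure.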

Submitted 12 October, 2023; originally announced October 2023.

Comments: 5 pages

arXiv:2310.07697 [pdf, other]

ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation

Authors: Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao

Abstract: Recent works have successfully extended large-scale text-to-image models to the video domain, producing promising results but at a high computational cost and requiring a large amount of video data. In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e.g., Stable Diffusion). ConditionVideo generates realistic dynamic videos from random noise or given scene videos. Our method explicitly disentangles the motion representation into condition-guided and scenery motion components. To this end, the ConditionVideo model is designed with a UNet branch and a control branch. To improve temporal coherence, we introduce sparse bi-directional spatial-temporal attention (sBiST-Attn). The 3D control network extends the conventional 2D controlnet model, aiming to strengthen conditional generation accuracy by additionally leveraging the bi-directional frames in the temporal domain. Our method exhibits superior performance in terms of frame consistency, clip score, and conditional accuracy, outperforming other compared methods.

Submitted 23 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: AAAI 2024

arXiv:2310.07599 [pdf, other]

doi 10.1103/PhysRevA.109.042208

Many-body entanglement and spectral clusters in the extended hard-core bosonic Hatano-Nelson model

Authors: Chao-Ze Lu, Gaoyong Sun

Abstract: We study many-body entanglements and spectra of the extended bosonic Hatano-Nelson model in the hard-core limit. We show that the system undergoes a phase transition from a gapless phase to a charge density wave phase accompanied by a $\mathcal{PT}$ transition in the first excited state. The phase transition is characterized by the crossing of the ground-state biorthogonal order parameter and the sudden change of the first excited-state entanglement entropy. The gapless phase is verified by the logarithmic scaling of the ground-state entanglement entropy with the central charge $c=1$. Furthermore, we show that all energy spectral clusters would form ellipses in strong nearest-neighbor interactions, for which we establish a universal scaling law. The lengths of the major and minor axes are shown to obey power laws with respect to the nearest-neighbor interaction. The exact expressions are derived for the numbers of energy levels on the outermost elliptic ring of each cluster.

Submitted 11 April, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 8 pages, 6 figures

Journal ref: Phys. Rev. A 109, 042208 (2024)

arXiv:2310.07297 [pdf, other]

Score Regularized Policy Optimization through Diffusion Behavior

Authors: Huayu Chen, Cheng Lu, Zhengyi Wang, Hang Su, Jun Zhu

Abstract: Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow because it necessitates tens to hundreds of iterative inference steps for one action. To address this issue, we propose to extract an efficient deterministic inference policy from critic models and pretrained diffusion behavior models, leveraging the latter to directly regularize the policy gradient with the behavior distribution's score function during optimization. Our method enjoys powerful generative capabilities of diffusion modeling while completely circumventing the computationally intensive and time-consuming diffusion sampling scheme, both during training and evaluation. Extensive results on D4RL tasks show that our method boosts action sampling speed by more than 25 times compared with various leading diffusion-based methods in locomotion tasks, while still maintaining state-of-the-art performance.

Submitted 14 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: ICLR 2024

arXiv:2310.07182 [pdf, other]

Generate Coherent Rays Directly

Authors: Fengqi Liu, Zaonan Tan, Weilai Xiang, Chenhao Lu, Dan Li, Xu Gong, Yulong Shi, Songnan Shi, Qilong Kou, Bo Hu

Abstract: The path tracing method generates incoherent rays by randomly sampling directions. This randomness makes it unsuitable for modern processor architectures that rely on coherence to achieve optimal performance. Many efforts have been made to address this issue by reordering rays based on their origin, end, or direction to enhance coherence. However, a drawback of reordering methods is the need to encode and sort rays before tracing, introducing additional overhead. We propose a technique to generate coherent rays directly by reusing the direction. Additionally, we introduce an interleaved reuse domain partition method to mitigate the impact of sampling correlation resulting from direction reuse. We demonstrate the effectiveness of our approach across various scenes, establishing its superiority over reordering methods.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 8 pages

arXiv:2310.06419 [pdf, other]

doi 10.1038/s41535-023-00588-1

Tunable non-Lifshitz-Kosevich temperature dependence of Shubnikov-de Haas oscillation amplitudes in SmSb

Authors: Wei Zhang, C. N. Kuo, S. T. Kuo, Chun Wa So, Jianyu Xie, Kwing To Lai, Wing Chi Yu, C. S. Lue, Hoi Chun Po, Swee K. Goh

Abstract: The Lifshitz-Kosevich (LK) theory is the pillar of magnetic quantum oscillations, which have been extensively applied to characterize a wide range of metallic states. In this study, we focus on the Shubnikov-de Haas (SdH) effect observed in SmSb, a rare-earth monopnictide. We observed a significant departure from the expected LK theory near $T_N=2.4$~K: both a peak-like anomaly and an enhancement in the temperature dependence of quantum oscillation amplitude are seen in SmSb. Moreover, we discovered a remarkable sensitivity of the SdH amplitudes to sample purity. By adjusting the sample purity, we were able to tune the temperature dependence of the $α$ band's SdH amplitudes from a peak-like anomalous behavior to an enhancement. Therefore, SdH oscillations from the $α$ band connect the two well-known non-LK behaviours, controllable through varying the sample purity, paving the way for developing further understanding of the mechanism leading to the anomalous quantum oscillations.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 4 figures

Journal ref: npj Quantum Materials 8, 55 (2023)

arXiv:2310.05306 [pdf, other]

doi 10.1109/RTSS59052.2023.00020

Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints

Authors: Ruiqi Wang, Hanyang Liu, Jiaming Qiu, Moran Xu, Roch Guerin, Chenyang Lu

Abstract: IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.
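The two mechanisms named in the abstract, stochastic taildrop during training and deadline-bounded prefix transmission at inference, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the cut point is sampled, so the uniform sampling below is an assumption, and `stochastic_taildrop`, `prefix_within_deadline`, and all parameter names are hypothetical.

```python
import random


def stochastic_taildrop(features: list[float], rng: random.Random) -> tuple[list[float], int]:
    """Zero out a random-length tail of the feature vector.

    Assumption: the number of kept features is drawn uniformly from
    1..len(features). Training against such truncated codes pushes the
    most useful information toward the front of the vector.
    """
    n = len(features)
    keep = rng.randint(1, n)  # inclusive on both ends
    return features[:keep] + [0.0] * (n - keep), keep


def prefix_within_deadline(
    features: list[float],
    bytes_per_feature: int,
    bandwidth_bytes_per_s: float,
    deadline_s: float,
) -> list[float]:
    """Return the prefix of features that fits in the bandwidth-time budget."""
    budget_bytes = int(bandwidth_bytes_per_s * deadline_s)
    count = min(len(features), budget_bytes // bytes_per_feature)
    return features[:count]
```

Because features are ordered by importance, the receiver can classify from whatever prefix has arrived when the deadline expires, rather than waiting for a fixed-size payload.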

Submitted 8 October, 2023; originally announced October 2023.

Comments: IEEE the 44th Real-Time System Symposium (RTSS), 2023

arXiv:2310.05170 [pdf, other]

DeepQTest: Testing Autonomous Driving Systems with Reinforcement Learning and Real-world Weather Data

Authors: Chengjie Lu, Tao Yue, Man Zhang, Shaukat Ali

Abstract: Autonomous driving systems (ADSs) are capable of sensing the environment and making driving decisions autonomously. These systems are safety-critical, and testing them is one of the important approaches to ensure their safety. However, due to the inherent complexity of ADSs and the high dimensionality of their operating environment, the number of possible test scenarios for ADSs is infinite. Besides, the operating environment of ADSs is dynamic, continuously evolving, and full of uncertainties, which requires a testing approach adaptive to the environment. In addition, existing ADS testing techniques have limited effectiveness in ensuring the realism of test scenarios, especially the realism of weather conditions and their changes over time. Recently, reinforcement learning (RL) has demonstrated great potential in addressing challenging problems, especially those requiring constant adaptations to dynamic environments. To this end, we present DeepQTest, a novel ADS testing approach that uses RL to learn environment configurations with a high chance of revealing abnormal ADS behaviors. Specifically, DeepQTest employs Deep Q-Learning and adopts three safety and comfort measures to construct the reward functions. To ensure the realism of generated scenarios, DeepQTest defines a set of realistic constraints and introduces real-world weather conditions into the simulated environment. We employed three comparison baselines, i.e., random, greedy, and a state-of-the-art RL-based approach DeepCOllision, for evaluating DeepQTest on an industrial-scale ADS. Evaluation results show that DeepQTest demonstrated significantly better effectiveness in terms of generating scenarios leading to collisions and ensuring scenario realism compared with the baselines. In addition, among the three reward functions implemented in DeepQTest, Time-To-Collision is recommended as the best design according to our study.

Submitted 8 October, 2023; originally announced October 2023.

Comments: 40 pages, 7 figures, 13 tables

arXiv:2310.04664 [pdf, other]

Learning to Rank Onset-Occurring-Offset Representations for Micro-Expression Recognition

Authors: Jie Zhu, Yuan Zong, Jingang Shi, Cheng Lu, Hongli Chang, Wenming Zheng

Abstract: This paper focuses on the research of micro-expression recognition (MER) and proposes a flexible and reliable deep learning method called learning to rank onset-occurring-offset representations (LTR3O). The LTR3O method introduces a dynamic and reduced-size sequence structure known as 3O, which consists of onset, occurring, and offset frames, for representing micro-expressions (MEs). This structure facilitates the subsequent learning of ME-discriminative features. A noteworthy advantage of the 3O structure is its flexibility, as the occurring frame is randomly extracted from the original ME sequence without the need for accurate frame spotting methods. Based on the 3O structures, LTR3O generates multiple 3O representation candidates for each ME sample and incorporates well-designed modules to measure and calibrate their emotional expressiveness. This calibration process ensures that the distribution of these candidates aligns with that of macro-expressions (MaMs) over time. Consequently, the visibility of MEs can be implicitly enhanced, facilitating the reliable learning of more discriminative features for MER. Extensive experiments were conducted to evaluate the performance of LTR3O using three widely-used ME databases: CASME II, SMIC, and SAMM. The experimental results demonstrate the effectiveness and superior performance of LTR3O, particularly in terms of its flexibility and reliability, when compared to recent state-of-the-art MER methods.

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2310.04282 [pdf, other]

Multi-alpha Boson Gas state in Fusion Evaporation Reaction and Three-body Force

Authors: Taofeng Wang, Ziming Li, R. B. Wiringa, Minliang Liu, Jiansong Wang, Yanyun Yang, Qinghua He, Zhiyu Sun, Chengjian Lin, M. Assié, Y. Ayyad, D. Beaumel, Zhen Bai, Fangfang Duan, Zhihao Gao, Song Guo, Yue Hu, Wei Jiang, F. Kobayashi, Chengui Lu, Junbing Ma, Peng Ma, P. Napolitani, G. Verde, Jianguo Wang, et al. (11 additional authors not shown)

Abstract: The experimental evidence for the $α$ Boson gas state in the $^{11}$C+$^{12}$C$\rightarrow$$^{23}$Mg$^{\ast}$ fusion evaporation reaction is presented. By measuring the $α$ emission spectrum with multiplicity 2 and 3, we provide insight into the existence of a three-body force among $α$ particles. The observed spectrum exhibited distinct tails corresponding to $α$ particles emitted in pairs and triplets, consistent with the model calculations of AV18-UX and chiral effective field theory of NV2-3-la*, indicating the formation of $α$ clusters with three-body force in the Boson gas state.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 7 pages, 6 figures

arXiv:2310.04274 [pdf, other]

Aspect of Clusters Correlation at Light Nuclei Excited State

Authors: Ziming Li, Jie Zhu, Taofeng Wang, Minliang Liu, Jiansong Wang, Yanyun Yang, Chengjian Lin, Zhiyu Sun, Qinghua He, M. Assié, Y. Ayyad, D. Beaumel, Zhen Bai, Fangfang Duan, Zhihao Gao, Song Guo, Yue Hu, Wei Jiang, F. Kobayashi, Chengui Lu, Junbing Ma, Peng Ma, P. Napolitani, G. Verde, Jianguo Wang, et al. (11 additional authors not shown)

Abstract: The correlation of $αα$ was probed via measuring the transverse momentum $p_{T}$ and width $δp_{T}$ of one $α$, for the first time, which represents the spatial and dynamical essentialities of the initial coupling state in $^{8}$Be nucleus. The weighted interaction vertex of 3$α$ reflected by the magnitudes of their relative momenta and relative emission angles proves the isosceles triangle configuration for 3$α$ at the high excited energy analogous Hoyle states.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 8 pages, 9 figures

arXiv:2310.04261 [pdf, other]

Variation of Tensor Force due to Nuclear Medium Effect

Authors: Ziming Li, Jie Zhu, Taofeng Wang, Minliang Liu, Jiansong Wang, Yanyun Yang, Chengjian Lin, Zhiyu Sun, Qinghua He, M. Assié, Y. Ayyad, D. Beaumel, Zhen Bai, Fangfang Duan, Zhihao Gao, Song Guo, Yue Hu, Wei Jiang, F. Kobayashi, Chengui Lu, Junbing Ma, Peng Ma, P. Napolitani, G. Verde, Jianguo Wang, et al. (11 additional authors not shown)

Abstract: The enhancement of $J^π(T)$=3$^{+}$(0) state with isospin $T=0$ excited by the tensor force in the free $^{6}$Li nucleus has been observed, for the first time, relative to a shrinkable excitation in the $^{6}$Li cluster component inside its host nucleus. Comparatively, the excitation of $J^π(T)$=0$^{+}$(1) state with isospin $T=1$ for these two $^{6}$Li formations takes on an approximately equal excitation strength. The mechanism of such tensor force effect was proposed due to the intensive nuclear medium role on isospin $T$=0 state.

Submitted 6 October, 2023; originally announced October 2023.

Comments: 6 pages, 4 figures

arXiv:2310.04189 [pdf, other]

Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases

Authors: Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu

Abstract: Motion understanding aims to establish a reliable mapping between motion and action semantics, while it is a challenging many-to-many problem. An abstract action semantic (e.g., walk forwards) could be conveyed by perceptually diverse motions (walking with arms up or swinging). In contrast, a motion could carry different semantics w.r.t. its context and intention. This makes an elegant mapping between them difficult. Previous attempts adopted direct-mapping paradigms with limited reliability. Also, current automatic metrics fail to provide reliable assessments of the consistency between motions and action semantics. We identify the source of these problems as the significant gap between the two modalities. To alleviate this gap, we propose Kinematic Phrases (KP) that take the objective kinematic facts of human motion with proper abstraction, interpretability, and generality. Based on KP, we can unify a motion knowledge base and build a motion understanding system. Meanwhile, KP can be automatically converted from motions to text descriptions with no subjective bias, inspiring Kinematic Prompt Generation (KPG) as a novel white-box motion generation benchmark. In extensive experiments, our approach shows superiority over other methods. Our project is available at https://foruck.github.io/KP/.

Submitted 11 July, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

Comments: To appear in ECCV 2024. Yong-Lu Li and Cewu Lu are the corresponding authors. Project page is available at https://foruck.github.io/KP/

arXiv:2310.03992 [pdf, other]

Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

Authors: Yan Zhao, Yuan Zong, Jincen Wang, Hailun Lian, Cheng Lu, Li Zhao, Wenming Zheng

Abstract: In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICASSP work, deep implicit distribution alignment networks (DIDAN), whose key contribution lies in the introduction of a novel regularization term called implicit distribution alignment (IDA). This term allows DIDAN trained on source (training) speech samples to remain applicable to predicting emotion labels for target (testing) speech samples, regardless of corpus variance in cross-corpus SER. To further enhance this method, we extend IDA to layer-adapted IDA (LIDA), resulting in LIDAN. This layer-adapted extension consists of three modified IDA terms that consider emotion labels at different levels of granularity. These terms are strategically arranged within different fully connected layers in LIDAN, aligning with the increasing emotion-discriminative abilities with respect to the layer depth. This arrangement enables LIDAN to more effectively learn emotion-discriminative and corpus-invariant features for SER across various corpora compared to DIDAN. It is also worth mentioning that unlike most existing methods that rely on estimating statistical moments to describe pre-assumed explicit distributions, both IDA and LIDA take a different approach. They utilize an idea of target sample reconstruction to directly bridge the feature distribution gap without making assumptions about their distribution type. As a result, DIDAN and LIDAN can be viewed as implicit cross-corpus SER methods. To evaluate LIDAN, we conducted extensive cross-corpus SER experiments on EmoDB, eNTERFACE, and CASIA corpora. The experimental results demonstrate that LIDAN surpasses recent state-of-the-art explicit unsupervised DA methods in tackling cross-corpus SER tasks.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.02915 [pdf, other]

Interplay of two $E_g$ orbitals in Superconducting La$_3$Ni$_2$O$_7$ Under Pressure

Authors: Chen Lu, Zhiming Pan, Fan Yang, Congjun Wu

Abstract: The discovery of high-$T_c$ superconductivity (SC) in La$_3$Ni$_2$O$_7$ (LNO) has aroused a great deal of interest. Previously, it was proposed that the Ni-$3d_{z^2}$ orbital is crucial to realize the high-$T_c$ SC in LNO: The preformed Cooper pairs therein acquire coherence via hybridization with the $3d_{x^2-y^2}$ orbital to form the SC. However, we held a different viewpoint that the interlayer pairing $s$-wave SC is induced by the $3d_{x^2-y^2}$ orbital, driven by the strong interlayer superexchange interaction. To include effects from both $E_g$-orbitals, we establish a two-orbital bilayer $t$-$J$ model. Our calculations reveal that due to the no-double-occupancy constraint, the $3d_{x^2-y^2}$ band and the $3d_{z^2}$ bonding band are flattened by a factor of about 2 and 10, respectively, which is consistent with recent angle-resolved-photo-emission-spectroscopy measurements. Consequently, a high-temperature SC can hardly be induced in the $3d_{z^2}$-orbital due to the difficulty of developing phase coherence. However, it can be easily achieved by the $3d_{x^2-y^2}$ orbital under realistic interaction strength. With electron doping, the $3d_{z^2}$-band gradually dives below the Fermi level, but $T_c$ continues to increase, suggesting that this band is not necessary for the high-$T_c$ SC in LNO. With hole doping, $T_c$ initially drops and then rises, accompanied by the crossover from the BCS to BEC-type superconducting transitions.

Submitted 11 December, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: 9.3 pages, 6 figures, with Appendix

arXiv:2310.02782 [pdf, other]

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

Authors: Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster

Abstract: The past decade has seen vast progress in deep reinforcement learning (RL) on the back of algorithms manually designed by human researchers. Recently, it has been shown that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks. Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a generalization gap when these algorithms are applied to unseen environments. In this work, we examine how characteristics of the meta-training distribution impact the generalization performance of these algorithms. Motivated by this analysis and building on ideas from Unsupervised Environment Design (UED), we propose a novel approach for automatically generating curricula to maximize the regret of a meta-learned optimizer, in addition to a novel approximation of regret, which we name algorithmic regret (AR). The result is our method, General RL Optimizers Obtained Via Environment Design (GROOVE). In a series of experiments, we show that GROOVE achieves superior generalization to LPG, and evaluate AR against baseline metrics from UED, identifying it as a critical component of environment design in this setting. We believe this approach is a step towards the discovery of truly general RL algorithms, capable of solving a wide range of real-world environments.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Published at NeurIPS 2023

arXiv:2310.02675 [pdf, other]

doi 10.1103/PhysRevB.108.115162

Current direction dependent magnetotransport in CuTe

Authors: Ying Kit Tsui, C. N. Kuo, C. E. Hsu, Wei Zhang, Wenyan Wang, Shanmin Wang, Wing Chi Yu, H. C. Hsueh, C. S. Lue, Swee K. Goh

Abstract: Despite being a layered, easily-exfoliated compound, copper monotelluride (CuTe) features an unusual quasi-one-dimensional charge density wave below $T_{\rm CDW}\approx335$ K. Within a CuTe layer, the electrical resistivity depends sensitively on the direction of the electrical current. Here, we use magnetotransport to probe the metallic state of CuTe with two distinct in-plane current directions. When the current flows along the $a$-axis ($I//a$), the magnetoresistance exhibits a downward curvature as the magnetic field increases. On the other hand, when the current is along the $b$-axis ($I//b$), the magnetoresistance shows the opposite curvature. Our analysis uncovers a violation of Kohler scaling, but only for $I//a$. Shubnikov-de Haas oscillations are detected at low temperatures. Our results shed light on the nature of the metallic state in CuTe with the development of the charge density wave.

Submitted 4 October, 2023; originally announced October 2023.

Comments: 6 pages, 4 figures

Journal ref: Phys. Rev. B 108, 115162 (2023)

arXiv:2310.01599 [pdf, other]

Synthesis technique and electron beam damage study of nanometer-thin single-crystalline Thymine

Authors: Hazem Daoud, Sreelaja Pulleri Vadhyar, Ehsan Nikbin, Cheng Lu, R. J. Dwayne Miller

Abstract: Samples suitable for electron diffraction studies must satisfy certain characteristics such as having a thickness in the range of 10 - 100 nm. We report, to our knowledge, the first successful synthesis technique of nanometer-thin sheets of single-crystalline thymine suitable for electron diffraction and spectroscopy studies. This development provides a well defined system to explore issues related to UV photochemistry of DNA and high intrinsic stability essential to maintaining integrity of genetic information. The crystals are grown using the evaporation technique and the nanometer-thin sheets are obtained via microtoming. The sample is characterized via x-ray diffraction (XRD) and is subsequently studied using electron diffraction via a transmission electron microscope (TEM). Thymine is found to be more radiation resistant than similar molecular moieties (e.g., carbamazepine) by a factor of 5. This raises interesting questions about the role of the fast relaxation processes of electron scattering-induced excited states, extending the concept of radiation hardening beyond photoexcited states. The high stability of thymine in particular opens the door for further studies of these ultrafast relaxation processes giving rise to the high stability of DNA to UV radiation.

Submitted 12 January, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.17437 [pdf, other]

Learning Decentralized Flocking Controllers with Spatio-Temporal Graph Neural Network

Authors: Siji Chen, Yanshen Sun, Peihan Li, Lifeng Zhou, Chang-Tien Lu

Abstract: Recently a line of researches has delved the use of graph neural networks (GNNs) for decentralized control in swarm robotics. However, it has been observed that relying solely on the states of immediate neighbors is insufficient to imitate a centralized control policy. To address this limitation, prior studies proposed incorporating $L$-hop delayed states into the computation. While this approach shows promise, it can lead to a lack of consensus among distant flock members and the formation of small clusters, consequently resulting in the failure of cohesive flocking behaviors. Instead, our approach leverages spatiotemporal GNN, named STGNN that encompasses both spatial and temporal expansions. The spatial expansion collects delayed states from distant neighbors, while the temporal expansion incorporates previous states from immediate neighbors. The broader and more comprehensive information gathered from both expansions results in more effective and accurate predictions. We develop an expert algorithm for controlling a swarm of robots and employ imitation learning to train our decentralized STGNN model based on the expert algorithm. We simulate the proposed STGNN approach in various settings, demonstrating its decentralized capacity to emulate the global expert algorithm. Further, we implemented our approach to achieve cohesive flocking, leader following and obstacle avoidance by a group of Crazyflie drones. The performance of STGNN underscores its potential as an effective and reliable approach for achieving cohesive flocking, leader following and obstacle avoidance tasks.

Submitted 2 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.17336 [pdf, other]

Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation

Authors: Jianning Deng, Gabriel Chan, Hantao Zhong, Chris Xiaoxuan Lu

Abstract: This paper presents a novel framework for robust 3D object detection from point clouds via cross-modal hallucination. Our proposed approach is agnostic to either hallucination direction between LiDAR and 4D radar. We introduce multiple alignments on both spatial and feature levels to achieve simultaneous backbone refinement and hallucination generation. Specifically, spatial alignment is proposed to deal with the geometry discrepancy for better instance matching between LiDAR and radar. The feature alignment step further bridges the intrinsic attribute gap between the sensing modalities and stabilizes the training. The trained object detection models can deal with difficult detection cases better, even though only single-modal data is used as the input during the inference stage. Extensive experiments on the View-of-Delft (VoD) dataset show that our proposed method outperforms the state-of-the-art (SOTA) methods for both radar and LiDAR object detection while maintaining competitive efficiency in runtime. Code is available at https://github.com/DJNing/See_beyond_seeing.

Submitted 12 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: Accepted to ICRA 2024. 8 pages, 4 figures. Equal contribution for Gabriel Chan and Hantao Zhong, listed randomly

arXiv:2309.16609 [pdf, other]

Qwen Technical Report

Authors: Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, et al. (23 additional authors not shown)

Abstract: Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models.

Submitted 28 September, 2023; originally announced September 2023.

Comments: 59 pages, 5 figures

arXiv:2309.16264 [pdf, other]

GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects

Authors: Qiaojun Yu, Junbo Wang, Wenhai Liu, Ce Hao, Liu Liu, Lin Shao, Weiming Wang, Cewu Lu

Abstract: Articulated objects like cabinets and doors are widespread in daily life. However, directly manipulating 3D articulated objects is challenging because they have diverse geometrical shapes, semantic categories, and kinetic constraints. Prior works mostly focused on recognizing and manipulating articulated objects with specific joint types. They can either estimate the joint parameters or distinguish suitable grasp poses to facilitate trajectory planning. Although these approaches have succeeded in certain types of articulated objects, they lack generalizability to unseen objects, which significantly impedes their application in broader scenarios. In this paper, we propose a novel framework of Generalizable Articulation Modeling and Manipulating for Articulated Objects (GAMMA), which learns both articulation modeling and grasp pose affordance from diverse articulated objects with different categories. In addition, GAMMA adopts adaptive manipulation to iteratively reduce the modeling errors and enhance manipulation performance. We train GAMMA with the PartNet-Mobility dataset and evaluate with comprehensive experiments in SAPIEN simulation and real-world Franka robot. Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects. We will open-source all codes and datasets in both simulation and real robots for reproduction in the final version. Images and videos are published on the project website at: http://sites.google.com/view/gamma-articulation

Submitted 1 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 8 pages, 5 figures, ICRA 2024

arXiv:2309.14975 [pdf, other]

AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild

Authors: Hongjie Fang, Hao-Shu Fang, Yiming Wang, Jieji Ren, Jingjing Chen, Ruo Zhang, Weiming Wang, Cewu Lu

Abstract: While humans can use parts of their arms other than the hands for manipulations like gathering and supporting, whether robots can effectively learn and perform the same type of operations remains relatively unexplored. As these manipulations require joint-level control to regulate the complete poses of the robots, we develop AirExo, a low-cost, adaptable, and portable dual-arm exoskeleton, for teleoperation and demonstration collection. As collecting teleoperated data is expensive and time-consuming, we further leverage AirExo to collect cheap in-the-wild demonstrations at scale. Under our in-the-wild learning framework, we show that with only 3 minutes of the teleoperated demonstrations, augmented by diverse and extensive in-the-wild data collected by AirExo, robots can learn a policy that is comparable to or even better than one learned from teleoperated demonstrations lasting over 20 minutes. Experiments demonstrate that our approach enables the model to learn a more general and robust policy across the various stages of the task, enhancing the success rates in task completion even with the presence of disturbances. Project website: https://airexo.github.io/

Submitted 9 May, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: Project page: https://airexo.github.io/

arXiv:2309.10728 [pdf, other]

QuBEC: Boosting Equivalence Checking for Quantum Circuits with QEC Embedding

Authors: Chao Lu, Navnil Choudhury, Utsav Banerjee, Abdullah Ash Saki, Kanad Basu

Abstract: Quantum computing has proven to be capable of accelerating many algorithms by performing tasks that classical computers cannot. Currently, Noisy Intermediate Scale Quantum (NISQ) machines struggle from scalability and noise issues to render a commercial quantum computer. However, the physical and software improvements of a quantum computer can efficiently control quantum gate noise. As the complexity of quantum algorithms and implementation increases, software control of quantum circuits may lead to a more intricate design. Consequently, the verification of quantum circuits becomes crucial in ensuring the correctness of the compilation, along with other processes, including quantum error correction and assertions, that can increase the fidelity of quantum circuits. In this paper, we propose a Decision Diagram-based quantum equivalence checking approach, QuBEC, that requires less latency compared to existing techniques, while accounting for circuits with quantum error correction redundancy. Our proposed methodology reduces verification time on certain benchmark circuits by up to $271.49 \times$, while the number of Decision Diagram nodes required is reduced by up to $798.31 \times$, compared to state-of-the-art strategies. The proposed QuBEC framework can contribute to the advancement of quantum computing by enabling faster and more efficient verification of quantum circuits, paving the way for the development of larger and more complex quantum algorithms.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.09737 [pdf, other]

RaTrack: Moving Object Detection and Tracking with 4D Radar Point Cloud

Authors: Zhijun Pan, Fangqiang Ding, Hantao Zhong, Chris Xiaoxuan Lu

Abstract: Mobile autonomy relies on the precise perception of dynamic environments. Robustly tracking moving objects in 3D world thus plays a pivotal role for applications like trajectory prediction, obstacle avoidance, and path planning. While most current methods utilize LiDARs or cameras for Multiple Object Tracking (MOT), the capabilities of 4D imaging radars remain largely unexplored. Recognizing the challenges posed by radar noise and point sparsity in 4D radar data, we introduce RaTrack, an innovative solution tailored for radar-based tracking. Bypassing the typical reliance on specific object types and 3D bounding boxes, our method focuses on motion segmentation and clustering, enriched by a motion estimation module. Evaluated on the View-of-Delft dataset, RaTrack showcases superior tracking precision of moving objects, largely surpassing the performance of the state of the art. We release our code and model at https://github.com/LJacksonPan/RaTrack.

Submitted 11 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: Accepted to ICRA 2024. 8 pages, 4 figures. Co-first authorship for Zhijun Pan, Fangqiang Ding and Hantao Zhong, listed randomly. See demo video at: https://www.youtube.com/watch?v=_uSpbxOlLGw

arXiv:2309.08990 [pdf]

doi 10.1021/acsami.3c00518

Antiferromagnetic to Ferrimagnetic Phase Transition and Possible Phase Coexistence in Polar Magnets (Fe$_{1-x}$Mn$_x$)$_2$Mo$_3$O$_8$

Authors: Yuting Chang, Lei Gao, Yunlong Xie, Bin You, Yong Liu, Rui Xiong, Junfeng Wang, Chengliang Lu, JunMing Liu

Abstract: In the present work, magnetic properties of single crystal (Fe$_{1-x}$Mn$_x$)$_2$Mo$_3$O$_8$ ($0<x<1$) have been studied by performing extensive measurements. A detailed magnetic phase diagram is built up, in which antiferromagnetic state dominates for $x<0.25$ and ferrimagnetic phase arises for $x>0.3$. Meanwhile, sizeable electric polarization of spin origin is commonly observed in all samples, no matter what the magnetic state is. For the samples hosting a ferrimagnetic state, square-like magnetic hysteresis loops are revealed, while the remnant magnetization and coercive field can be tuned drastically by simply varying the Mn-content or temperature. Possible coexistence of the antiferromagnetic and ferrimagnetic phases is proposed to be responsible for the remarkable modulation of magnetic properties in the samples.

Submitted 16 September, 2023; originally announced September 2023.

Journal ref: ACS appl. mater. interfaces 15, 22204 (2023)

arXiv:2309.08983 [pdf]

doi 10.1103/PhysRevMaterials.5.104412

Structural origin of the Jeff=1/2 antiferromagnetic phase in Ga-doped Sr2IrO4

Authors: H. W. Wang, L. Y. Zhang, N. Hu, B. You, Y. T. Chang, S. L. Yuan, C. L. Lu, J. M. Liu

Abstract: Sr2IrO4 hosts a novel Jeff =1/2 Mott state and quasi-two-dimensional antiferromagnetic order, providing a unique avenue of exploring emergent states of matter and functions that are extraordinarily sensitive to any structural variations. While the correlation between the physical property and lattice structure in Sr2IrO4 has been a focused issue in the past decade, a common perception assumes that the magnetic ordering is essentially determined by the Ir-O-Ir bond angle. Therefore, a delicate modulation of this angle and consequently a major modulation of the magnetic ordering, by chemical doping such as Ga at Ir site, has been extensively investigated and well believed. In this work, however, we present a whole package of structure and magnetism data on a series of single crystal and polycrystalline Sr2Ir1-xGaxO4 samples, revealing the substantial difference in the Néel temperature TN between the two types of samples, and the TN value for the polycrystalline sample x = 0.09 is even 64 K higher than that of the single crystal sample x = 0.09 (deltaTN ~ 64 K at x = 0.09). Our systematic investigations demonstrate the crucial role of the c/a ratio in tuning the interlayer coupling and thereby the Neel point TN, i.e. a higher TN can be achieved as c/a is reduced. The notable differences in structural parameters between the two groups of samples are probably caused by additional strain due to the massive grain boundaries in polycrystalline samples. The present work suggests an additional ingredient of physics that is essential in modulating the emergent properties in Sr2IrO4 and probably other iridates.

Submitted 16 September, 2023; originally announced September 2023.

Journal ref: Phys. Rev. Mater. 5, 104412 (2021)

arXiv:2309.08981 [pdf]

doi 10.1016/j.mtphys.2022.100809

Strain tuned magnetotransport of Jeff=1/2 antiferromagnetic Sr2IrO4 thin films

Authors: N. Hu, Y. K. Weng, K. Chen, B. You, Y. Liu, Y. T. Chang, R. Xiong, S. Dong, C. L. Lu

Abstract: In this work, we report observation of strain effect on physical properties of Sr2IrO4 thin films grown on SrTiO3 (001) and LaAlO3 (001) substrates. It is found that the film on LaAlO3 with compressive strain has a lower antiferromagnetic transition temperature (TN~210 K) than the film on SrTiO3 (TN~230 K) with tensile strain, which is probably caused by modified interlayer coupling. Interestingly, magnetoresistance due to pseudospin-flip of the film on LaAlO3 is much larger than that of tensile-strained film on SrTiO3, and robust anisotropic magnetoresistance is observed in the former, but H-driven reversal behavior is seen in the latter. By performing first principles calculations, it is revealed that epitaxial strain plays an efficient role in tuning the canting angle of Jeff=1/2 moments and thus net moment at every IrO2 layer, responsible for the difference in magnetoresistance between the films. The reversal of anisotropic magnetoresistance in the thin film on SrTiO3 can be ascribed to stabilization of a metastable stable with smaller bandgap as the Jeff=1/2 moments are aligned along the diagonal of basal plane by H. However, theoretical calculations reveal much higher magnetocrystalline anisotropy energy in the film on LaAlO3. This causes difficulties to drive the Jeff=1/2 moments to reach the diagonal and thereby the metastable state, explaining the distinct anisotropic magnetoresistance between two samples in a qualitative sense. Our findings indicate that strain can be a highly efficient mean to engineer the functionalities of Jeff=1/2 antiferromagnet Sr2IrO4.

Submitted 16 September, 2023; originally announced September 2023.

Comments: 21 pages and 5 figures

Journal ref: Materials Today Physics 27, 100809 (2022)

arXiv:2309.08974 [pdf]

doi 10.1103/PhysRevLett.131.136701

Colossal linear magnetoelectricity in polar magnet Fe2Mo3O8

Authors: Yuting Chang, Yakui Weng, Yunlong Xie, Bin You, Junfeng Wang, Liang Li, Jun-Ming Liu, Shuai Dong, Chengliang Lu

Abstract: Linear magnetoelectric effect is an attractive phenomenon in condensed matters and provides indispensable technological functionalities. Here a colossal linear magnetoelectric effect with diagonal component alfa_33 reaching up to ~480 ps/m is reported in a polar magnet Fe2Mo3O8, and this effect can persist in a broad range of magnetic field (~20 T) and is orders of magnitude larger than reported values in literature. Such an exceptional experimental observation can be well reproduced by a theoretical model affirmatively unveiling the vital contributions from the exchange striction, while the sign difference of magnetocrystalline anisotropy can also be reasonably figured out.

Submitted 16 September, 2023; originally announced September 2023.

Comments: 14 pages and 4 figures

Journal ref: Phys. Rev. Lett. 131, 136701 (2023)

arXiv:2309.08941 [pdf, ps, other]

Quantum Pseudorandom Scramblers

Authors: Chuhan Lu, Minglong Qin, Fang Song, Penghui Yao, Mingnan Zhao

Abstract: Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial state. In this work, we propose and construct quantum Pseudorandom State Scramblers (PRSSs), which can produce a pseudorandom state on an arbitrary initial state. In the information-theoretical setting, we obtain a scrambler which maps an arbitrary initial state to a distribution of quantum states that is close to Haar random in total variation distance. As a result, our PRSS exhibits a dispersing property. Loosely, it can span an $ε$-net of the state space. This significantly strengthens what standard PRSGs can induce, as they may only concentrate on a small region of the state space as long as the average output state approximates a Haar random state in total variation distance. Our PRSS construction develops a parallel extension of the famous Kac's walk, and we show that it mixes exponentially faster than the standard Kac's walk. This constitutes the core of our proof. We also describe a few applications of PRSSs. While our PRSS construction assumes a post-quantum one-way function, PRSSs are potentially a weaker primitive and can be separated from one-way functions in a relativized world similar to standard PRSGs.

Submitted 16 September, 2023; originally announced September 2023.

arXiv:2309.07109 [pdf, ps, other]

Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector currently under construction in South China. The real-time monitoring system is designed to ensure both prompt alert speed and comprehensive coverage of progenitor stars. It incorporates prompt monitors on the electronic board as well as online monitors at the data acquisition stage. Assuming a false alert rate of 1 per year, this monitoring system exhibits sensitivity to pre-SN neutrinos up to a distance of approximately 1.6 (0.9) kiloparsecs and SN neutrinos up to about 370 (360) kiloparsecs for a progenitor mass of 30 solar masses, considering both normal and inverted mass ordering scenarios. The pointing ability of the CCSN is evaluated by analyzing the accumulated event anisotropy of inverse beta decay interactions from pre-SN or SN neutrinos. This, along with the early alert, can play a crucial role in facilitating follow-up multi-messenger observations of the next galactic or nearby extragalactic CCSN.

Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 24 pages, 9 figures, accepted for publication in JCAP

arXiv:2309.06173 [pdf, other]

Effect of Rare-earth Element Substitution in Superconducting R$_3$Ni$_2$O$_7$ Under Pressure

Authors: Zhiming Pan, Chen Lu, Fan Yang, Congjun Wu

Abstract: Recently, high temperature ($T_c\approx 80$K) superconductivity (SC) has been discovered in La$_3$Ni$_2$O$_7$ (LNO) under pressure. Question arises whether the transition temperature $T_c$ could be further enhanced under suitable conditions. A possible route for realizing higher $T_c$ is element substitution. Similar SC could appear in rare-earth (RE) R$_3$Ni$_2$O$_7$ (RNO, R=RE element) material series under pressure. The electronic properties in the RNO materials are dominated by the Ni $3d$ orbitals in the bilayer NiO$_2$ plane. In the strong coupling limit, the SC could be fully characterized by a bilayer single $3d_{x^2-y^2}$-orbital $t$-$J_{\parallel}$-$J_{\perp}$ model. Under RE element substitution from La to RE element, the lattice constant decreases and the electronic hopping increases, leading to stronger superexchanges between the $3d_{x^2-y^2}$ orbitals. Based on the slave-boson mean-field theory, we explore the pairing nature and the evolution of $T_c$ in RNO materials. Consequently, it is found that the element substitution does not alter the pairing nature, i.e. the inter-layer $s$-wave pairing is always favored in RNO. However, the $T_c$ increases from La to Sm and a nearly doubled $T_c$ is achieved for SmNO. This work provides evidence for possible higher $T_c$ R$_3$Ni$_2$O$_7$ materials, which may be realized in further experiments.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 4 pages, 3 figures

arXiv:2309.05840 [pdf, other]

Self-Correlation and Cross-Correlation Learning for Few-Shot Remote Sensing Image Semantic Segmentation

Authors: Linhan Wang, Shuo Lei, Jianfeng He, Shengkun Wang, Min Zhang, Chang-Tien Lu

Abstract: Remote sensing image semantic segmentation is an important problem for remote sensing image interpretation. Although remarkable progress has been achieved, existing deep neural network methods suffer from the reliance on massive training data. Few-shot remote sensing semantic segmentation aims at learning to segment target objects from a query image using only a few annotated support images of the target class. Most existing few-shot learning methods stem primarily from their sole focus on extracting information from support images, thereby failing to effectively address the large variance in appearance and scales of geographic objects. To tackle these challenges, we propose a Self-Correlation and Cross-Correlation Learning Network for the few-shot remote sensing image semantic segmentation. Our model enhances the generalization by considering both self-correlation and cross-correlation between support and query images to make segmentation predictions. To further explore the self-correlation with the query image, we propose to adopt a classical spectral method to produce a class-agnostic segmentation mask based on the basic visual information of the image. Extensive experiments on two remote sensing image datasets demonstrate the effectiveness and superiority of our model in few-shot remote sensing image semantic segmentation. Code and models will be accessed at https://github.com/linhanwang/SCCNet.

Submitted 15 September, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

Comments: 10 pages, 6 figures. Accepted to Sigspatial 2023. arXiv admin note: text overlap with arXiv:2104.01538 by other authors

arXiv:2309.05631 [pdf, other]

Next-to-leading order QCD corrections to the form factors of $B$ to scalar meson decays

Authors: Xue-Ying Han, Long-Shun Lu, Cai-Dian Lü, Yue-Long Shen, Bo-Xuan Shi

Abstract: We calculate the next-to-leading order QCD corrections to $B$ to scalar meson form factors from QCD light-cone sum rules with $B$ meson light-cone distribution amplitudes. We demonstrate that the $B$ meson-to-vacuum correlation functions can be factorized into the convolution of short-distance coefficients and light-cone distribution amplitudes at the one-loop level and find that only $φ_B^+(ω,μ)$ contributes to the form factors. We then employ the $z$-parameterization combined with constraints from strong coupling constants to reconstruct the $q^2$ dependence of the form factors in the whole kinematic allowed regions. Due to the large cancellations between the hard functions and the jet functions, the next-to-leading order results show a modest increase of approximately 5\% compared to the leading order results. Based on the results of form factors, we predict the branching ratios of semi-leptonic $B\to S\ell\barν_\ell$ and $B\to Sν_\ell\barν_\ell$ processes, as well as several angular observables, such as forward-backward asymmetries, "flat terms" and lepton polarization asymmetries. We compare these results with calculations from other methods. Experimental verification of these results is required in future experiments.

Submitted 16 October, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

Comments: 47 pages, 13 figures

arXiv:2309.03246 [pdf]

doi 10.1145/3611643.3613897

EvoCLINICAL: Evolving Cyber-Cyber Digital Twin with Active Transfer Learning for Automated Cancer Registry System

Authors: Chengjie Lu, Qinghua Xu, Tao Yue, Shaukat Ali, Thomas Schwitalla, Jan F. Nygård

Abstract: The Cancer Registry of Norway (CRN) collects information on cancer patients by receiving cancer messages from different medical entities (e.g., medical labs, and hospitals) in Norway. Such messages are validated by an automated cancer registry system: GURI. Its correct operation is crucial since it lays the foundation for cancer research and provides critical cancer-related statistics to its stakeholders. Constructing a cyber-cyber digital twin (CCDT) for GURI can facilitate various experiments and advanced analyses of the operational state of GURI without requiring intensive interactions with the real system. However, GURI constantly evolves due to novel medical diagnostics and treatment, technological advances, etc. Accordingly, CCDT should evolve as well to synchronize with GURI. A key challenge of achieving such synchronization is that evolving CCDT needs abundant data labelled by the new GURI. To tackle this challenge, we propose EvoCLINICAL, which considers the CCDT developed for the previous version of GURI as the pretrained model and fine-tunes it with the dataset labelled by querying a new GURI version. EvoCLINICAL employs a genetic algorithm to select an optimal subset of cancer messages from a candidate dataset and query GURI with it. We evaluate EvoCLINICAL on three evolution processes. The precision, recall, and F1 score are all greater than 91%, demonstrating the effectiveness of EvoCLINICAL.
Furthermore, we replace the active learning part of EvoCLINICAL with random selection to study the contribution of transfer learning to the overall performance of EvoCLINICAL. Results show that employing active learning in EvoCLINICAL increases its performances consistently.

Submitted 6 September, 2023; originally announced September 2023.

Comments: 12 pages, 2 figures, 5 tables; accepted to the industry track of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '23)

arXiv:2309.02423 [pdf, other]

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

Authors: Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

Abstract: With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed. However, most current research is built on resources derived from third-person video action recognition. This inherent domain gap between first- and third-person action videos, which have not been adequately addressed before, makes current Ego-HOI suboptimal. This paper rethinks and proposes a new framework as an infrastructure to advance Ego-HOI recognition by Probing, Curation and Adaption (EgoPCA). We contribute comprehensive pre-train sets, balanced test sets and a new baseline, which are complete with a training-finetuning strategy. With our new framework, we not only achieve state-of-the-art performance on Ego-HOI benchmarks but also build several new and effective mechanisms and settings to advance further research. We believe our data and the findings will pave a new way for Ego-HOI understanding. Code and data are available at https://mvig-rhos.com/ego_pca

Submitted 5 September, 2023; originally announced September 2023.

Comments: ICCV 2023

arXiv:2309.01181 [pdf, other]

doi 10.1103/PhysRevApplied.20.064032

Polarization-entangled quantum frequency comb from a silicon nitride microring resonator

Authors: Wenjun Wen, Wenhan Yan, Chi Lu, Liangliang Lu, Xiaoyu Wu, Yanqing Lu, Shining Zhu, Xiao-song Ma

Abstract: Integrated microresonator facilitates the realization of quantum frequency comb (QFC), which provides a large number of discrete frequency modes with broadband spectral range and narrow linewidth. However, all previous demonstrations have focused on the generation of energy-time or time-bin entangled photons from QFC. Realizing polarization-entangled quantum frequency comb, which is the important resource for fundamental study of quantum mechanics and quantum information applications, remains challenging. Here, we demonstrate, for the first time, a broadband polarization-entangled quantum frequency comb by combining an integrated silicon nitride micro-resonator with a Sagnac interferometer. With a free spectral range of about 99 GHz and a narrow linewidth of about 190 MHz, our source provides 22 polarization entangled photons pairs with frequency covering the whole telecom C-band. The entanglement fidelities for all 22 pairs are above 81%, including 17 pairs with fidelities higher than 90%. Our demonstration paves the way for employing the polarization-entangled quantum frequency comb in quantum network using CMOS technology as well as standard dense wavelength division multiplexing technology.

Submitted 17 April, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

Comments: 11 pages, 9 figures

Journal ref: Phys. Rev. Applied 20, 064032 (2023)

arXiv:2309.00271 [pdf, other]

Probing Inelastic Dark Matter at the LHC, FASER and STCF

Authors: Chih-Ting Lu, Jianfeng Tu, Lei Wu

Abstract: In this work, we explore the potential of probing the inelastic dark matter (DM) model with an extra U(1)D gauge symmetry at the Large Hadron Collider, ForwArd Search ExpeRiment and Super Tau Charm Factory. To saturate the observed DM relic density, the mass splitting between two light dark states has to be small enough, and thus leads to some distinctive signatures at these colliders. By searching for the long-lived particle, the displaced muon-jets, the soft leptons, and the mono-photon events, we find that the inelastic DM mass in the range of 1 MeV to 210 GeV could be tested.

Submitted 3 July, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: 22 pages, 6 figures

arXiv:2308.15622 [pdf, other]

Flexible Handover with Real-Time Robust Dynamic Grasp Trajectory Generation

Authors: Gu Zhang, Hao-Shu Fang, Hongjie Fang, Cewu Lu

Abstract: In recent years, there has been a significant effort dedicated to developing efficient, robust, and general human-to-robot handover systems. However, the area of flexible handover in the context of complex and continuous objects' motion remains relatively unexplored. In this work, we propose an approach for effective and robust flexible handover, which enables the robot to grasp moving objects with flexible motion trajectories with a high success rate. The key innovation of our approach is the generation of real-time robust grasp trajectories. We also design a future grasp prediction algorithm to enhance the system's adaptability to dynamic handover scenes. We conduct one-motion handover experiments and motion-continuous handover experiments on our novel benchmark that includes 31 diverse household objects. The system we have developed allows users to move and rotate objects in their hands within a relatively large range. The success rate of the robot grasping such moving objects is 78.15% over the entire household object benchmark.

Submitted 29 August, 2023; originally announced August 2023.

Comments: Paper accepted by IROS2023

arXiv:2308.14568 [pdf, other]

Time-Frequency Transformer: A Novel Time Frequency Joint Learning Method for Speech Emotion Recognition

Authors: Yong Wang, Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Sunan Li

Abstract: In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that the Time-Frequency Transformer can excavate global emotion patterns in the time-frequency domain of speech signal while modeling the local emotional correlations in the time domain and frequency domain respectively. For the purpose, we first design a Time Transformer and Frequency Transformer to capture the local emotion patterns between frames and inside frequency bands respectively, so as to ensure the integrity of the emotion information modeling in both time and frequency domains. Then, a Time-Frequency Transformer is proposed to mine the time-frequency emotional correlations through the local time-domain and frequency-domain emotion features for learning more discriminative global speech emotion representation. The whole process is a time-frequency joint learning process implemented by a series of Transformer models. Experiments on IEMOCAP and CASIA databases indicate that our proposed method outdoes the state-of-the-art methods.

Submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted by International Conference on Neural Information Processing (ICONIP2023)

arXiv:2308.14039 [pdf, other]

Deep Learning for Visual Localization and Mapping: A Survey

Authors: Changhao Chen, Bing Wang, Chris Xiaoxuan Lu, Niki Trigoni, Andrew Markham

Abstract: Deep learning based localization and mapping approaches have recently emerged as a new research direction and receive significant attentions from both industry and academia. Instead of creating hand-designed algorithms based on physical models or geometric theories, deep learning solutions provide an alternative to solve the problem in a data-driven way. Benefiting from the ever-increasing volumes of data and computational power on devices, these learning methods are fast evolving into a new area that shows potentials to track self-motion and estimate environmental model accurately and robustly for mobile agents. In this work, we provide a comprehensive survey, and propose a taxonomy for the localization and mapping methods using deep learning. This survey aims to discuss two basic questions: whether deep learning is promising to localization and mapping; how deep learning should be applied to solve this problem. To this end, a series of localization and mapping topics are investigated, from the learning based visual odometry, global relocalization, to mapping, and simultaneous localization and mapping (SLAM). It is our hope that this survey organically weaves together the recent works in this vein from robotics, computer vision and machine learning communities, and serves as a guideline for future researchers to apply deep learning to tackle the problem of visual localization and mapping.

Submitted 27 August, 2023; originally announced August 2023.

Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems. This is an updated version of arXiv:2006.12567

arXiv:2308.13289 [pdf, other]

JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading

Authors: Sascha Frey, Kang Li, Peer Nagy, Silvia Sapora, Chris Lu, Stefan Zohren, Jakob Foerster, Anisoara Calinescu

Abstract: Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes it is important to have large scale efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. The implementation of our simulator - JAX-LOB - is based on design choices that aim to best exploit the powers of JAX without compromising on the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages, to provide an example of how one may address an optimal execution problem with reinforcement learning, and to share some preliminary results from end-to-end RL training on GPUs.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.11608 [pdf, other]

doi 10.1103/PhysRevD.108.122005

A Measurement of Gravitational Lensing of the Cosmic Microwave Background Using SPT-3G 2018 Data

Authors: Z. Pan, F. Bianchini, W. L. K. Wu, P. A. R. Ade, Z. Ahmed, E. Anderes, A. J. Anderson, B. Ansarinejad, M. Archipley, K. Aylor, L. Balkenhol, P. S. Barry, R. Basu Thakur, K. Benabed, A. N. Bender, B. A. Benson, L. E. Bleem, F. R. Bouchet, L. Bryant, K. Byrum, E. Camphuis, J. E. Carlstrom, F. W. Carter, T. W. Cecil, C. L. Chang, et al. (111 additional authors not shown)

Abstract: We present a measurement of gravitational lensing over 1500 deg$^2$ of the Southern sky using SPT-3G temperature data at 95 and 150 GHz taken in 2018. The lensing amplitude relative to a fiducial Planck 2018 $Λ$CDM cosmology is found to be $1.020\pm0.060$, excluding instrumental and astrophysical systematic uncertainties. We conduct extensive systematic and null tests to check the robustness of the lensing measurements, and report a minimum-variance combined lensing power spectrum over angular multipoles of $50<L<2000$, which we use to constrain cosmological models. When analyzed alone and jointly with primary cosmic microwave background (CMB) spectra within the $Λ$CDM model, our lensing amplitude measurements are consistent with measurements from SPT-SZ, SPTpol, ACT, and Planck. Incorporating loose priors on the baryon density and other parameters including uncertainties on a foreground bias template, we obtain a $1σ$ constraint on $σ_8 Ω_{\rm m}^{0.25}=0.595 \pm 0.026$ using the SPT-3G 2018 lensing data alone, where $σ_8$ is a common measure of the amplitude of structure today and $Ω_{\rm m}$ is the matter density parameter. Combining SPT-3G 2018 lensing measurements with baryon acoustic oscillation (BAO) data, we derive parameter constraints of $σ_8 = 0.810 \pm 0.033$, $S_8 \equiv σ_8(Ω_{\rm m}/0.3)^{0.5}= 0.836 \pm 0.039$, and Hubble constant $H_0 =68.8^{+1.3}_{-1.6}$ km s$^{-1}$ Mpc$^{-1}$.
Using CMB anisotropy and lensing measurements from SPT-3G only, we provide independent constraints on the spatial curvature of $Ω_{K} = 0.014^{+0.023}_{-0.026}$ (95% C.L.) and the dark energy density of $Ω_Λ= 0.722^{+0.031}_{-0.026}$ (68% C.L.). When combining SPT-3G lensing data with SPT-3G CMB anisotropy and BAO data, we find an upper limit on the sum of the neutrino masses of $\sum m_ν< 0.30$ eV (95% C.L.).

Submitted 29 January, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

Comments: Bandpower and likelihood data available at https://pole.uchicago.edu/public/data/spt3g_2018_lensing/

Journal ref: Physical Review D 108.12 (2023): 122005

arXiv:2308.10574 [pdf, other]

CHORD: Category-level Hand-held Object Reconstruction via Shape Deformation

Authors: Kailin Li, Lixin Yang, Haoyu Zhen, Zenan Lin, Xinyu Zhan, Licheng Zhong, Jian Xu, Kejian Wu, Cewu Lu

Abstract: In daily life, humans utilize hands to manipulate objects. Modeling the shape of objects that are manipulated by the hand is essential for AI to comprehend daily tasks and to learn manipulation skills. However, previous approaches have encountered difficulties in reconstructing the precise shapes of hand-held objects, primarily owing to a deficiency in prior shape knowledge and inadequate data for training. As illustrated, given a particular type of tool, such as a mug, despite its infinite variations in shape and appearance, humans have a limited number of 'effective' modes and poses for its manipulation. This can be attributed to the fact that humans have mastered the shape prior of the 'mug' category, and can quickly establish the corresponding relations between different mug instances and the prior, such as where the rim and handle are located. In light of this, we propose a new method, CHORD, for Category-level Hand-held Object Reconstruction via shape Deformation. CHORD deforms a categorical shape prior for reconstructing the intra-class objects. To ensure accurate reconstruction, we empower CHORD with three types of awareness: appearance, shape, and interacting pose. In addition, we have constructed a new dataset, COMIC, of category-level hand-object interaction. COMIC contains a rich array of object instances, materials, hand interactions, and viewing directions. Extensive evaluation shows that CHORD outperforms state-of-the-art approaches in both quantitative and qualitative measures.
Code, model, and datasets are available at https://kailinli.github.io/CHORD.

Submitted 21 August, 2023; originally announced August 2023.

Comments: To be presented at ICCV 2023, Paris

arXiv:2308.09987 [pdf, other]

ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes Environment

Authors: Bingyang Zhou, Haoyu Zhou, Tianhai Liang, Qiaojun Yu, Siheng Zhao, Yuwei Zeng, Jun Lv, Siyuan Luo, Qiancai Wang, Xinyuan Yu, Haonan Chen, Cewu Lu, Lin Shao

Abstract: We present ClothesNet: a large-scale dataset of 3D clothes objects with information-rich annotations. Our dataset consists of around 4400 models covering 11 categories annotated with clothes features, boundary lines, and keypoints. ClothesNet can be used to facilitate a variety of computer vision and robot interaction tasks. Using our dataset, we establish benchmark tasks for clothes perception, including classification, boundary line segmentation, and keypoint detection, and develop simulated clothes environments for robotic interaction tasks, including rearranging, folding, hanging, and dressing. We also demonstrate the efficacy of our ClothesNet in real-world experiments. Supplemental materials and dataset are available on our project webpage.

Submitted 19 August, 2023; originally announced August 2023.

Comments: IEEE/CVF International Conference on Computer Vision (ICCV) 2023

arXiv:2308.09892 [pdf, other]

Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection

Authors: Benjamin C. Warner, Ziqi Xu, Simon Haroutounian, Thomas Kannampallil, Chenyang Lu

Abstract: Survey data can contain a high number of features while having a comparatively low quantity of examples. Machine learning models that attempt to predict outcomes from survey data under these conditions can overfit and result in poor generalizability. One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon. A relatively unexplored source of information in the feature selection process is the usage of textual names of features, which may be semantically indicative of which features are relevant to a target outcome. The relationships between feature names and target names can be evaluated using language models (LMs) to produce semantic textual similarity (STS) scores, which can then be used to select features. We examine the performance using STS to select features directly and in the minimal-redundancy-maximal-relevance (mRMR) algorithm. The performance of STS as a feature selection metric is evaluated against preliminary survey data collected as a part of a clinical study on persistent post-surgical pain (PPSP). The results suggest that features selected with STS can result in higher performance models compared to traditional feature selection algorithms.

Submitted 18 August, 2023; originally announced August 2023.

arXiv:2308.09268 [pdf, other]

Progression-Guided Temporal Action Detection in Videos

Authors: Chongkai Lu, Man-Wai Mak, Ruimin Li, Zheru Chi, Hong Fu

Abstract: We present a novel framework, Action Progression Network (APN), for temporal action detection (TAD) in videos. The framework locates actions in videos by detecting the action evolution process. To encode the action evolution, we quantify a complete action process into 101 ordered stages (0\%, 1\%, ..., 100\%), referred to as action progressions. We then train a neural network to recognize the action progressions. The framework detects action boundaries by detecting complete action processes in the videos, e.g., a video segment with detected action progressions closely follow the sequence 0\%, 1\%, ..., 100\%. The framework offers three major advantages: (1) Our neural networks are trained end-to-end, contrasting conventional methods that optimize modules separately; (2) The APN is trained using action frames exclusively, enabling models to be trained on action classification datasets and robust to videos with temporal background styles differing from those in training; (3) Our framework effectively avoids detecting incomplete actions and excels in detecting long-lasting actions due to the fine-grained and explicit encoding of the temporal structure of actions. Leveraging these advantages, the APN achieves competitive performance and significantly surpasses its counterparts in detecting long-lasting actions. With an IoU threshold of 0.5, the APN achieves a mean Average Precision (mAP) of 58.3\% on the THUMOS14 dataset and 98.9\% mAP on the DFMAD70 dataset.

Submitted 17 August, 2023; originally announced August 2023.

Comments: Under Review. Code available at https://github.com/makecent/APN

arXiv:2308.09053 [pdf]

Surface Second Harmonic Generation from Topological Dirac Semimetal PdTe$_2$

Authors: Syed Mohammed Faizanuddin, Ching-Hang Chien, Yao-Jui Chan, Si-Tong Liu, Chia-Nung Kuo, Chin Shuan Lue, Yu-Chieh Wen

Abstract: Recent experiments and calculations in topological semimetals have observed anomalously strong second-order optical nonlinearity, but whether the enhancement also occurs at surfaces of topological semimetals in general remains an open question. In this work, we tackle this problem by measuring polarization-dependent and rotational-anisotropy optical second harmonic generation (SHG) from the centrosymmetric type-II Dirac semimetal PdTe$_2$. We found the SHG to follow C$_{3v}$ surface symmetry with a time-varying intensity dictated by the oxidation kinetics of the material after its surface cleavage, indicating the surface origin of SHG. Quantitative characterization of the surface nonlinear susceptibility indicates a large out-of-plane response of PdTe$_2$ with $|\chi_{ccc}^{(2)}|$ up to 25 $\times$ 10$^{-18}$ m$^2$/V. Our results support the topological surfaces/interfaces as a new route toward applications of nonlinear optical effects with released symmetry constraints, and demonstrate SHG as a viable means for in situ study of the kinetics of topological surfaces.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08851 [pdf, other]

Toward a direct measurement of the cosmic acceleration: The pilot observation of H I 21cm absorption line at FAST

Authors: Jiangang Kang, Chang-Zhi Lu, TongJie Zhang, Ming Zhu

Abstract: This study presents results on detecting neutral atomic hydrogen (H I) 21 cm absorption in the spectrum of PKS1413+135 at redshift $z=0.24670041$. The observation was conducted by FAST, with a spectral resolution of 10 Hz, using 10 minutes of observing time. The global spectral profile is examined by modeling the absorption line using a single Gaussian function with a resolution of 10 kHz within a 2 MHz bandwidth. The goal is to determine the rate of the latest cosmic acceleration by directly measuring the redshift evolution of the H I 21 cm absorption line with the Hubble flow toward the same background quasar over a time span of a decade or longer. This will serve as a detectable signal generated by the accelerated expansion of the Universe at redshift $z < 1$, referred to as redshift drift $\dot{z}$ or the SL effect. The measured H I gas column density in this DLA system is approximately equivalent to the initial observation value, considering uncertainties of the spin temperature of a spiral host galaxy. The high signal-to-noise ratio of 57, obtained at a 10 kHz resolution, strongly supports the feasibility of using the H I 21 cm absorption line in DLA systems to accurately measure the redshift drift rate at a precision level of around $10^{-10}$ per decade.

Submitted 7 May, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

Comments: 16 pages, 6 figures, 2 tables, Accepted for publication by RAA

Showing 201–250 of 1,606 results for author: Lu, C