-
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Authors:
Open X-Embodiment Collaboration,
Abby O'Neill,
Abdul Rehman,
Abhinav Gupta,
Abhiram Maddukuri,
Abhishek Gupta,
Abhishek Padalkar,
Abraham Lee,
Acorn Pooley,
Agrim Gupta,
Ajay Mandlekar,
Ajinkya Jain,
Albert Tung,
Alex Bewley,
Alex Herzog,
Alex Irpan,
Alexander Khazatsky,
Anant Rai,
Anchit Gupta,
Andrew Wang,
Andrey Kolobov,
Anikait Singh,
Animesh Garg,
Aniruddha Kembhavi,
Annie Xie
, et al. (267 additional authors not shown)
Abstract:
Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.
Submitted 1 June, 2024; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Fast Word Error Rate Estimation Using Self-Supervised Representations For Speech And Text
Authors:
Chanho Park,
Chengsong Lu,
Mingjie Chen,
Thomas Hain
Abstract:
The quality of automatic speech recognition (ASR) is typically measured by word error rate (WER). WER estimation is a task aiming to predict the WER of an ASR system, given a speech utterance and a transcription. This task has gained increasing attention while advanced ASR systems are trained on large amounts of data. In this case, WER estimation becomes necessary in many scenarios, for example, selecting training data with unknown transcription quality or estimating the testing performance of an ASR system without ground truth transcriptions. Facing large amounts of data, the computation efficiency of a WER estimator becomes essential in practical applications. However, previous works usually did not consider it as a priority. In this paper, a Fast WER estimator (Fe-WER) using self-supervised learning representation (SSLR) is introduced. The estimator is built upon SSLR aggregated by average pooling. The results show that Fe-WER outperformed the e-WER3 baseline relatively by 19.69% and 7.16% on Ted-Lium3 in both evaluation metrics of root mean square error and Pearson correlation coefficient, respectively. Moreover, the estimation weighted by duration was 10.43% when the target was 10.88%. Lastly, the inference speed was about 4x in terms of a real-time factor.
Submitted 12 October, 2023;
originally announced October 2023.
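
For readers unfamiliar with the metric that the abstract above estimates, the following is a minimal sketch of how a reference WER is conventionally computed: word-level edit distance divided by the number of reference words. It is not the Fe-WER estimator itself, which predicts this quantity without a reference. The function names are illustrative, and the duration-weighted average is only an assumed reading of "estimation weighted by duration" in the abstract.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def duration_weighted_wer(wers: Sequence[float], durations: Sequence[float]) -> float:
    """Average per-utterance WERs weighted by utterance duration (assumed scheme)."""
    total = sum(durations)
    if total <= 0:
        raise ValueError("total duration must be positive")
    return sum(w * d for w, d in zip(wers, durations)) / total
```

An estimator such as the one described would be trained to predict `word_error_rate` values from speech and text representations, so that no reference transcript is needed at inference time.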
-
ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation
Authors:
Bo Peng,
Xinyuan Chen,
Yaohui Wang,
Chaochao Lu,
Yu Qiao
Abstract:
Recent works have successfully extended large-scale text-to-image models to the video domain, producing promising results but at a high computational cost and requiring a large amount of video data. In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e.g., Stable Diffusion). ConditionVideo generates realistic dynamic videos from random noise or given scene videos. Our method explicitly disentangles the motion representation into condition-guided and scenery motion components. To this end, the ConditionVideo model is designed with a UNet branch and a control branch. To improve temporal coherence, we introduce sparse bi-directional spatial-temporal attention (sBiST-Attn). The 3D control network extends the conventional 2D controlnet model, aiming to strengthen conditional generation accuracy by additionally leveraging the bi-directional frames in the temporal domain. Our method exhibits superior performance in terms of frame consistency, clip score, and conditional accuracy, outperforming other compared methods.
Submitted 23 May, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Many-body entanglement and spectral clusters in the extended hard-core bosonic Hatano-Nelson model
Authors:
Chao-Ze Lu,
Gaoyong Sun
Abstract:
We study many-body entanglements and spectra of the extended bosonic Hatano-Nelson model in the hard-core limit. We show that the system undergoes a phase transition from a gapless phase to a charge density wave phase accompanied by a $\mathcal{PT}$ transition in the first excited state. The phase transition is characterized by the crossing of the ground-state biorthogonal order parameter and the sudden change of the first excited-state entanglement entropy. The gapless phase is verified by the logarithmic scaling of the ground-state entanglement entropy with the central charge $c=1$. Furthermore, we show that all energy spectral clusters would form ellipses in strong nearest-neighbor interactions, for which we establish a universal scaling law. The lengths of the major and minor axes are shown to obey power laws with respect to the nearest-neighbor interaction. The exact expressions are derived for the numbers of energy levels on the outermost elliptic ring of each cluster.
Submitted 11 April, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Score Regularized Policy Optimization through Diffusion Behavior
Authors:
Huayu Chen,
Cheng Lu,
Zhengyi Wang,
Hang Su,
Jun Zhu
Abstract:
Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies. However, sampling from diffusion policies is considerably slow because it necessitates tens to hundreds of iterative inference steps for one action. To address this issue, we propose to extract an efficient deterministic inference policy from critic models and pretrained diffusion behavior models, leveraging the latter to directly regularize the policy gradient with the behavior distribution's score function during optimization. Our method enjoys powerful generative capabilities of diffusion modeling while completely circumventing the computationally intensive and time-consuming diffusion sampling scheme, both during training and evaluation. Extensive results on D4RL tasks show that our method boosts action sampling speed by more than 25 times compared with various leading diffusion-based methods in locomotion tasks, while still maintaining state-of-the-art performance.
Submitted 14 March, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Generate Coherent Rays Directly
Authors:
Fengqi Liu,
Zaonan Tan,
Weilai Xiang,
Chenhao Lu,
Dan Li,
Xu Gong,
Yulong Shi,
Songnan Shi,
Qilong Kou,
Bo Hu
Abstract:
The path tracing method generates incoherent rays by randomly sampling directions. This randomness makes it unsuitable for modern processor architectures that rely on coherence to achieve optimal performance. Many efforts have been made to address this issue by reordering rays based on their origin, end, or direction to enhance coherence. However, a drawback of reordering methods is the need to encode and sort rays before tracing, introducing additional overhead. We propose a technique to generate coherent rays directly by reusing the direction. Additionally, we introduce an interleaved reuse domain partition method to mitigate the impact of sampling correlation resulting from direction reuse. We demonstrate the effectiveness of our approach across various scenes, establishing its superiority over reordering methods.
Submitted 11 October, 2023;
originally announced October 2023.
-
Tunable non-Lifshitz-Kosevich temperature dependence of Shubnikov-de Haas oscillation amplitudes in SmSb
Authors:
Wei Zhang,
C. N. Kuo,
S. T. Kuo,
Chun Wa So,
Jianyu Xie,
Kwing To Lai,
Wing Chi Yu,
C. S. Lue,
Hoi Chun Po,
Swee K. Goh
Abstract:
The Lifshitz-Kosevich (LK) theory is the pillar of magnetic quantum oscillations, which have been extensively applied to characterize a wide range of metallic states. In this study, we focus on the Shubnikov-de Haas (SdH) effect observed in SmSb, a rare-earth monopnictide. We observed a significant departure from the expected LK theory near $T_N=2.4$~K: both a peak-like anomaly and an enhancement in the temperature dependence of quantum oscillation amplitude are seen in SmSb. Moreover, we discovered a remarkable sensitivity of the SdH amplitudes to sample purity. By adjusting the sample purity, we were able to tune the temperature dependence of the $α$ band's SdH amplitudes from a peak-like anomalous behavior to an enhancement. Therefore, SdH oscillations from the $α$ band connect the two well-known non-LK behaviours, controllable through varying the sample purity, paving the way for developing further understanding of the mechanism leading to the anomalous quantum oscillations.
Submitted 10 October, 2023;
originally announced October 2023.
-
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints
Authors:
Ruiqi Wang,
Hanyang Liu,
Jiaming Qiu,
Moran Xu,
Roch Guerin,
Chenyang Lu
Abstract:
IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.
Submitted 8 October, 2023;
originally announced October 2023.
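
The taildrop idea in the abstract above can be illustrated with a small sketch. It assumes that "stochastic taildrop" means zeroing a randomly chosen tail of the encoder's feature vector during training, and that at inference the receiver uses whatever prefix of features arrived by the deadline. All function names, the uniform choice of cut point, and the fixed bytes-per-feature budget are hypothetical; the abstract does not specify these details.

```python
import random
from typing import List


def taildrop(features: List[float], keep: int) -> List[float]:
    """Keep the first `keep` features and zero out the rest."""
    if not 0 <= keep <= len(features):
        raise ValueError("keep must be between 0 and len(features)")
    return features[:keep] + [0.0] * (len(features) - keep)


def stochastic_taildrop(features: List[float], rng: random.Random) -> List[float]:
    """Training-time sketch: drop a random-length tail (uniform cut point is an assumption)."""
    keep = rng.randint(0, len(features))  # randint is inclusive on both ends
    return taildrop(features, keep)


def features_received(
    features: List[float],
    bytes_per_feature: int,
    bandwidth_bytes_per_s: float,
    deadline_s: float,
) -> List[float]:
    """Inference-time sketch: only the prefix that fits the byte budget arrives."""
    budget = int(bandwidth_bytes_per_s * deadline_s)
    arrived = min(len(features), budget // bytes_per_feature)
    return taildrop(features, arrived)
```

Because training exposes the decoder to every possible cut point, earlier features are pushed to carry the information that matters most, which is what makes sending them in order sensible when bandwidth varies.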
-
DeepQTest: Testing Autonomous Driving Systems with Reinforcement Learning and Real-world Weather Data
Authors:
Chengjie Lu,
Tao Yue,
Man Zhang,
Shaukat Ali
Abstract:
Autonomous driving systems (ADSs) are capable of sensing the environment and making driving decisions autonomously. These systems are safety-critical, and testing them is one of the important approaches to ensure their safety. However, due to the inherent complexity of ADSs and the high dimensionality of their operating environment, the number of possible test scenarios for ADSs is infinite. Besides, the operating environment of ADSs is dynamic, continuously evolving, and full of uncertainties, which requires a testing approach adaptive to the environment. In addition, existing ADS testing techniques have limited effectiveness in ensuring the realism of test scenarios, especially the realism of weather conditions and their changes over time. Recently, reinforcement learning (RL) has demonstrated great potential in addressing challenging problems, especially those requiring constant adaptations to dynamic environments. To this end, we present DeepQTest, a novel ADS testing approach that uses RL to learn environment configurations with a high chance of revealing abnormal ADS behaviors. Specifically, DeepQTest employs Deep Q-Learning and adopts three safety and comfort measures to construct the reward functions. To ensure the realism of generated scenarios, DeepQTest defines a set of realistic constraints and introduces real-world weather conditions into the simulated environment. We employed three comparison baselines, i.e., random, greedy, and a state-of-the-art RL-based approach DeepCOllision, for evaluating DeepQTest on an industrial-scale ADS. Evaluation results show that DeepQTest demonstrated significantly better effectiveness in terms of generating scenarios leading to collisions and ensuring scenario realism compared with the baselines. In addition, among the three reward functions implemented in DeepQTest, Time-To-Collision is recommended as the best design according to our study.
Submitted 8 October, 2023;
originally announced October 2023.
-
Learning to Rank Onset-Occurring-Offset Representations for Micro-Expression Recognition
Authors:
Jie Zhu,
Yuan Zong,
Jingang Shi,
Cheng Lu,
Hongli Chang,
Wenming Zheng
Abstract:
This paper focuses on the research of micro-expression recognition (MER) and proposes a flexible and reliable deep learning method called learning to rank onset-occurring-offset representations (LTR3O). The LTR3O method introduces a dynamic and reduced-size sequence structure known as 3O, which consists of onset, occurring, and offset frames, for representing micro-expressions (MEs). This structure facilitates the subsequent learning of ME-discriminative features. A noteworthy advantage of the 3O structure is its flexibility, as the occurring frame is randomly extracted from the original ME sequence without the need for accurate frame spotting methods. Based on the 3O structures, LTR3O generates multiple 3O representation candidates for each ME sample and incorporates well-designed modules to measure and calibrate their emotional expressiveness. This calibration process ensures that the distribution of these candidates aligns with that of macro-expressions (MaMs) over time. Consequently, the visibility of MEs can be implicitly enhanced, facilitating the reliable learning of more discriminative features for MER. Extensive experiments were conducted to evaluate the performance of LTR3O using three widely-used ME databases: CASME II, SMIC, and SAMM. The experimental results demonstrate the effectiveness and superior performance of LTR3O, particularly in terms of its flexibility and reliability, when compared to recent state-of-the-art MER methods.
Submitted 6 October, 2023;
originally announced October 2023.
-
Multi-alpha Boson Gas state in Fusion Evaporation Reaction and Three-body Force
Authors:
Taofeng Wang,
Ziming Li,
R. B. Wiringa,
Minliang Liu,
Jiansong Wang,
Yanyun Yang,
Qinghua He,
Zhiyu Sun,
Chengjian Lin,
M. Assié,
Y. Ayyad,
D. Beaumel,
Zhen Bai,
Fangfang Duan,
Zhihao Gao,
Song Guo,
Yue Hu,
Wei Jiang,
F. Kobayashi,
Chengui Lu,
Junbing Ma,
Peng Ma,
P. Napolitani,
G. Verde,
Jianguo Wang
, et al. (11 additional authors not shown)
Abstract:
The experimental evidence for the $α$ Boson gas state in the $^{11}$C+$^{12}$C$\rightarrow$$^{23}$Mg$^{\ast}$ fusion evaporation reaction is presented. By measuring the $α$ emission spectrum with multiplicity 2 and 3, we provide insight into the existence of a three-body force among $α$ particles. The observed spectrum exhibited distinct tails corresponding to $α$ particles emitted in pairs and triplets, consistent with the model calculations of AV18-UX and chiral effective field theory of NV2-3-la*, indicating the formation of $α$ clusters with three-body force in the Boson gas state.
Submitted 6 October, 2023;
originally announced October 2023.
-
Aspect of Clusters Correlation at Light Nuclei Excited State
Authors:
Ziming Li,
Jie Zhu,
Taofeng Wang,
Minliang Liu,
Jiansong Wang,
Yanyun Yang,
Chengjian Lin,
Zhiyu Sun,
Qinghua He,
M. Assié,
Y. Ayyad,
D. Beaumel,
Zhen Bai,
Fangfang Duan,
Zhihao Gao,
Song Guo,
Yue Hu,
Wei Jiang,
F. Kobayashi,
Chengui Lu,
Junbing Ma,
Peng Ma,
P. Napolitani,
G. Verde,
Jianguo Wang
, et al. (11 additional authors not shown)
Abstract:
The correlation of $αα$ was probed via measuring the transverse momentum $p_{T}$ and width $δp_{T}$ of one $α$, for the first time, which represents the spatial and dynamical essentialities of the initial coupling state in $^{8}$Be nucleus. The weighted interaction vertex of 3$α$ reflected by the magnitudes of their relative momentums and relative emission angles proves the isosceles triangle configuration for 3$α$ at the high excited energy analogous Hoyle states.
Submitted 6 October, 2023;
originally announced October 2023.
-
Variation of Tensor Force due to Nuclear Medium Effect
Authors:
Ziming Li,
Jie Zhu,
Taofeng Wang,
Minliang Liu,
Jiansong Wang,
Yanyun Yang,
Chengjian Lin,
Zhiyu Sun,
Qinghua He,
M. Assié,
Y. Ayyad,
D. Beaumel,
Zhen Bai,
Fangfang Duan,
Zhihao Gao,
Song Guo,
Yue Hu,
Wei Jiang,
F. Kobayashi,
Chengui Lu,
Junbing Ma,
Peng Ma,
P. Napolitani,
G. Verde,
Jianguo Wang
, et al. (11 additional authors not shown)
Abstract:
The enhancement of $J^π(T)$=3$^{+}$(0) state with isospin $T=0$ excited by the tensor force in the free $^{6}$Li nucleus has been observed, for the first time, relative to a shrinkable excitation in the $^{6}$Li cluster component inside its host nucleus. Comparatively, the excitation of $J^π(T)$=0$^{+}$(1) state with isospin $T=1$ for these two $^{6}$Li formations take on an approximately equal excitation strength. The mechanism of such tensor force effect was proposed due to the intensive nuclear medium role on isospin $T$=0 state.
Submitted 6 October, 2023;
originally announced October 2023.
-
Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases
Authors:
Xinpeng Liu,
Yong-Lu Li,
Ailing Zeng,
Zizheng Zhou,
Yang You,
Cewu Lu
Abstract:
Motion understanding aims to establish a reliable mapping between motion and action semantics, yet this is a challenging many-to-many problem. An abstract action semantic (e.g., walk forwards) could be conveyed by perceptually diverse motions (walking with arms up or swinging). In contrast, a motion could carry different semantics w.r.t. its context and intention. This makes an elegant mapping between them difficult. Previous attempts adopted direct-mapping paradigms with limited reliability. Also, current automatic metrics fail to provide reliable assessments of the consistency between motions and action semantics. We identify the source of these problems as the significant gap between the two modalities. To alleviate this gap, we propose Kinematic Phrases (KP) that take the objective kinematic facts of human motion with proper abstraction, interpretability, and generality. Based on KP, we can unify a motion knowledge base and build a motion understanding system. Meanwhile, KP can be automatically converted from motions to text descriptions with no subjective bias, inspiring Kinematic Prompt Generation (KPG) as a novel white-box motion generation benchmark. In extensive experiments, our approach shows superiority over other methods. Our project is available at https://foruck.github.io/KP/.
Submitted 11 July, 2024; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition
Authors:
Yan Zhao,
Yuan Zong,
Jincen Wang,
Hailun Lian,
Cheng Lu,
Li Zhao,
Wenming Zheng
Abstract:
In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICASSP work, deep implicit distribution alignment networks (DIDAN), whose key contribution lies in the introduction of a novel regularization term called implicit distribution alignment (IDA). This term allows DIDAN trained on source (training) speech samples to remain applicable to predicting emotion labels for target (testing) speech samples, regardless of corpus variance in cross-corpus SER. To further enhance this method, we extend IDA to layer-adapted IDA (LIDA), resulting in LIDAN. This layer-adapted extension consists of three modified IDA terms that consider emotion labels at different levels of granularity. These terms are strategically arranged within different fully connected layers in LIDAN, aligning with the increasing emotion-discriminative abilities with respect to the layer depth. This arrangement enables LIDAN to more effectively learn emotion-discriminative and corpus-invariant features for SER across various corpora compared to DIDAN. It is also worth mentioning that unlike most existing methods that rely on estimating statistical moments to describe pre-assumed explicit distributions, both IDA and LIDA take a different approach. They utilize an idea of target sample reconstruction to directly bridge the feature distribution gap without making assumptions about their distribution type. As a result, DIDAN and LIDAN can be viewed as implicit cross-corpus SER methods. To evaluate LIDAN, we conducted extensive cross-corpus SER experiments on EmoDB, eNTERFACE, and CASIA corpora. The experimental results demonstrate that LIDAN surpasses recent state-of-the-art explicit unsupervised DA methods in tackling cross-corpus SER tasks.
Submitted 5 October, 2023;
originally announced October 2023.
-
Interplay of two $E_g$ orbitals in Superconducting La$_3$Ni$_2$O$_7$ Under Pressure
Authors:
Chen Lu,
Zhiming Pan,
Fan Yang,
Congjun Wu
Abstract:
The discovery of high-$T_c$ superconductivity (SC) in La$_3$Ni$_2$O$_7$ (LNO) has aroused a great deal of interest. Previously, it was proposed that the Ni-$3d_{z^2}$ orbital is crucial to realize the high-$T_c$ SC in LNO: The preformed Cooper pairs therein acquire coherence via hybridization with the $3d_{x^2-y^2}$ orbital to form the SC. However, we held a different viewpoint that the interlayer pairing $s$-wave SC is induced by the $3d_{x^2-y^2}$ orbital, driven by the strong interlayer superexchange interaction. To include effects from both $E_g$-orbitals, we establish a two-orbital bilayer $t$-$J$ model. Our calculations reveal that due to the no-double-occupancy constraint, the $3d_{x^2-y^2}$ band and the $3d_{z^2}$ bonding band are flattened by a factor of about 2 and 10, respectively, which is consistent with recent angle-resolved-photo-emission-spectroscopy measurements. Consequently, a high temperature SC can hardly be induced in the $3d_{z^2}$-orbital owing to the difficulty of developing phase coherence. However, it can be easily achieved by the $3d_{x^2-y^2}$ orbital under realistic interaction strength. With electron doping, the $3d_{z^2}$-band gradually dives below the Fermi level, but $T_c$ continues to enhance, suggesting that it is not necessary for the high-$T_c$ SC in LNO. With hole doping, $T_c$ initially drops and then rises, accompanied by the crossover from the BCS to BEC-type superconducting transitions.
Submitted 11 December, 2023; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
Authors:
Matthew Thomas Jackson,
Minqi Jiang,
Jack Parker-Holder,
Risto Vuorio,
Chris Lu,
Gregory Farquhar,
Shimon Whiteson,
Jakob Nicolaus Foerster
Abstract:
The past decade has seen vast progress in deep reinforcement learning (RL) on the back of algorithms manually designed by human researchers. Recently, it has been shown that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks. Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a generalization gap when these algorithms are applied to unseen environments. In this work, we examine how characteristics of the meta-training distribution impact the generalization performance of these algorithms. Motivated by this analysis and building on ideas from Unsupervised Environment Design (UED), we propose a novel approach for automatically generating curricula to maximize the regret of a meta-learned optimizer, in addition to a novel approximation of regret, which we name algorithmic regret (AR). The result is our method, General RL Optimizers Obtained Via Environment Design (GROOVE). In a series of experiments, we show that GROOVE achieves superior generalization to LPG, and evaluate AR against baseline metrics from UED, identifying it as a critical component of environment design in this setting. We believe this approach is a step towards the discovery of truly general RL algorithms, capable of solving a wide range of real-world environments.
Submitted 4 October, 2023;
originally announced October 2023.
-
Current direction dependent magnetotransport in CuTe
Authors:
Ying Kit Tsui,
C. N. Kuo,
C. E. Hsu,
Wei Zhang,
Wenyan Wang,
Shanmin Wang,
Wing Chi Yu,
H. C. Hsueh,
C. S. Lue,
Swee K. Goh
Abstract:
Despite being a layered, easily-exfoliated compound, copper monotelluride (CuTe) features an unusual quasi-one-dimensional charge density wave below $T_{\rm CDW}\approx335$ K. Within a CuTe layer, the electrical resistivity depends sensitively on the direction of the electrical current. Here, we use magnetotransport to probe the metallic state of CuTe with two distinct in-plane current directions. When the current flows along the $a$-axis ($I//a$), the magnetoresistance exhibits a downward curvature as the magnetic field increases. On the other hand, when the current is along the $b$-axis ($I//b$), the magnetoresistance shows the opposite curvature. Our analysis uncovers a violation of Kohler scaling, but only for $I//a$. Shubnikov-de Haas oscillations are detected at low temperatures. Our results shed light on the nature of the metallic state in CuTe with the development of the charge density wave.
Submitted 4 October, 2023;
originally announced October 2023.
-
Synthesis technique and electron beam damage study of nanometer-thin single-crystalline Thymine
Authors:
Hazem Daoud,
Sreelaja Pulleri Vadhyar,
Ehsan Nikbin,
Cheng Lu,
R. J. Dwayne Miller
Abstract:
Samples suitable for electron diffraction studies must satisfy certain characteristics such as having a thickness in the range of 10 - 100 nm. We report, to our knowledge, the first successful synthesis technique of nanometer-thin sheets of single-crystalline thymine suitable for electron diffraction and spectroscopy studies. This development provides a well defined system to explore issues related to UV photochemistry of DNA and high intrinsic stability essential to maintaining integrity of genetic information. The crystals are grown using the evaporation technique and the nanometer-thin sheets are obtained via microtoming. The sample is characterized via x-ray diffraction (XRD) and is subsequently studied using electron diffraction via a transmission electron microscope (TEM). Thymine is found to be more radiation resistant than similar molecular moieties (e.g., carbamazepine) by a factor of 5. This raises interesting questions about the role of the fast relaxation processes of electron scattering-induced excited states, extending the concept of radiation hardening beyond photoexcited states. The high stability of thymine in particular opens the door for further studies of these ultrafast relaxation processes giving rise to the high stability of DNA to UV radiation.
Submitted 12 January, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Learning Decentralized Flocking Controllers with Spatio-Temporal Graph Neural Network
Authors:
Siji Chen,
Yanshen Sun,
Peihan Li,
Lifeng Zhou,
Chang-Tien Lu
Abstract:
Recently, a line of research has explored the use of graph neural networks (GNNs) for decentralized control in swarm robotics. However, it has been observed that relying solely on the states of immediate neighbors is insufficient to imitate a centralized control policy. To address this limitation, prior studies proposed incorporating $L$-hop delayed states into the computation. While this approach shows promise, it can lead to a lack of consensus among distant flock members and the formation of small clusters, consequently resulting in the failure of cohesive flocking behaviors. Instead, our approach leverages a spatiotemporal GNN, named STGNN, that encompasses both spatial and temporal expansions. The spatial expansion collects delayed states from distant neighbors, while the temporal expansion incorporates previous states from immediate neighbors. The broader and more comprehensive information gathered from both expansions results in more effective and accurate predictions. We develop an expert algorithm for controlling a swarm of robots and employ imitation learning to train our decentralized STGNN model based on the expert algorithm. We simulate the proposed STGNN approach in various settings, demonstrating its decentralized capacity to emulate the global expert algorithm. Further, we implement our approach to achieve cohesive flocking, leader following and obstacle avoidance by a group of Crazyflie drones. The performance of STGNN underscores its potential as an effective and reliable approach for achieving cohesive flocking, leader following and obstacle avoidance tasks.
Submitted 2 October, 2023; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation
Authors:
Jianning Deng,
Gabriel Chan,
Hantao Zhong,
Chris Xiaoxuan Lu
Abstract:
This paper presents a novel framework for robust 3D object detection from point clouds via cross-modal hallucination. Our proposed approach is agnostic to either hallucination direction between LiDAR and 4D radar. We introduce multiple alignments on both spatial and feature levels to achieve simultaneous backbone refinement and hallucination generation. Specifically, spatial alignment is proposed to deal with the geometry discrepancy for better instance matching between LiDAR and radar. The feature alignment step further bridges the intrinsic attribute gap between the sensing modalities and stabilizes the training. The trained object detection models can deal with difficult detection cases better, even though only single-modal data is used as the input during the inference stage. Extensive experiments on the View-of-Delft (VoD) dataset show that our proposed method outperforms the state-of-the-art (SOTA) methods for both radar and LiDAR object detection while maintaining competitive efficiency in runtime. Code is available at https://github.com/DJNing/See_beyond_seeing.
Submitted 12 March, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Qwen Technical Report
Authors:
Jinze Bai,
Shuai Bai,
Yunfei Chu,
Zeyu Cui,
Kai Dang,
Xiaodong Deng,
Yang Fan,
Wenbin Ge,
Yu Han,
Fei Huang,
Binyuan Hui,
Luo Ji,
Mei Li,
Junyang Lin,
Runji Lin,
Dayiheng Liu,
Gao Liu,
Chengqiang Lu,
Keming Lu,
Jianxin Ma,
Rui Men,
Xingzhang Ren,
Xuancheng Ren,
Chuanqi Tan,
Sinan Tan
, et al. (23 additional authors not shown)
Abstract:
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models.
Submitted 28 September, 2023;
originally announced September 2023.
-
GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects
Authors:
Qiaojun Yu,
Junbo Wang,
Wenhai Liu,
Ce Hao,
Liu Liu,
Lin Shao,
Weiming Wang,
Cewu Lu
Abstract:
Articulated objects like cabinets and doors are widespread in daily life. However, directly manipulating 3D articulated objects is challenging because they have diverse geometrical shapes, semantic categories, and kinetic constraints. Prior works mostly focused on recognizing and manipulating articulated objects with specific joint types. They can either estimate the joint parameters or distinguish suitable grasp poses to facilitate trajectory planning. Although these approaches have succeeded in certain types of articulated objects, they lack generalizability to unseen objects, which significantly impedes their application in broader scenarios. In this paper, we propose a novel framework of Generalizable Articulation Modeling and Manipulating for Articulated Objects (GAMMA), which learns both articulation modeling and grasp pose affordance from diverse articulated objects with different categories. In addition, GAMMA adopts adaptive manipulation to iteratively reduce the modeling errors and enhance manipulation performance. We train GAMMA with the PartNet-Mobility dataset and evaluate with comprehensive experiments in SAPIEN simulation and real-world Franka robot. Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects. We will open-source all codes and datasets in both simulation and real robots for reproduction in the final version. Images and videos are published on the project website at: http://sites.google.com/view/gamma-articulation
Submitted 1 March, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild
Authors:
Hongjie Fang,
Hao-Shu Fang,
Yiming Wang,
Jieji Ren,
Jingjing Chen,
Ruo Zhang,
Weiming Wang,
Cewu Lu
Abstract:
While humans can use parts of their arms other than the hands for manipulations like gathering and supporting, whether robots can effectively learn and perform the same type of operations remains relatively unexplored. As these manipulations require joint-level control to regulate the complete poses of the robots, we develop AirExo, a low-cost, adaptable, and portable dual-arm exoskeleton, for teleoperation and demonstration collection. As collecting teleoperated data is expensive and time-consuming, we further leverage AirExo to collect cheap in-the-wild demonstrations at scale. Under our in-the-wild learning framework, we show that with only 3 minutes of the teleoperated demonstrations, augmented by diverse and extensive in-the-wild data collected by AirExo, robots can learn a policy that is comparable to or even better than one learned from teleoperated demonstrations lasting over 20 minutes. Experiments demonstrate that our approach enables the model to learn a more general and robust policy across the various stages of the task, enhancing the success rates in task completion even with the presence of disturbances. Project website: https://airexo.github.io/
Submitted 9 May, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
QuBEC: Boosting Equivalence Checking for Quantum Circuits with QEC Embedding
Authors:
Chao Lu,
Navnil Choudhury,
Utsav Banerjee,
Abdullah Ash Saki,
Kanad Basu
Abstract:
Quantum computing has proven to be capable of accelerating many algorithms by performing tasks that classical computers cannot. Currently, Noisy Intermediate-Scale Quantum (NISQ) machines suffer from scalability and noise issues that keep them from serving as commercial quantum computers. However, physical and software improvements to a quantum computer can efficiently control quantum gate noise. As the complexity of quantum algorithms and their implementation increases, software control of quantum circuits may lead to a more intricate design. Consequently, the verification of quantum circuits becomes crucial in ensuring the correctness of the compilation, along with other processes, including quantum error correction and assertions, that can increase the fidelity of quantum circuits. In this paper, we propose a Decision Diagram-based quantum equivalence checking approach, QuBEC, that incurs lower latency than existing techniques, while accounting for circuits with quantum error correction redundancy. Our proposed methodology reduces verification time on certain benchmark circuits by up to $271.49 \times$, while the number of Decision Diagram nodes required is reduced by up to $798.31 \times$, compared to state-of-the-art strategies. The proposed QuBEC framework can contribute to the advancement of quantum computing by enabling faster and more efficient verification of quantum circuits, paving the way for the development of larger and more complex quantum algorithms.
Submitted 19 September, 2023;
originally announced September 2023.
-
RaTrack: Moving Object Detection and Tracking with 4D Radar Point Cloud
Authors:
Zhijun Pan,
Fangqiang Ding,
Hantao Zhong,
Chris Xiaoxuan Lu
Abstract:
Mobile autonomy relies on the precise perception of dynamic environments. Robustly tracking moving objects in 3D world thus plays a pivotal role for applications like trajectory prediction, obstacle avoidance, and path planning. While most current methods utilize LiDARs or cameras for Multiple Object Tracking (MOT), the capabilities of 4D imaging radars remain largely unexplored. Recognizing the challenges posed by radar noise and point sparsity in 4D radar data, we introduce RaTrack, an innovative solution tailored for radar-based tracking. Bypassing the typical reliance on specific object types and 3D bounding boxes, our method focuses on motion segmentation and clustering, enriched by a motion estimation module. Evaluated on the View-of-Delft dataset, RaTrack showcases superior tracking precision of moving objects, largely surpassing the performance of the state of the art. We release our code and model at https://github.com/LJacksonPan/RaTrack.
Submitted 11 March, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Antiferromagnetic to Ferrimagnetic Phase Transition and Possible Phase Coexistence in Polar Magnets (Fe$_{1-x}$Mn$_x$)$_2$Mo$_3$O$_8$
Authors:
Yuting Chang,
Lei Gao,
Yunlong Xie,
Bin You,
Yong Liu,
Rui Xiong,
Junfeng Wang,
Chengliang Lu,
JunMing Liu
Abstract:
In the present work, magnetic properties of single crystal (Fe$_{1-x}$Mn$_x$)$_2$Mo$_3$O$_8$ ($0<x<1$) have been studied by performing extensive measurements. A detailed magnetic phase diagram is built up, in which antiferromagnetic state dominates for $x<0.25$ and ferrimagnetic phase arises for $x>0.3$. Meanwhile, sizeable electric polarization of spin origin is commonly observed in all samples, no matter what the magnetic state is. For the samples hosting a ferrimagnetic state, square-like magnetic hysteresis loops are revealed, while the remnant magnetization and coercive field can be tuned drastically by simply varying the Mn-content or temperature. Possible coexistence of the antiferromagnetic and ferrimagnetic phases is proposed to be responsible for the remarkable modulation of magnetic properties in the samples.
Submitted 16 September, 2023;
originally announced September 2023.
-
Structural origin of the Jeff=1/2 antiferromagnetic phase in Ga-doped Sr2IrO4
Authors:
H. W. Wang,
L. Y. Zhang,
N. Hu,
B. You,
Y. T. Chang,
S. L. Yuan,
C. L. Lu,
J. M. Liu
Abstract:
Sr2IrO4 hosts a novel Jeff=1/2 Mott state and quasi-two-dimensional antiferromagnetic order, providing a unique avenue for exploring emergent states of matter and functions that are extraordinarily sensitive to any structural variations. While the correlation between physical properties and lattice structure in Sr2IrO4 has been a focus of research in the past decade, a common perception assumes that the magnetic ordering is essentially determined by the Ir-O-Ir bond angle. Therefore, a delicate modulation of this angle, and consequently a major modulation of the magnetic ordering, by chemical doping such as Ga at the Ir site has been extensively investigated and widely accepted. In this work, however, we present a comprehensive set of structure and magnetism data on a series of single-crystal and polycrystalline Sr2Ir1-xGaxO4 samples, revealing a substantial difference in the Néel temperature TN between the two types of samples: the TN value for the polycrystalline sample with x = 0.09 is 64 K higher than that of the single-crystal sample with the same x. Our systematic investigations demonstrate the crucial role of the c/a ratio in tuning the interlayer coupling and thereby the Néel point TN, i.e. a higher TN can be achieved as c/a is reduced. The notable differences in structural parameters between the two groups of samples are probably caused by additional strain due to the massive grain boundaries in polycrystalline samples. The present work suggests an additional ingredient of physics that is essential in modulating the emergent properties in Sr2IrO4 and probably other iridates.
Submitted 16 September, 2023;
originally announced September 2023.
-
Strain tuned magnetotransport of Jeff=1/2 antiferromagnetic Sr2IrO4 thin films
Authors:
N. Hu,
Y. K. Weng,
K. Chen,
B. You,
Y. Liu,
Y. T. Chang,
R. Xiong,
S. Dong,
C. L. Lu
Abstract:
In this work, we report the observation of strain effects on the physical properties of Sr2IrO4 thin films grown on SrTiO3 (001) and LaAlO3 (001) substrates. It is found that the film on LaAlO3 with compressive strain has a lower antiferromagnetic transition temperature (TN~210 K) than the film on SrTiO3 (TN~230 K) with tensile strain, which is probably caused by modified interlayer coupling. Interestingly, magnetoresistance due to pseudospin-flip of the film on LaAlO3 is much larger than that of the tensile-strained film on SrTiO3, and robust anisotropic magnetoresistance is observed in the former, but H-driven reversal behavior is seen in the latter. By performing first principles calculations, it is revealed that epitaxial strain plays an efficient role in tuning the canting angle of Jeff=1/2 moments and thus the net moment at every IrO2 layer, responsible for the difference in magnetoresistance between the films. The reversal of anisotropic magnetoresistance in the thin film on SrTiO3 can be ascribed to stabilization of a metastable state with smaller bandgap as the Jeff=1/2 moments are aligned along the diagonal of the basal plane by H. However, theoretical calculations reveal much higher magnetocrystalline anisotropy energy in the film on LaAlO3. This makes it difficult to drive the Jeff=1/2 moments to reach the diagonal and thereby the metastable state, explaining the distinct anisotropic magnetoresistance between the two samples in a qualitative sense. Our findings indicate that strain can be a highly efficient means to engineer the functionalities of the Jeff=1/2 antiferromagnet Sr2IrO4.
Submitted 16 September, 2023;
originally announced September 2023.
-
Colossal linear magnetoelectricity in polar magnet Fe2Mo3O8
Authors:
Yuting Chang,
Yakui Weng,
Yunlong Xie,
Bin You,
Junfeng Wang,
Liang Li,
Jun-Ming Liu,
Shuai Dong,
Chengliang Lu
Abstract:
The linear magnetoelectric effect is an attractive phenomenon in condensed matter and provides indispensable technological functionalities. Here a colossal linear magnetoelectric effect with diagonal component $\alpha_{33}$ reaching up to ~480 ps/m is reported in the polar magnet Fe2Mo3O8. This effect persists over a broad range of magnetic field (~20 T) and is orders of magnitude larger than values reported in the literature. This exceptional experimental observation is well reproduced by a theoretical model that unveils the vital contribution of exchange striction, while the sign difference of the magnetocrystalline anisotropy can also be reasonably explained.
Submitted 16 September, 2023;
originally announced September 2023.
-
Quantum Pseudorandom Scramblers
Authors:
Chuhan Lu,
Minglong Qin,
Fang Song,
Penghui Yao,
Mingnan Zhao
Abstract:
Quantum pseudorandom state generators (PRSGs) have stimulated exciting developments in recent years. A PRSG, on a fixed initial (e.g., all-zero) state, produces an output state that is computationally indistinguishable from a Haar random state. However, pseudorandomness of the output state is not guaranteed on other initial states. In fact, known PRSG constructions provably fail on some initial state.
In this work, we propose and construct quantum Pseudorandom State Scramblers (PRSSs), which can produce a pseudorandom state on an arbitrary initial state. In the information-theoretical setting, we obtain a scrambler which maps an arbitrary initial state to a distribution of quantum states that is close to Haar random in total variation distance. As a result, our PRSS exhibits a dispersing property. Loosely, it can span an $ε$-net of the state space. This significantly strengthens what standard PRSGs can induce, as they may only concentrate on a small region of the state space as long as the average output state approximates a Haar random state in total variation distance.
Our PRSS construction develops a parallel extension of the famous Kac's walk, and we show that it mixes exponentially faster than the standard Kac's walk. This constitutes the core of our proof. We also describe a few applications of PRSSs. While our PRSS construction assumes a post-quantum one-way function, PRSSs are potentially a weaker primitive and can be separated from one-way functions in a relativized world similar to standard PRSGs.
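For orientation, the standard Kac's walk that the abstract refers to can be sketched as follows. This is a minimal illustration of the textbook sequential, real-valued walk on the unit sphere, not the parallel extension or the PRSS construction of the paper; the function names `kac_step` and `kac_walk` are hypothetical.

```python
import math
import random


def kac_step(v, rng):
    """Apply one step of the standard Kac's walk to the vector v in place.

    A step picks two distinct coordinates and a uniformly random angle,
    then rotates the vector in the plane spanned by those coordinates.
    """
    n = len(v)
    i, j = rng.sample(range(n), 2)            # two distinct coordinates
    theta = rng.uniform(0.0, 2.0 * math.pi)   # random rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # Planar rotation; the right-hand side is evaluated before assignment.
    v[i], v[j] = c * v[i] - s * v[j], s * v[i] + c * v[j]


def kac_walk(v0, steps, seed=0):
    """Run `steps` Kac steps from v0 and return the final vector (a copy)."""
    rng = random.Random(seed)
    v = list(v0)
    for _ in range(steps):
        kac_step(v, rng)
    return v
```

Each step is a rotation, so the walk preserves the Euclidean norm and a unit vector stays on the unit sphere; how quickly the resulting distribution approaches uniform is the mixing question the abstract addresses.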
Submitted 16 September, 2023;
originally announced September 2023.
-
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
Authors:
Angel Abusleme,
Thomas Adam,
Shakeel Ahmad,
Rizwan Ahmed,
Sebastiano Aiello,
Muhammad Akram,
Abid Aleem,
Fengpeng An,
Qi An,
Giuseppe Andronico,
Nikolay Anfimov,
Vito Antonelli,
Tatiana Antoshkina,
Burin Asavapibhop,
João Pedro Athayde Marcondes de André,
Didier Auguste,
Weidong Bai,
Nikita Balashov,
Wander Baldini,
Andrea Barresi,
Davide Basilico,
Eric Baussan,
Marco Bellato,
Marco Beretta,
Antonio Bergnoli
, et al. (606 additional authors not shown)
Abstract:
The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector currently under construction in South China. The real-time monitoring system is designed to ensure both prompt alert speed and comprehensive coverage of progenitor stars. It incorporates prompt monitors on the electronic board as well as online monitors at the data acquisition stage. Assuming a false alert rate of 1 per year, this monitoring system exhibits sensitivity to pre-SN neutrinos up to a distance of approximately 1.6 (0.9) kiloparsecs and SN neutrinos up to about 370 (360) kiloparsecs for a progenitor mass of 30 solar masses, considering both normal and inverted mass ordering scenarios. The pointing ability of the CCSN is evaluated by analyzing the accumulated event anisotropy of inverse beta decay interactions from pre-SN or SN neutrinos. This, along with the early alert, can play a crucial role in facilitating follow-up multi-messenger observations of the next galactic or nearby extragalactic CCSN.
Submitted 4 December, 2023; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Effect of Rare-earth Element Substitution in Superconducting R$_3$Ni$_2$O$_7$ Under Pressure
Authors:
Zhiming Pan,
Chen Lu,
Fan Yang,
Congjun Wu
Abstract:
Recently, high temperature ($T_c\approx 80$ K) superconductivity (SC) has been discovered in La$_3$Ni$_2$O$_7$ (LNO) under pressure. The question arises whether the transition temperature $T_c$ could be further enhanced under suitable conditions. A possible route for realizing higher $T_c$ is element substitution. Similar SC could appear in the rare-earth (RE) R$_3$Ni$_2$O$_7$ (RNO, R=RE element) material series under pressure. The electronic properties in the RNO materials are dominated by the Ni $3d$ orbitals in the bilayer NiO$_2$ plane. In the strong coupling limit, the SC could be fully characterized by a bilayer single $3d_{x^2-y^2}$-orbital $t$-$J_{\parallel}$-$J_{\perp}$ model. Under element substitution from La to another RE element, the lattice constant decreases and the electronic hopping increases, leading to stronger superexchanges between the $3d_{x^2-y^2}$ orbitals. Based on the slave-boson mean-field theory, we explore the pairing nature and the evolution of $T_c$ in RNO materials. It is found that the element substitution does not alter the pairing nature, i.e. the inter-layer $s$-wave pairing is always favored in RNO. However, the $T_c$ increases from La to Sm and a nearly doubled $T_c$ is achieved for SmNO. This work provides evidence for possible higher $T_c$ R$_3$Ni$_2$O$_7$ materials, which may be realized in further experiments.
Submitted 12 September, 2023;
originally announced September 2023.
-
Self-Correlation and Cross-Correlation Learning for Few-Shot Remote Sensing Image Semantic Segmentation
Authors:
Linhan Wang,
Shuo Lei,
Jianfeng He,
Shengkun Wang,
Min Zhang,
Chang-Tien Lu
Abstract:
Remote sensing image semantic segmentation is an important problem for remote sensing image interpretation. Although remarkable progress has been achieved, existing deep neural network methods rely on massive training data. Few-shot remote sensing semantic segmentation aims at learning to segment target objects from a query image using only a few annotated support images of the target class. Most existing few-shot learning methods focus solely on extracting information from support images, thereby failing to effectively address the large variance in appearance and scale of geographic objects. To tackle these challenges, we propose a Self-Correlation and Cross-Correlation Learning Network for few-shot remote sensing image semantic segmentation. Our model enhances generalization by considering both self-correlation and cross-correlation between support and query images to make segmentation predictions. To further explore the self-correlation within the query image, we adopt a classical spectral method to produce a class-agnostic segmentation mask based on the basic visual information of the image. Extensive experiments on two remote sensing image datasets demonstrate the effectiveness and superiority of our model in few-shot remote sensing image semantic segmentation. Code and models will be available at https://github.com/linhanwang/SCCNet.
Submitted 15 September, 2023; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Next-to-leading order QCD corrections to the form factors of $B$ to scalar meson decays
Authors:
Xue-Ying Han,
Long-Shun Lu,
Cai-Dian Lü,
Yue-Long Shen,
Bo-Xuan Shi
Abstract:
We calculate the next-to-leading order QCD corrections to $B$ to scalar meson form factors from QCD light-cone sum rules with $B$ meson light-cone distribution amplitudes. We demonstrate that the $B$ meson-to-vacuum correlation functions can be factorized into the convolution of short-distance coefficients and light-cone distribution amplitudes at the one-loop level and find that only $φ_B^+(ω,μ)$ contributes to the form factors. We then employ the $z$-parameterization combined with constraints from strong coupling constants to reconstruct the $q^2$ dependence of the form factors over the entire kinematically allowed region. Due to the large cancellations between the hard functions and the jet functions, the next-to-leading order results show a modest increase of approximately 5\% compared to the leading order results. Based on the form factor results, we predict the branching ratios of the semi-leptonic $B\to S\ell\bar{ν}_\ell$ and $B\to Sν_\ell\bar{ν}_\ell$ processes, as well as several angular observables, such as forward-backward asymmetries, "flat terms" and lepton polarization asymmetries. We compare these results with calculations from other methods. Experimental verification of these results is required in future experiments.
Submitted 16 October, 2023; v1 submitted 11 September, 2023;
originally announced September 2023.
-
EvoCLINICAL: Evolving Cyber-Cyber Digital Twin with Active Transfer Learning for Automated Cancer Registry System
Authors:
Chengjie Lu,
Qinghua Xu,
Tao Yue,
Shaukat Ali,
Thomas Schwitalla,
Jan F. Nygård
Abstract:
The Cancer Registry of Norway (CRN) collects information on cancer patients by receiving cancer messages from different medical entities (e.g., medical labs and hospitals) in Norway. Such messages are validated by an automated cancer registry system: GURI. Its correct operation is crucial since it lays the foundation for cancer research and provides critical cancer-related statistics to its stakeholders. Constructing a cyber-cyber digital twin (CCDT) for GURI can facilitate various experiments and advanced analyses of the operational state of GURI without requiring intensive interactions with the real system. However, GURI constantly evolves due to novel medical diagnostics and treatment, technological advances, etc. Accordingly, the CCDT should evolve as well to stay synchronized with GURI. A key challenge in achieving such synchronization is that evolving the CCDT needs abundant data labelled by the new GURI. To tackle this challenge, we propose EvoCLINICAL, which treats the CCDT developed for the previous version of GURI as the pretrained model and fine-tunes it with a dataset labelled by querying a new GURI version. EvoCLINICAL employs a genetic algorithm to select an optimal subset of cancer messages from a candidate dataset and queries GURI with it. We evaluate EvoCLINICAL on three evolution processes. The precision, recall, and F1 score are all greater than 91%, demonstrating the effectiveness of EvoCLINICAL. Furthermore, we replace the active learning part of EvoCLINICAL with random selection to study the contribution of transfer learning to the overall performance of EvoCLINICAL. Results show that employing active learning in EvoCLINICAL consistently improves its performance.
Submitted 6 September, 2023;
originally announced September 2023.
-
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Authors:
Yue Xu,
Yong-Lu Li,
Zhemin Huang,
Michael Xu Liu,
Cewu Lu,
Yu-Wing Tai,
Chi-Keung Tang
Abstract:
With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed. However, most current research is built on resources derived from third-person video action recognition. The inherent domain gap between first- and third-person action videos, which has not been adequately addressed before, makes current Ego-HOI approaches suboptimal. This paper rethinks the problem and proposes a new framework as an infrastructure to advance Ego-HOI recognition by Probing, Curation and Adaption (EgoPCA). We contribute comprehensive pre-train sets, balanced test sets and a new baseline, complete with a training-finetuning strategy. With our new framework, we not only achieve state-of-the-art performance on Ego-HOI benchmarks but also build several new and effective mechanisms and settings to advance further research. We believe our data and findings will pave a new way for Ego-HOI understanding. Code and data are available at https://mvig-rhos.com/ego_pca
Submitted 5 September, 2023;
originally announced September 2023.
-
Polarization-entangled quantum frequency comb from a silicon nitride microring resonator
Authors:
Wenjun Wen,
Wenhan Yan,
Chi Lu,
Liangliang Lu,
Xiaoyu Wu,
Yanqing Lu,
Shining Zhu,
Xiao-song Ma
Abstract:
Integrated microresonators facilitate the realization of quantum frequency combs (QFCs), which provide a large number of discrete frequency modes with a broadband spectral range and narrow linewidth. However, all previous demonstrations have focused on the generation of energy-time or time-bin entangled photons from QFCs. Realizing a polarization-entangled quantum frequency comb, which is an important resource for fundamental studies of quantum mechanics and for quantum information applications, remains challenging. Here, we demonstrate, for the first time, a broadband polarization-entangled quantum frequency comb by combining an integrated silicon nitride micro-resonator with a Sagnac interferometer. With a free spectral range of about 99 GHz and a narrow linewidth of about 190 MHz, our source provides 22 polarization-entangled photon pairs with frequencies covering the whole telecom C-band. The entanglement fidelities for all 22 pairs are above 81%, including 17 pairs with fidelities higher than 90%. Our demonstration paves the way for employing the polarization-entangled quantum frequency comb in quantum networks using CMOS technology as well as standard dense wavelength division multiplexing technology.
Submitted 17 April, 2024; v1 submitted 3 September, 2023;
originally announced September 2023.
-
Probing Inelastic Dark Matter at the LHC, FASER and STCF
Authors:
Chih-Ting Lu,
Jianfeng Tu,
Lei Wu
Abstract:
In this work, we explore the potential of probing the inelastic dark matter (DM) model with an extra U(1)$_D$ gauge symmetry at the Large Hadron Collider (LHC), the ForwArd Search ExpeRiment (FASER) and the Super Tau Charm Factory (STCF). To saturate the observed DM relic density, the mass splitting between the two light dark states has to be small enough, which leads to some distinctive signatures at these colliders. By searching for long-lived particles, displaced muon-jets, soft leptons, and mono-photon events, we find that inelastic DM masses in the range of 1 MeV to 210 GeV could be tested.
Submitted 3 July, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Flexible Handover with Real-Time Robust Dynamic Grasp Trajectory Generation
Authors:
Gu Zhang,
Hao-Shu Fang,
Hongjie Fang,
Cewu Lu
Abstract:
In recent years, there has been a significant effort dedicated to developing efficient, robust, and general human-to-robot handover systems. However, the area of flexible handover in the context of complex and continuous objects' motion remains relatively unexplored. In this work, we propose an approach for effective and robust flexible handover, which enables the robot to grasp moving objects with flexible motion trajectories with a high success rate. The key innovation of our approach is the generation of real-time robust grasp trajectories. We also design a future grasp prediction algorithm to enhance the system's adaptability to dynamic handover scenes. We conduct one-motion handover experiments and motion-continuous handover experiments on our novel benchmark that includes 31 diverse household objects. The system we have developed allows users to move and rotate objects in their hands within a relatively large range. The success rate of the robot grasping such moving objects is 78.15% over the entire household object benchmark.
Submitted 29 August, 2023;
originally announced August 2023.
-
Time-Frequency Transformer: A Novel Time Frequency Joint Learning Method for Speech Emotion Recognition
Authors:
Yong Wang,
Cheng Lu,
Yuan Zong,
Hailun Lian,
Yan Zhao,
Sunan Li
Abstract:
In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that it can extract global emotion patterns in the time-frequency domain of speech signals while modeling the local emotional correlations in the time domain and frequency domain respectively. To this end, we first design a Time Transformer and a Frequency Transformer to capture the local emotion patterns between frames and inside frequency bands respectively, so as to ensure the integrity of the emotion information modeling in both the time and frequency domains. Then, a Time-Frequency Transformer is proposed to mine the time-frequency emotional correlations from the local time-domain and frequency-domain emotion features in order to learn a more discriminative global speech emotion representation. The whole process is a time-frequency joint learning process implemented by a series of Transformer models. Experiments on the IEMOCAP and CASIA databases indicate that our proposed method outperforms the state-of-the-art methods.
Submitted 28 August, 2023;
originally announced August 2023.
-
Deep Learning for Visual Localization and Mapping: A Survey
Authors:
Changhao Chen,
Bing Wang,
Chris Xiaoxuan Lu,
Niki Trigoni,
Andrew Markham
Abstract:
Deep learning based localization and mapping approaches have recently emerged as a new research direction and have received significant attention from both industry and academia. Instead of creating hand-designed algorithms based on physical models or geometric theories, deep learning solutions provide an alternative that solves the problem in a data-driven way. Benefiting from the ever-increasing volumes of data and computational power on devices, these learning methods are fast evolving into a new area that shows potential to track self-motion and estimate environmental models accurately and robustly for mobile agents. In this work, we provide a comprehensive survey and propose a taxonomy for localization and mapping methods using deep learning. This survey aims to discuss two basic questions: whether deep learning is promising for localization and mapping, and how deep learning should be applied to solve this problem. To this end, a series of localization and mapping topics are investigated, from learning based visual odometry and global relocalization to mapping and simultaneous localization and mapping (SLAM). It is our hope that this survey organically weaves together the recent works in this vein from the robotics, computer vision and machine learning communities, and serves as a guideline for future researchers to apply deep learning to tackle the problem of visual localization and mapping.
Submitted 27 August, 2023;
originally announced August 2023.
-
JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading
Authors:
Sascha Frey,
Kang Li,
Peer Nagy,
Silvia Sapora,
Chris Lu,
Stefan Zohren,
Jakob Foerster,
Anisoara Calinescu
Abstract:
Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes it is important to have large scale efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. The implementation of our simulator - JAX-LOB - is based on design choices that aim to best exploit the powers of JAX without compromising on the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages, to provide an example of how one may address an optimal execution problem with reinforcement learning, and to share some preliminary results from end-to-end RL training on GPUs.
Submitted 25 August, 2023;
originally announced August 2023.
-
A Measurement of Gravitational Lensing of the Cosmic Microwave Background Using SPT-3G 2018 Data
Authors:
Z. Pan,
F. Bianchini,
W. L. K. Wu,
P. A. R. Ade,
Z. Ahmed,
E. Anderes,
A. J. Anderson,
B. Ansarinejad,
M. Archipley,
K. Aylor,
L. Balkenhol,
P. S. Barry,
R. Basu Thakur,
K. Benabed,
A. N. Bender,
B. A. Benson,
L. E. Bleem,
F. R. Bouchet,
L. Bryant,
K. Byrum,
E. Camphuis,
J. E. Carlstrom,
F. W. Carter,
T. W. Cecil,
C. L. Chang
, et al. (111 additional authors not shown)
Abstract:
We present a measurement of gravitational lensing over 1500 deg$^2$ of the Southern sky using SPT-3G temperature data at 95 and 150 GHz taken in 2018. The lensing amplitude relative to a fiducial Planck 2018 $Λ$CDM cosmology is found to be $1.020\pm0.060$, excluding instrumental and astrophysical systematic uncertainties. We conduct extensive systematic and null tests to check the robustness of the lensing measurements, and report a minimum-variance combined lensing power spectrum over angular multipoles of $50<L<2000$, which we use to constrain cosmological models. When analyzed alone and jointly with primary cosmic microwave background (CMB) spectra within the $Λ$CDM model, our lensing amplitude measurements are consistent with measurements from SPT-SZ, SPTpol, ACT, and Planck. Incorporating loose priors on the baryon density and other parameters including uncertainties on a foreground bias template, we obtain a $1σ$ constraint on $σ_8 Ω_{\rm m}^{0.25}=0.595 \pm 0.026$ using the SPT-3G 2018 lensing data alone, where $σ_8$ is a common measure of the amplitude of structure today and $Ω_{\rm m}$ is the matter density parameter. Combining SPT-3G 2018 lensing measurements with baryon acoustic oscillation (BAO) data, we derive parameter constraints of $σ_8 = 0.810 \pm 0.033$, $S_8 \equiv σ_8(Ω_{\rm m}/0.3)^{0.5}= 0.836 \pm 0.039$, and Hubble constant $H_0 =68.8^{+1.3}_{-1.6}$ km s$^{-1}$ Mpc$^{-1}$. Using CMB anisotropy and lensing measurements from SPT-3G only, we provide independent constraints on the spatial curvature of $Ω_{K} = 0.014^{+0.023}_{-0.026}$ (95% C.L.) and the dark energy density of $Ω_Λ= 0.722^{+0.031}_{-0.026}$ (68% C.L.). When combining SPT-3G lensing data with SPT-3G CMB anisotropy and BAO data, we find an upper limit on the sum of the neutrino masses of $\sum m_ν< 0.30$ eV (95% C.L.).
Submitted 29 January, 2024; v1 submitted 22 August, 2023;
originally announced August 2023.
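The $S_8$ definition quoted in the SPT-3G abstract above, $S_8 \equiv σ_8(Ω_{\rm m}/0.3)^{0.5}$, can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and inverting the relation with the two quoted central values ignores parameter correlations, so the resulting $Ω_{\rm m}$ is illustrative only and is not a number reported in the abstract.

```python
import math


def s8(sigma8: float, omega_m: float) -> float:
    """S_8 = sigma_8 * (Omega_m / 0.3) ** 0.5, as defined in the abstract."""
    return sigma8 * math.sqrt(omega_m / 0.3)


def omega_m_from_s8(s8_value: float, sigma8: float) -> float:
    """Invert the S_8 definition for Omega_m (illustrative helper)."""
    return 0.3 * (s8_value / sigma8) ** 2


# Central values quoted for the SPT-3G lensing + BAO combination.
implied_omega_m = omega_m_from_s8(0.836, 0.810)
print(f"Omega_m implied by the central values: {implied_omega_m:.3f}")
```

By construction, $S_8$ equals $σ_8$ exactly when $Ω_{\rm m} = 0.3$, which is why the two quoted values are close but not identical.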
-
CHORD: Category-level Hand-held Object Reconstruction via Shape Deformation
Authors:
Kailin Li,
Lixin Yang,
Haoyu Zhen,
Zenan Lin,
Xinyu Zhan,
Licheng Zhong,
Jian Xu,
Kejian Wu,
Cewu Lu
Abstract:
In daily life, humans utilize hands to manipulate objects. Modeling the shape of objects that are manipulated by the hand is essential for AI to comprehend daily tasks and to learn manipulation skills. However, previous approaches have encountered difficulties in reconstructing the precise shapes of hand-held objects, primarily owing to a deficiency in prior shape knowledge and inadequate data for training. As illustrated, given a particular type of tool, such as a mug, despite its infinite variations in shape and appearance, humans have a limited number of 'effective' modes and poses for its manipulation. This can be attributed to the fact that humans have mastered the shape prior of the 'mug' category, and can quickly establish the corresponding relations between different mug instances and the prior, such as where the rim and handle are located. In light of this, we propose a new method, CHORD, for Category-level Hand-held Object Reconstruction via shape Deformation. CHORD deforms a categorical shape prior for reconstructing the intra-class objects. To ensure accurate reconstruction, we empower CHORD with three types of awareness: appearance, shape, and interacting pose. In addition, we have constructed a new dataset, COMIC, of category-level hand-object interaction. COMIC contains a rich array of object instances, materials, hand interactions, and viewing directions. Extensive evaluation shows that CHORD outperforms state-of-the-art approaches in both quantitative and qualitative measures. Code, model, and datasets are available at https://kailinli.github.io/CHORD.
Submitted 21 August, 2023;
originally announced August 2023.
-
ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes Environment
Authors:
Bingyang Zhou,
Haoyu Zhou,
Tianhai Liang,
Qiaojun Yu,
Siheng Zhao,
Yuwei Zeng,
Jun Lv,
Siyuan Luo,
Qiancai Wang,
Xinyuan Yu,
Haonan Chen,
Cewu Lu,
Lin Shao
Abstract:
We present ClothesNet: a large-scale dataset of 3D clothes objects with information-rich annotations. Our dataset consists of around 4400 models covering 11 categories annotated with clothes features, boundary lines, and keypoints. ClothesNet can be used to facilitate a variety of computer vision and robot interaction tasks. Using our dataset, we establish benchmark tasks for clothes perception, including classification, boundary line segmentation, and keypoint detection, and develop simulated clothes environments for robotic interaction tasks, including rearranging, folding, hanging, and dressing. We also demonstrate the efficacy of our ClothesNet in real-world experiments. Supplemental materials and dataset are available on our project webpage.
Submitted 19 August, 2023;
originally announced August 2023.
-
Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection
Authors:
Benjamin C. Warner,
Ziqi Xu,
Simon Haroutounian,
Thomas Kannampallil,
Chenyang Lu
Abstract:
Survey data can contain a high number of features while having a comparatively low quantity of examples. Machine learning models that attempt to predict outcomes from survey data under these conditions can overfit and result in poor generalizability. One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon. A relatively unexplored source of information in the feature selection process is the usage of textual names of features, which may be semantically indicative of which features are relevant to a target outcome. The relationships between feature names and target names can be evaluated using language models (LMs) to produce semantic textual similarity (STS) scores, which can then be used to select features. We examine the performance using STS to select features directly and in the minimal-redundancy-maximal-relevance (mRMR) algorithm. The performance of STS as a feature selection metric is evaluated against preliminary survey data collected as a part of a clinical study on persistent post-surgical pain (PPSP). The results suggest that features selected with STS can result in higher performance models compared to traditional feature selection algorithms.
Submitted 18 August, 2023;
originally announced August 2023.
-
Progression-Guided Temporal Action Detection in Videos
Authors:
Chongkai Lu,
Man-Wai Mak,
Ruimin Li,
Zheru Chi,
Hong Fu
Abstract:
We present a novel framework, Action Progression Network (APN), for temporal action detection (TAD) in videos. The framework locates actions in videos by detecting the action evolution process. To encode the action evolution, we quantify a complete action process into 101 ordered stages (0\%, 1\%, ..., 100\%), referred to as action progressions. We then train a neural network to recognize the action progressions. The framework detects action boundaries by detecting complete action processes in the videos, e.g., a video segment whose detected action progressions closely follow the sequence 0\%, 1\%, ..., 100\%. The framework offers three major advantages: (1) Our neural networks are trained end-to-end, in contrast to conventional methods that optimize modules separately; (2) The APN is trained using action frames exclusively, enabling models to be trained on action classification datasets and making them robust to videos with temporal background styles differing from those in training; (3) Our framework effectively avoids detecting incomplete actions and excels in detecting long-lasting actions due to the fine-grained and explicit encoding of the temporal structure of actions. Leveraging these advantages, the APN achieves competitive performance and significantly surpasses its counterparts in detecting long-lasting actions. With an IoU threshold of 0.5, the APN achieves a mean Average Precision (mAP) of 58.3\% on the THUMOS14 dataset and 98.9\% mAP on the DFMAD70 dataset.
Submitted 17 August, 2023;
originally announced August 2023.
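The 101-stage encoding described in the APN abstract above can be illustrated with a minimal sketch. It assumes a linear mapping from frame index to progression percentage, which the abstract does not specify. The function names `progression_label` and `is_complete_action`, the `tolerance` parameter, and the completeness criterion are all hypothetical and are not taken from the APN code.

```python
def progression_label(frame_idx: int, start: int, end: int) -> int:
    """Map a frame inside an action [start, end] to one of 101 stages (0..100).

    Assumption: progression grows linearly with time; the paper may use
    a different mapping.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if not start <= frame_idx <= end:
        raise ValueError("frame_idx must lie inside the action")
    fraction = (frame_idx - start) / (end - start)
    # Round half up to the nearest integer percentage.
    return int(fraction * 100 + 0.5)


def is_complete_action(progressions: list[int], tolerance: float = 5.0) -> bool:
    """Check whether predicted progressions closely follow 0%, ..., 100%.

    Hypothetical criterion: the mean absolute deviation from an evenly
    spaced ramp of the same length must not exceed `tolerance` points.
    """
    n = len(progressions)
    if n < 2:
        return False
    ramp = [100 * i / (n - 1) for i in range(n)]
    mean_abs_dev = sum(abs(p - r) for p, r in zip(progressions, ramp)) / n
    return mean_abs_dev <= tolerance
```

Under these assumptions, a segment whose per-frame predictions rise steadily from 0 to 100 is accepted, while a segment stuck at a constant progression (an incomplete action) is rejected.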
-
Surface Second Harmonic Generation from Topological Dirac Semimetal PdTe$_2$
Authors:
Syed Mohammed Faizanuddin,
Ching-Hang Chien,
Yao-Jui Chan,
Si-Tong Liu,
Chia-Nung Kuo,
Chin Shuan Lue,
Yu-Chieh Wen
Abstract:
Recent experiments and calculations in topological semimetals have observed anomalously strong second-order optical nonlinearity, yet whether the enhancement also occurs at surfaces of topological semimetals in general remains an open question. In this work, we tackle this problem by measuring polarization-dependent and rotational-anisotropy optical second harmonic generation (SHG) from the centrosymmetric type-II Dirac semimetal PdTe$_2$. We find that the SHG follows C$_{3v}$ surface symmetry with a time-varying intensity dictated by the oxidation kinetics of the material after its surface cleavage, indicating the surface origin of the SHG. Quantitative characterization of the surface nonlinear susceptibility indicates a large out-of-plane response of PdTe$_2$ with $|χ_{ccc}^{(2)}|$ up to 25 $\times$ 10$^{-18}$ m$^2$/V. Our results support topological surfaces/interfaces as a new route toward applications of nonlinear optical effects with relaxed symmetry constraints, and demonstrate SHG as a viable means for in situ study of the kinetics of topological surfaces.
Submitted 17 August, 2023;
originally announced August 2023.
-
Toward a direct measurement of the cosmic acceleration: The pilot observation of H I 21cm absorption line at FAST
Authors:
Jiangang Kang,
Chang-Zhi Lu,
TongJie Zhang,
Ming Zhu
Abstract:
This study presents results on detecting neutral atomic hydrogen (H I) 21 cm absorption in the spectrum of PKS1413+135 at redshift $z=0.24670041$. The observation was conducted by FAST, with a spectral resolution of 10 Hz, using 10 minutes of observing time. The global spectral profile is examined by modeling the absorption line using a single Gaussian function with a resolution of 10 kHz within a 2 MHz bandwidth. The goal is to determine the rate of the latest cosmic acceleration by directly measuring the redshift evolution of the H I 21 cm absorption line with the Hubble flow toward the same background quasar over a time span of a decade or longer. This will serve as a detectable signal generated by the accelerated expansion of the Universe at redshift $z < 1$, referred to as the redshift drift $\dot{z}$ or the Sandage-Loeb (SL) effect. The measured H I gas column density in this DLA system is approximately equivalent to the initial observation value, considering uncertainties in the spin temperature of a spiral host galaxy. The high signal-to-noise ratio of 57, obtained at a 10 kHz resolution, strongly supports the feasibility of using the H I 21 cm absorption line in DLA systems to accurately measure the redshift drift rate at a precision level of around $10^{-10}$ per decade.
Submitted 7 May, 2024; v1 submitted 17 August, 2023;
originally announced August 2023.
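To give a sense of the size of the redshift-drift signal discussed in the abstract above, the sketch below evaluates the standard relation $\dot{z} = (1+z)H_0 - H(z)$ for a flat $Λ$CDM cosmology. The values $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $Ω_{\rm m} = 0.3$ are illustrative assumptions, not parameters taken from the paper, and the function name is hypothetical.

```python
import math

KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.15576e7  # Julian year


def redshift_drift_per_decade(z: float, h0_km_s_mpc: float = 70.0,
                              omega_m: float = 0.3) -> float:
    """Redshift drift accumulated over ten years in flat LCDM.

    Uses dz/dt = (1 + z) * H0 - H(z), with
    H(z) = H0 * sqrt(omega_m * (1 + z)**3 + 1 - omega_m).
    The default H0 and omega_m are illustrative assumptions.
    """
    h0_per_year = h0_km_s_mpc / KM_PER_MPC * SECONDS_PER_YEAR
    e_z = math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return 10.0 * h0_per_year * ((1.0 + z) - e_z)


# Redshift of the H I absorber toward PKS1413+135 quoted in the abstract.
drift = redshift_drift_per_decade(0.24670041)
print(f"expected drift: {drift:.2e} per decade")
```

With these assumed parameters the expected drift comes out slightly below $10^{-10}$ per decade, which is comparable to the precision level quoted in the abstract.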