-
I$^2$-SLAM: Inverting Imaging Process for Robust Photorealistic Dense SLAM
Authors:
Gwangtak Bae,
Changwoon Choi,
Hyeongjun Heo,
Sang Min Kim,
Young Min Kim
Abstract:
We present an inverse image-formation module that can enhance the robustness of existing visual SLAM pipelines for casually captured scenarios. Casual video captures often suffer from motion blur and varying appearances, which degrade the final quality of coherent 3D visual representation. We propose integrating the physical imaging into the SLAM system, which employs linear HDR radiance maps to collect measurements. Specifically, individual frames aggregate images of multiple poses along the camera trajectory to explain prevalent motion blur in hand-held videos. Additionally, we accommodate per-frame appearance variation by dedicating explicit variables for image formation steps, namely white balance, exposure time, and camera response function. Through joint optimization of additional variables, the SLAM pipeline produces high-quality images with more accurate trajectories. Extensive experiments demonstrate that our approach can be incorporated into recent visual SLAM pipelines using various scene representations, such as neural radiance fields or Gaussian splatting.
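The image-formation steps named in this abstract can be sketched as a forward model. The following is a minimal illustration, not the authors' implementation: it assumes motion blur is the mean of linear HDR radiance images rendered at several poses within one exposure, and it uses a plain gamma curve as a stand-in for the camera response function. The function name `form_image` and all parameter values are hypothetical.

```python
import numpy as np

def form_image(radiance_stack, white_balance, exposure_time, gamma=2.2):
    """Toy forward imaging model (illustrative assumption, not the paper's code).

    radiance_stack: (P, H, W, 3) linear HDR radiance rendered at P poses
                    sampled along the camera trajectory during one exposure.
    white_balance:  (3,) per-channel gains.
    exposure_time:  scalar exposure for this frame.
    gamma:          exponent of a placeholder gamma-curve response function.
    """
    # Motion blur: average the sharp radiance images over the sampled poses.
    blurred = radiance_stack.mean(axis=0)
    # Per-frame appearance: white balance and exposure scale linear radiance.
    scaled = blurred * white_balance * exposure_time
    # Camera response: clip to the sensor range, then apply the tone curve.
    return np.clip(scaled, 0.0, 1.0) ** (1.0 / gamma)

rng = np.random.default_rng(0)
stack = rng.uniform(0.0, 2.0, size=(5, 4, 4, 3))  # made-up radiance values
ldr = form_image(stack, white_balance=np.array([1.0, 0.9, 1.1]), exposure_time=0.5)
```

In a joint optimization of the kind the abstract describes, the poses behind `radiance_stack` and the per-frame `white_balance`, `exposure_time`, and response parameters would be the additional variables fitted against the observed frames.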
Submitted 15 July, 2024;
originally announced July 2024.
-
A practical approach to calculating magnetic Johnson noise for precision measurements
Authors:
N. S. Phan,
S. M. Clayton,
Y. J. Kim,
T. M. Ito
Abstract:
Magnetic Johnson noise is an important consideration for many applications involving precision magnetometry, and its significance will only increase in the future with improvements in measurement sensitivity. The fluctuation-dissipation theorem can be utilized to derive analytic expressions for magnetic Johnson noise in certain situations. But when used in conjunction with commercially available finite element analysis tools, the combined approach is particularly powerful as it provides a practical means to calculate the magnetic Johnson noise arising from conductors of arbitrary geometry and permeability. In this paper, we demonstrate this method to be one of the most comprehensive approaches presently available to calculate thermal magnetic noise. In particular, its applicability is shown to not be limited to cases where the noise is evaluated at a point in space but also can be expanded to include cases where the magnetic field detector has a more general shape, such as a finite size loop, a gradiometer, or a detector that consists of a polarized atomic species trapped in a volume. Furthermore, some physics insights gained through studies made using this method are discussed.
Submitted 15 July, 2024;
originally announced July 2024.
-
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Authors:
Han Guo,
William Brandon,
Radostin Cholakov,
Jonathan Ragan-Kelley,
Eric P. Xing,
Yoon Kim
Abstract:
The deployment of large language models (LLMs) is often constrained by memory bandwidth, where the primary bottleneck is the cost of transferring model parameters from the GPU's global memory to its registers. When coupled with custom kernels that fuse the dequantization and matmul operations, weight-only quantization can thus enable faster inference by reducing the amount of memory movement. However, developing high-performance kernels for weight-quantized LLMs presents substantial challenges, especially when the weights are compressed to non-evenly-divisible bit widths (e.g., 3 bits) with non-uniform, lookup table (LUT) quantization. This paper describes FLUTE, a flexible lookup table engine for LUT-quantized LLMs, which uses offline restructuring of the quantized weight matrix to minimize bit manipulations associated with unpacking, and vectorization and duplication of the lookup table to mitigate shared memory bandwidth constraints. At batch sizes < 32 and quantization group size of 128 (typical in LLM inference), the FLUTE kernel can be 2-4x faster than existing GEMM kernels. As an application of FLUTE, we explore a simple extension to lookup table-based NormalFloat quantization and apply it to quantize LLaMA3 to various configurations, obtaining competitive quantization performance against strong baselines while obtaining an end-to-end throughput increase of 1.5 to 2 times.
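To make the setting concrete, the sketch below shows what lookup table (LUT) dequantization followed by a matmul computes, in plain NumPy. It is a reference for the arithmetic only and is not FLUTE: it does no bit packing, offline restructuring, or kernel fusion, and the table values, shapes, and function name are made up for illustration.

```python
import numpy as np

def lut_dequant_matmul(x, codes, table, scales, group_size=128):
    """Reference LUT-quantized matmul (illustrative; not the FLUTE kernel).

    x:      (B, K) activations.
    codes:  (K, N) integer indices into `table` (e.g. 0..7 for 3 bits).
    table:  (2**bits,) lookup table of non-uniform quantization levels.
    scales: (K // group_size, N) one scale per group of input channels.
    """
    # Look up the quantization level for every weight index.
    levels = table[codes]
    # Repeat each group's scale over its `group_size` rows, then rescale.
    w = levels * np.repeat(scales, group_size, axis=0)
    return x @ w

rng = np.random.default_rng(0)
bits, K, N, B = 3, 256, 8, 4
table = np.sort(rng.normal(size=2**bits))        # hypothetical non-uniform levels
codes = rng.integers(0, 2**bits, size=(K, N))
scales = rng.uniform(0.5, 1.5, size=(K // 128, N))
x = rng.normal(size=(B, K))
y = lut_dequant_matmul(x, codes, table, scales)
```

Here the full weight matrix `w` is materialized before the multiply; a fused kernel of the kind the abstract describes would instead dequantize on the fly so that only the compact codes travel through memory.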
Submitted 15 July, 2024;
originally announced July 2024.
-
Probing-Enhanced Stochastic Programming
Authors:
Zhichao Ma,
Youngdae Kim,
Jeff Linderoth,
James R. Luedtke,
Logan R. Matthews
Abstract:
We consider a two-stage stochastic decision problem where the decision-maker has the opportunity to obtain information about the distribution of the random variables $ξ$ that appear in the problem through a set of discrete actions that we refer to as \emph{probing}. Probing components of a random vector $η$ that is jointly-distributed with $ξ$ allows the decision-maker to learn about the conditional distribution of $ξ$ given the observed components of $η$. We propose a three-stage optimization model for this problem, where in the first stage some components of $η$ are chosen to be observed, and decisions in subsequent stages must be consistent with the obtained information. In the case that $η$ and $ξ$ have finite support, Goel and Grossmann gave a mixed-integer programming (MIP) formulation of this problem whose size is proportional to the square of cardinality of the sample space of the random variables. We propose to solve the model using bounds obtained from an information-based relaxation, combined with a branching scheme that enforces the consistency of decisions with observed information. The branch-and-bound approach can naturally be combined with sampling in order to estimate both lower and upper bounds on the optimal solution value and does not require $η$ or $ξ$ to have finite support. We conduct a computational study of our method on instances of a stochastic facility location and sizing problem with the option to probe customers to learn about their demands before building facilities. We find that on instances with finite support, our approach scales significantly better than the MIP formulation and also demonstrate that our method can compute statistical bounds on instances with continuous distributions that improve upon the perfect information bounds.
Submitted 15 July, 2024;
originally announced July 2024.
-
LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection
Authors:
Sanmin Kim,
Youngseok Kim,
Sihwan Hwang,
Hyeonjun Jeong,
Dongsuk Kum
Abstract:
Recent advancements in camera-based 3D object detection have introduced cross-modal knowledge distillation to bridge the performance gap with LiDAR 3D detectors, leveraging the precise geometric information in LiDAR point clouds. However, existing cross-modal knowledge distillation methods tend to overlook the inherent imperfections of LiDAR, such as the ambiguity of measurements on distant or occluded objects, which should not be transferred to the image detector. To mitigate these imperfections in the LiDAR teacher, we propose a novel method that leverages aleatoric uncertainty-free features from ground truth labels. In contrast to conventional label guidance approaches, we approximate the inverse function of the teacher's head to effectively embed label inputs into feature space. This approach provides additional accurate guidance alongside the LiDAR teacher, thereby boosting the performance of the image detector. Additionally, we introduce feature partitioning, which effectively transfers knowledge from the teacher modality while preserving the distinctive features of the student, thereby maximizing the potential of both modalities. Experimental results demonstrate that our approach improves mAP and NDS by 5.1 points and 4.9 points, respectively, compared to the baseline model. The code is available at https://github.com/sanmin0312/LabelDistill
Submitted 14 July, 2024;
originally announced July 2024.
-
Topological Fermi-arc surface state covered by floating electrons on a two-dimensional electride
Authors:
Chan-young Lim,
Min-Seok Kim,
Dong Cheol Lim,
Sunghun Kim,
Yeonghoon Lee,
Jaehoon Cha,
Gyubin Lee,
Sang Yong Song,
Dinesh Thapa,
Jonathan D. Denlinger,
Seong-Gon Kim,
Sung Wng Kim,
Jungpil Seo,
Yeongkwan Kim
Abstract:
Two-dimensional electrides can acquire topologically non-trivial phases due to intriguing interplay between the cationic atomic layers and anionic electron layers. However, experimental evidence of topological surface states has yet to be verified. Here, via angle-resolved photoemission spectroscopy (ARPES) and scanning tunnelling microscopy (STM), we probe the magnetic Weyl states of the ferromagnetic electride $[\mathrm{Gd}_{2}\mathrm{C}]^{2+}\cdot 2e^{-}$. In particular, the presence of Weyl cones and Fermi-arc states is demonstrated through photon energy-dependent ARPES measurements, agreeing with theoretical band structure calculations. Notably, the STM measurements reveal that the Fermi-arc states exist underneath a floating quantum electron liquid on the top Gd layer, forming double-stacked surface states in a heterostructure. Our work thus not only unveils the non-trivial topology of the $[\mathrm{Gd}_{2}\mathrm{C}]^{2+}\cdot 2e^{-}$ electride but also realizes a surface heterostructure that can host phenomena distinct from the bulk.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
Introducing VaDA: Novel Image Segmentation Model for Maritime Object Segmentation Using New Dataset
Authors:
Yongjin Kim,
Jinbum Park,
Sanha Kang,
Hanguen Kim
Abstract:
The maritime shipping industry is undergoing rapid evolution driven by advancements in computer vision artificial intelligence (AI). Consequently, research on AI-based object recognition models for maritime transportation is steadily growing, leveraging advancements in sensor technology and computing performance. However, object recognition in maritime environments faces challenges such as light reflection, interference, intense lighting, and various weather conditions. To address these challenges, high-performance deep learning algorithms tailored to maritime imagery and high-quality datasets specialized for maritime scenes are essential. Existing AI recognition models and datasets have limited suitability for composing autonomous navigation systems. Therefore, in this paper, we propose a Vertical and Detail Attention (VaDA) model for maritime object segmentation and a new model evaluation method, the Integrated Figure of Calculation Performance (IFCP), to verify its suitability for the system in real time. Additionally, we introduce a benchmark maritime dataset, OASIs (Ocean AI Segmentation Initiatives), to standardize model performance evaluation across diverse maritime environments. The OASIs dataset and details are available at our website: https://www.navlue.com/dataset
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
Visual Multi-Object Tracking with Re-Identification and Occlusion Handling using Labeled Random Finite Sets
Authors:
Linh Van Ma,
Tran Thien Dat Nguyen,
Changbeom Shim,
Du Yong Kim,
Namkoo Ha,
Moongu Jeon
Abstract:
This paper proposes an online visual multi-object tracking (MOT) algorithm that resolves object appearance-reappearance and occlusion. Our solution is based on the labeled random finite set (LRFS) filtering approach, which, in principle, addresses disappearance, appearance, reappearance, and occlusion via a single Bayesian recursion. However, in practice, existing numerical approximations cause reappearing objects to be initialized as new tracks, especially after long periods of being undetected. In occlusion handling, the filter's efficacy is dictated by trade-offs between the sophistication of the occlusion model and computational demand. Our contribution is a novel modeling method that exploits object features to address reappearing objects whilst maintaining a linear complexity in the number of detections. Moreover, to improve the filter's occlusion handling, we propose a fuzzy detection model that takes into consideration the overlapping areas between tracks and their sizes. We also develop a fast version of the filter to further reduce the computational time.
Submitted 11 July, 2024;
originally announced July 2024.
-
Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Ta'ani,
J. Alexander,
A. Angerami,
K. Aoki,
N. Apadula,
Y. Aramaki,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
B. Bannier,
K. N. Barish,
B. Bassalleck,
S. Bathe
, et al. (377 additional authors not shown)
Abstract:
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.
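For orientation, the role of the three source parameters can be illustrated with a commonly used Lévy-stable form of the two-particle correlation function, $C(q) = 1 + λ\,e^{-(qR/\hbar c)^{α}}$. The abstract does not state the fit function, so this form is an assumption here; it also ignores final-state interactions such as the Coulomb effect, and every parameter value below is made up rather than taken from the measurement.

```python
import numpy as np

HBARC = 0.1973269804  # GeV*fm; makes q [GeV/c] times R [fm] dimensionless

def levy_correlation(q, lam, R, alpha):
    """C(q) = 1 + lam * exp(-(q*R/hbar c)**alpha); final-state interactions ignored."""
    return 1.0 + lam * np.exp(-np.abs(q * R / HBARC) ** alpha)

q = np.linspace(0.0, 0.2, 5)  # relative momentum in GeV/c (illustrative grid)
gauss = levy_correlation(q, lam=0.8, R=6.0, alpha=2.0)   # Gaussian limit, alpha = 2
cauchy = levy_correlation(q, lam=0.8, R=6.0, alpha=1.0)  # Cauchy limit, alpha = 1
```

In this parametrization the intercept at $q=0$ is $1+λ$, $R$ sets how quickly the correlation falls off, and $α$ controls the shape of the fall-off between the Gaussian and Cauchy cases.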
Submitted 11 July, 2024;
originally announced July 2024.
-
Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction
Authors:
Yumin Kim,
Gayoon Choi,
Seong Jae Hwang
Abstract:
Reducing scan time in Positron Emission Tomography (PET) imaging while maintaining high-quality images is crucial for minimizing patient discomfort and radiation exposure. Due to the limited size of datasets and distribution discrepancy across scanners in medical imaging, fine-tuning in a parameter-efficient and effective manner is on the rise. Motivated by the potential of Parameter-Efficient Fine-Tuning (PEFT), we aim to address these issues by effectively leveraging PEFT to improve limited data and GPU resource issues in multi-scanner setups. In this paper, we introduce PETITE, Parameter-Efficient Fine-Tuning for MultI-scanner PET to PET REconstruction, which uses fewer than 1% of the parameters. To the best of our knowledge, this study is the first to systematically explore the efficacy of diverse PEFT techniques in medical imaging reconstruction tasks via prevalent encoder-decoder-type deep models. This investigation, in particular, brings intriguing insights into PETITE as we show further improvements by treating encoder and decoder separately and mixing different PEFT methods, namely, Mix-PEFT. Using multi-scanner PET datasets comprising five different scanners, we extensively test the cross-scanner PET scan time reduction performances (i.e., a model pre-trained on one scanner is fine-tuned on a different scanner) of 21 feasible Mix-PEFT combinations to derive optimal PETITE. We show that training with fewer than 1% of the parameters using PETITE performs on par with full fine-tuning (i.e., 100% of the parameters).
Submitted 10 July, 2024;
originally announced July 2024.
-
KpopMT: Translation Dataset with Terminology for Kpop Fandom
Authors:
JiWoo Kim,
Yunsu Kim,
JinYeong Bak
Abstract:
While machines learn from existing corpora, humans have the unique capability to establish and accept new language systems. This leads humans to form unique language systems within social groups. Aligning with this, we focus on a gap remaining in addressing translation challenges within social groups, where in-group members utilize unique terminologies. We propose the KpopMT dataset, which aims to fill this gap by enabling precise terminology translation, choosing Kpop fandom as an initiative for social groups given its global popularity. Expert translators provide 1k English translations for Korean posts and comments, each annotated with specific terminology within social groups' language systems. We evaluate existing translation systems including GPT models on KpopMT to identify their failure cases. Results show overall low scores, underscoring the challenges of reflecting group-specific terminologies and styles in translation. We make KpopMT publicly available.
Submitted 10 July, 2024;
originally announced July 2024.
-
Electrically Tuning Quasi-Bound States in the Continuum with Hybrid Graphene-Silicon Metasurfaces
Authors:
Ziqiang Cai,
Xianzhe Zhang,
Tushar Sanjay Karnik,
Yihao Xu,
Tae Yoon Kim,
Juejun Hu,
Yongmin Liu
Abstract:
Metasurfaces have become one of the most prominent research topics in the field of optics owing to their unprecedented properties and novel applications on an ultrathin platform. By combining graphene with metasurfaces, electrically tunable functions can be achieved with fast tuning speed, large modulation depth and broad tuning range. However, the tuning efficiency of hybrid graphene metasurfaces within the short-wavelength infrared (SWIR) spectrum is typically low because of the small resonance wavelength shift in this wavelength range. In this work, through the integration of graphene and silicon metasurfaces that support quasi-bound states in the continuum (quasi-BIC), we experimentally demonstrate significant transmittance tuning even with less than 30 nm resonance wavelength shift thanks to the high quality-factor of quasi-BIC metasurfaces. The tunable transmittance spectrum was measured using Fourier Transform Infrared Spectroscopy (FTIR) with a modified reflective lens to improve the accuracy, and the electrical tuning was realized utilizing the cut-and-stick method of ion gel. At the wavelength of 3.0 μm, the measured change of transmittance $T_{\max}-T_{\min}$ and modulation depth $(T_{\max}-T_{\min})/T_{\max}$ can reach 22.2% and 28.9%, respectively, under a small bias voltage ranging from -2 V to +2 V. To the best of our knowledge, this work is the first experimental demonstration of tunable graphene/quasi-BIC metasurfaces, which have potential applications in optical modulation, reconfigurable photonic devices, and optical communications.
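The two reported figures of merit are linked by simple arithmetic, which the short sketch below works through. The maximum and minimum transmittance it prints are not stated in the abstract; they are only the values implied by the reported change of transmittance (22.2%) and modulation depth (28.9%), assuming both refer to the same pair of spectra.

```python
delta_t = 0.222  # reported change of transmittance, T_max - T_min
depth = 0.289    # reported modulation depth, (T_max - T_min) / T_max

t_max = delta_t / depth  # implied maximum transmittance (derived, not reported)
t_min = t_max - delta_t  # implied minimum transmittance (derived, not reported)
print(f"T_max ~ {t_max:.3f}, T_min ~ {t_min:.3f}")
```

Because the modulation depth is normalized by the maximum transmittance, it is always at least as large as the absolute change whenever the maximum transmittance is at most 1.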
Submitted 9 July, 2024;
originally announced July 2024.
-
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
Authors:
Yung-Sung Chuang,
Linlu Qiu,
Cheng-Yu Hsieh,
Ranjay Krishna,
Yoon Kim,
James Glass
Abstract:
When asked to summarize articles or answer questions given a passage, large language models (LLMs) can hallucinate details and respond with unsubstantiated answers that are inaccurate with respect to the input context. This paper describes a simple approach for detecting such contextual hallucinations. We hypothesize that contextual hallucinations are related to the extent to which an LLM attends to information in the provided context versus its own generations. Based on this intuition, we propose a simple hallucination detection model whose input features are given by the ratio of attention weights on the context versus newly generated tokens (for each attention head). We find that a linear classifier based on these lookback ratio features is as effective as a richer detector that utilizes the entire hidden states of an LLM or a text-based entailment model. The lookback ratio-based detector -- Lookback Lens -- is found to transfer across tasks and even models, allowing a detector that is trained on a 7B model to be applied (without retraining) to a larger 13B model. We further apply this detector to mitigate contextual hallucinations, and find that a simple classifier-guided decoding approach is able to reduce the amount of hallucination, for example by 9.6% in the XSum summarization task.
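The lookback ratio feature described above can be sketched for a single decoding step as follows. This is an illustration under assumptions rather than the authors' code: the abstract only says the feature is a ratio of attention on the context versus newly generated tokens, so the averaging over positions and the context / (context + new) normalization used here are choices made for the example, and `lookback_ratios` is a hypothetical name.

```python
import numpy as np

def lookback_ratios(attn, n_context):
    """Per-head lookback ratio for one decoding step (illustrative sketch).

    attn:      (H, T) attention weights of the current token over T previous
               positions, one row per head.
    n_context: number of leading positions that belong to the provided context;
               the remaining T - n_context positions are generated tokens
               (assumed to be at least one).
    """
    ctx = attn[:, :n_context].mean(axis=1)  # average weight on context tokens
    new = attn[:, n_context:].mean(axis=1)  # average weight on generated tokens
    return ctx / (ctx + new)

attn = np.array([[0.4, 0.4, 0.1, 0.1],   # head 0 looks mostly at the context
                 [0.1, 0.1, 0.4, 0.4]])  # head 1 looks mostly at its own output
features = lookback_ratios(attn, n_context=2)
```

Stacking such ratios across heads gives one feature vector per step, which is the kind of input a linear classifier like the one described in the abstract could be trained on.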
Submitted 9 July, 2024;
originally announced July 2024.
-
ZTF SN Ia DR2: The spectral diversity of Type Ia supernovae in a volume-limited sample
Authors:
U. Burgaz,
K. Maguire,
G. Dimitriadis,
L. Harvey,
R. Senzel,
J. Sollerman,
J. Nordin,
L. Galbany,
M. Rigault,
M. Smith,
A. Goobar,
J. Johansson,
P. Rosnet,
M. Amenouche,
M. Deckers,
S. Dhawan,
M. Ginolin,
Y. -L. Kim,
A. A. Miller,
T. E. Muller-Bravo,
P. E. Nugent,
J. H. Terwel,
R. Dekany,
A. Drake,
M. J. Graham
, et al. (8 additional authors not shown)
Abstract:
More than 3000 spectroscopically confirmed Type Ia supernovae (SNe Ia) are presented in the Zwicky Transient Facility SN Ia Data Release 2 (ZTF DR2). In this paper, we detail the spectral properties of 482 SNe Ia near maximum light, up to a redshift limit of $z$ $\leq$ 0.06. We measure the velocities and pseudo-equivalent widths (pEW) of key spectral features (Si II $λ$5972 and Si II $λ$6355) and investigate the relation between the properties of the spectral features and the photometric properties from the SALT2 light-curve parameters as a function of spectroscopic sub-class. We discuss the non-negligible impact of host galaxy contamination on SN Ia spectral classifications, as well as investigate the accuracy of spectral template matching of the ZTF DR2 sample. We define a new subclass of underluminous SNe Ia (`04gs-like') that lie spectroscopically between normal SNe Ia and transitional 86G-like SNe Ia (stronger Si II $λ$5972 than normal SNe Ia but significantly weaker Ti II features than `86G-like' SNe). We model these `04gs-like' SN Ia spectra using the radiative-transfer spectral synthesis code tardis and show that cooler temperatures alone are unable to explain their spectra; some changes in elemental abundances are also required. However, the broad continuity in spectral properties seen from bright (`91T-like') to faint normal SN Ia, including the transitional and 91bg-like SNe Ia, suggests that variations within a single explosion model may be able to explain their behaviour.
Submitted 9 July, 2024;
originally announced July 2024.
-
Improved limit on neutrinoless double beta decay of $^{100}$Mo from AMoRE-I
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (83 additional authors not shown)
Abstract:
AMoRE searches for the signature of neutrinoless double beta decay of $^{100}$Mo with a 100 kg sample of enriched $^{100}$Mo. Scintillating molybdate crystals coupled with a metallic magnetic calorimeter operate at milli-Kelvin temperatures to measure the energy of electrons emitted in the decay. As a demonstration of the full-scale AMoRE, we conducted AMoRE-I, a pre-experiment with 18 molybdate crystals, at the Yangyang Underground Laboratory for over two years. The exposure was 8.02 kg$\cdot$year (or 3.89 kg$_{\mathrm{^{100}Mo}}\cdot$year) and the total background rate near the Q-value was 0.025 $\pm$ 0.002 counts/keV/kg/year. We observed no indication of $0νββ$ decay and report a new lower limit of the half-life of $^{100}$Mo $0νββ$ decay as $ T^{0ν}_{1/2}>3.0\times10^{24}~\mathrm{years}$ at 90\% confidence level. The effective Majorana mass limit range is $m_{ββ}<$(210--610) meV using nuclear matrix elements estimated in the framework of different models, including the recent shell model calculations.
Submitted 8 July, 2024;
originally announced July 2024.
-
Read, Watch and Scream! Sound Generation from Text and Video
Authors:
Yujin Jeong,
Yunji Kim,
Sanghyuk Chun,
Jiyoung Lee
Abstract:
Multimodal generative models have shown impressive advances with the help of powerful diffusion models. Despite the progress, generating sound solely from text poses challenges in ensuring comprehensive scene depiction and temporal alignment. Meanwhile, video-to-sound generation limits the flexibility to prioritize sound synthesis for specific objects within the scene. To tackle these challenges, we propose a novel video-and-text-to-sound generation method, called ReWaS, where video serves as a conditional control for a text-to-audio generation model. Our method estimates the structural information of audio (namely, energy) from the video while receiving key content cues from a user prompt. We employ a well-performing text-to-sound model to consolidate the video control, which is much more efficient for training multimodal diffusion models with massive triplet-paired (audio-video-text) data. In addition, by separating the generative components of audio, it becomes a more flexible system that allows users to freely adjust the energy, surrounding environment, and primary sound source according to their preferences. Experimental results demonstrate that our method shows superiority in terms of quality, controllability, and training efficiency. Our demo is available at https://naver-ai.github.io/rewas
Submitted 7 July, 2024;
originally announced July 2024.
-
Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (349 additional authors not shown)
Abstract:
We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.
Submitted 6 July, 2024;
originally announced July 2024.
-
3D Adaptive Structural Convolution Network for Domain-Invariant Point Cloud Recognition
Authors:
Younggun Kim,
Beomsik Cho,
Seonghoon Ryoo,
Soomok Lee
Abstract:
Adapting deep learning networks for point cloud data recognition in self-driving vehicles faces challenges due to the variability in datasets and sensor technologies, emphasizing the need for adaptive techniques to maintain accuracy across different conditions. In this paper, we introduce the 3D Adaptive Structural Convolution Network (3D-ASCN), a cutting-edge framework for 3D point cloud recognition. It combines 3D convolution kernels, a structural tree structure, and adaptive neighborhood sampling for effective geometric feature extraction. This method obtains domain-invariant features and demonstrates robust, adaptable performance on a variety of point cloud datasets, ensuring compatibility across diverse sensor configurations without the need for parameter adjustments. This highlights its potential to significantly enhance the reliability and efficiency of self-driving vehicle technology.
Submitted 5 July, 2024;
originally announced July 2024.
-
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
Authors:
Hyunsu Kim,
Yegon Kim,
Hongseok Yang,
Juho Lee
Abstract:
Group Equivariant CNNs (G-CNNs) have shown promising efficacy in various tasks, owing to their ability to capture hierarchical features in an equivariant manner. However, their equivariance is fixed to the symmetry of the whole group, limiting adaptability to diverse partial symmetries in real-world datasets, such as limited rotation symmetry of handwritten digit images and limited color-shift symmetry of flower images. Recent efforts address this limitation, one example being Partial G-CNN which restricts the output group space of convolution layers to break full equivariance. However, such an approach still fails to adjust equivariance levels across data. In this paper, we propose a novel approach, Variational Partial G-CNN (VP G-CNN), to capture varying levels of partial equivariance specific to each data instance. VP G-CNN redesigns the distribution of the output group elements to be conditioned on input data, leveraging variational inference to avoid overfitting. This enables the model to adjust its equivariance levels according to the needs of individual data points. Additionally, we address training instability inherent in discrete group equivariance models by redesigning the reparametrizable distribution. We demonstrate the effectiveness of VP G-CNN on both toy and real-world datasets, including MNIST67-180, CIFAR10, ColorMNIST, and Flowers102. Our results show robust performance, even in uncertainty metrics.
Submitted 5 July, 2024;
originally announced July 2024.
-
Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector
Authors:
Belle Collaboration,
E. Kovalenko,
I. Adachi,
H. Aihara,
D. M. Asner,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
K. Belous,
J. Bennett,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
A. Bondar,
A. Bozek,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
M. Campajola,
M. -C. Chang,
B. G. Cheon
, et al. (142 additional authors not shown)
Abstract:
We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$~fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider.
Submitted 4 July, 2024;
originally announced July 2024.
-
Dimensionality Engineering of Magnetic Anisotropy from Anomalous Hall Effect in Synthetic SrRuO3 Crystals
Authors:
Seung Gyo Jeong,
Seong Won Cho,
Sehwan Song,
Jin Young Oh,
Do Gyeom Jeong,
Gyeongtak Han,
Hu Young Jeong,
Ahmed Yousef Mohamed,
Woo-suk Noh,
Sungkyun Park,
Jong Seok Lee,
Suyoun Lee,
Young-Min Kim,
Deok-Yong Cho,
Woo Seok Choi
Abstract:
Magnetic anisotropy in atomically thin correlated heterostructures is essential for exploring quantum magnetic phases for next-generation spintronics. Whereas previous studies have mostly focused on van der Waals systems, here we investigate the impact of dimensionality of epitaxially grown correlated oxides down to the monolayer limit on structural, magnetic, and orbital anisotropies. By designing oxide superlattices with correlated ferromagnetic SrRuO3 and nonmagnetic SrTiO3 layers, we observed modulated ferromagnetic behavior with the change of the SrRuO3 thickness. In particular, for three-unit-cell-thick layers, we observe a significant 1,500% improvement of coercive field in the anomalous Hall effect, which cannot be solely attributed to the dimensional crossover in ferromagnetism. The atomic-scale heterostructures further reveal the systematic modulation of anisotropy for the lattice structure and orbital hybridization, explaining the enhanced magnetic anisotropy. Our findings provide valuable insights into engineering the anisotropic hybridization of synthetic magnetic crystals, offering a tunable spin order for various applications.
Submitted 3 July, 2024;
originally announced July 2024.
-
Error mitigation with stabilized noise in superconducting quantum processors
Authors:
Youngseok Kim,
Luke C. G. Govia,
Andrew Dane,
Ewout van den Berg,
David M. Zajac,
Bradley Mitchell,
Yinyu Liu,
Karthik Balakrishnan,
George Keefe,
Adam Stabile,
Emily Pritchett,
Jiri Stehlik,
Abhinav Kandala
Abstract:
Pre-fault tolerant quantum computers have already demonstrated the ability to estimate observable values accurately, at a scale beyond brute-force classical computation. This has been enabled by error mitigation techniques that often rely on a representative model of the device noise. However, learning and maintaining these models is complicated by fluctuations in the noise over unpredictable time scales, for instance, arising from resonant interactions between superconducting qubits and defect two-level systems (TLS). Such interactions affect the stability and uniformity of device performance as a whole, but also affect the noise model accuracy, leading to incorrect observable estimation. Here, we experimentally demonstrate that tuning of the qubit-TLS interactions helps reduce noise instabilities and consequently enables more reliable error-mitigation performance. These experiments provide a controlled platform for studying the performance of error mitigation in the presence of quasi-static noise. We anticipate that the capabilities introduced here will be crucial for the exploration of quantum applications on solid-state processors at non-trivial scales.
Submitted 5 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment
Authors:
The Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (382 additional authors not shown)
Abstract:
A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.
Submitted 1 July, 2024;
originally announced July 2024.
-
Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle
Authors:
Belle Collaboration,
Z. S. Stottler,
T. K. Pedlar,
B. G. Fulsom,
I. Adachi,
K. Adamczyk,
H. Aihara,
S. Al Said,
D. M. Asner,
H. Atmacan,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
M. Bauer,
P. Behera,
K. Belous,
J. Bennett,
F. Bernlochner,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
G. Bonvicini
, et al. (157 additional authors not shown)
Abstract:
We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $B\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $B\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $B\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.
Submitted 8 July, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Authors:
Gihun Lee,
Minchan Jeong,
Yujin Kim,
Hojung Jung,
Jaehoon Oh,
Sangmook Kim,
Se-Young Yun
Abstract:
While learning to align Large Language Models (LLMs) with human preferences has shown remarkable success, aligning these models to meet diverse user preferences presents further challenges in preserving previous knowledge. This paper examines the impact of personalized preference optimization on LLMs, revealing that the extent of knowledge loss varies significantly with preference heterogeneity. Although previous approaches have utilized the KL constraint between the reference model and the policy model, we observe that they fail to maintain general knowledge and alignment when facing personalized preferences. To this end, we introduce Base-Anchored Preference Optimization (BAPO), a simple yet effective approach that utilizes the initial responses of the reference model to mitigate forgetting while accommodating personalized alignment. BAPO effectively adapts to diverse user preferences while minimally affecting global knowledge or general alignment. Our experiments demonstrate the efficacy of BAPO in various setups.
Submitted 30 June, 2024;
originally announced July 2024.
-
Refined approaches in second leptogenesis for the baryon-lepton asymmetry discrepancy
Authors:
YeolLin ChoeJo,
Kazuki Enomoto,
Yechan Kim,
Hye-Sung Lee
Abstract:
The temperature-dependent mass of the heavy neutrino can lead to the second leptogenesis occurring below the electroweak scale, potentially explaining the large discrepancy between baryon and lepton asymmetries. We investigate this scenario further, exploring the intricate interplay of the weak interaction processes within this framework. It includes notable shifts in the dominant decay channels of heavy neutrinos around the electroweak symmetry breaking, along with the resonance behavior of the scattering processes near the $W/Z$ mass. The $CP$ asymmetry can also vary over cosmic history due to the temperature-dependent mass, allowing the $B-L$ asymmetry generation to be amplified in the late epoch. These findings elucidate how such alterations in the dynamics of second leptogenesis contribute to addressing the observed discrepancies in baryon-lepton asymmetry.
Submitted 28 June, 2024;
originally announced June 2024.
-
ZTF SN Ia DR2: The secondary maximum in Type Ia supernovae
Authors:
M. Deckers,
K. Maguire,
L. Shingles,
G. Dimitriadis,
M. Rigault,
M. Smith,
A. Goobar,
J. Nordin,
J. Johansson,
M. Amenouche,
U. Burgaz,
S. Dhawan,
M. Ginolin,
L. Harvey,
W. D. Kenworthy,
Y. -L. Kim,
R. R. Laher,
N. Luo,
S. R. Kulkarni,
F. J. Masci,
T. E. Müller-Bravo,
P. E. Nugent,
N. Pletskova,
J. Purdum,
B. Racine
, et al. (2 additional authors not shown)
Abstract:
Type Ia supernova (SN Ia) light curves have a secondary maximum that exists in the $r$, $i$, and near-infrared filters. The secondary maximum is relatively weak in the $r$ band, but holds the advantage that it is accessible, even at high redshift. We used Gaussian Process fitting to parameterise the light curves of 893 SNe Ia from the Zwicky Transient Facility's (ZTF) second data release (DR2), and we were able to extract information about the timing and strength of the secondary maximum. We found $>5σ$ correlations between the light curve decline rate ($Δm_{15}(g)$) and the timing and strength of the secondary maximum in the $r$ band. Whilst the timing of the secondary maximum in the $i$ band also correlates with $Δm_{15}(g)$, the strength of the secondary maximum in the $i$ band shows significant scatter as a function of $Δm_{15}(g)$. We found that the transparency timescales of 97 per cent of our sample are consistent with double detonation models, and that SNe Ia with small transparency timescales ($<$ 32 d) reside predominantly in locally red environments. We measured the total ejected mass for the normal SNe Ia in our sample using two methods, and both were consistent with medians of $1.3\ \pm \ 0.3$ and $1.2\ \pm\ 0.2$ solar masses. We find that the strength of the secondary maximum is a better standardisation parameter than the SALT light curve stretch ($x_1$). We also identified a spectral feature in the $r$ band as Fe II, which strengthens during the onset of the secondary maximum. The same feature begins to strengthen at $<$ 3 d post maximum light in 91bg-like SNe. Finally, the correlation between $x_1$ and the strength of the secondary maximum was best fit with a broken line, with a split at $x_1^0\ =\ -0.5\ \pm\ 0.2$, suggestive of the existence of two populations of SNe Ia.
Submitted 27 June, 2024;
originally announced June 2024.
-
VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation
Authors:
Yixiao Song,
Yekyung Kim,
Mohit Iyyer
Abstract:
Existing metrics for evaluating the factuality of long-form text, such as FACTSCORE (Min et al., 2023) and SAFE (Wei et al., 2024), decompose an input text into "atomic claims" and verify each against a knowledge base like Wikipedia. These metrics are not suitable for most generation tasks because they assume that every claim is verifiable (i.e., can plausibly be proven true or false). We address this issue with VERISCORE, a metric for diverse long-form generation tasks that contain both verifiable and unverifiable content. VERISCORE can be effectively implemented with either closed or fine-tuned open-weight language models, and human evaluation confirms that VERISCORE's extracted claims are more sensible than those from competing methods across eight different long-form tasks. We use VERISCORE to evaluate generations from 16 different models across multiple long-form tasks and find that while GPT-4o is the best-performing model overall, open-weight models such as Mixtral-8x22 are closing the gap. We show that an LM's VERISCORE on one task (e.g., biography generation) does not necessarily correlate to its VERISCORE on a different task (e.g., long-form QA), highlighting the need for expanding factuality evaluation across tasks with varying fact density.
Submitted 27 June, 2024;
originally announced June 2024.
-
Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs
Authors:
Lokesh Mishra,
Sohayl Dhibi,
Yusik Kim,
Cesar Berrospi Ramis,
Shubham Gupta,
Michele Dolfi,
Peter Staar
Abstract:
Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in the table structure as well as content. We propose Statements, a novel domain agnostic data structure for extracting quantitative facts and related information. We propose translating tables to statements as a new supervised deep-learning universal information extraction task. We introduce SemTabNet - a dataset of over 100K annotated tables. Investigating a family of T5-based Statement Extraction Models, our best model generates statements which are 82% similar to the ground-truth (compared to baseline of 21%). We demonstrate the advantages of statements by applying our model to over 2700 tables from ESG reports. The homogeneous nature of statements permits exploratory data analysis on expansive information found in large collections of ESG reports.
Submitted 27 June, 2024;
originally announced June 2024.
-
Binary neutron star mergers using a discontinuous Galerkin-finite difference hybrid method
Authors:
Nils Deppe,
Francois Foucart,
Marceline S. Bonilla,
Michael Boyle,
Nicholas J. Corso,
Matthew D. Duez,
Matthew Giesler,
François Hébert,
Lawrence E. Kidder,
Yoonsoo Kim,
Prayush Kumar,
Isaac Legred,
Geoffrey Lovelace,
Elias R. Most,
Jordan Moxon,
Kyle C. Nelli,
Harald P. Pfeiffer,
Mark A. Scheel,
Saul A. Teukolsky,
William Throwe,
Nils L. Vu
Abstract:
We present a discontinuous Galerkin-finite difference hybrid scheme that allows high-order shock capturing with the discontinuous Galerkin method for general relativistic magnetohydrodynamics in dynamical spacetimes. We present several optimizations and stability improvements to our algorithm that allow the hybrid method to successfully simulate single, rotating, and binary neutron stars. The hybrid method achieves the efficiency of discontinuous Galerkin methods throughout almost the entire spacetime during the inspiral phase, while being able to robustly capture shocks and resolve the stellar surfaces. We also use Cauchy-Characteristic evolution to compute the first gravitational waveforms at future null infinity from binary neutron star mergers. The simulations presented here are the first successful binary neutron star inspiral and merger simulations using discontinuous Galerkin methods.
Submitted 27 June, 2024;
originally announced June 2024.
-
Magnetic Field Response of Dipolar-Octupolar Quantum Spin Ice
Authors:
Zhengbang Zhou,
Félix Desrochers,
Yong Baek Kim
Abstract:
Dipolar-octupolar (DO) pyrochlore systems Ce$_2$(Zr,Sn,Hf)$_2$O$_7$ have garnered much attention as recent investigations suggest that they may stabilize a novel quantum spin ice (QSI), a quantum spin liquid (QSL) with an emergent $U(1)$ gauge field. In particular, the experimentally estimated microscopic exchange parameters place Ce$_2$Zr$_2$O$_7$ in the $π$-flux QSI regime, and recent neutron scattering experiments have corroborated some key theoretical predictions. On the other hand, to make a definitive conclusion, more multifaceted experimental signatures are desirable. In this regard, recent neutron scattering investigation of the magnetic field dependence of the spin correlations in Ce$_2$Zr$_2$O$_7$ may provide valuable information. However, there have not been any comprehensive theoretical studies for comparison. In this work, we provide such information using gauge mean-field theory (GMFT), allowing for theoretical investigation beyond the perturbative regime. In particular, we construct the phase diagrams for the [110], [111], and [001] field directions. Furthermore, we demonstrate the distinctive evolution of the equal-time and dynamical spin structure factors as a function of the magnetic field for each field direction. These predictions will help future experiments confirm the true nature of the DO-QSI.
Submitted 26 June, 2024;
originally announced June 2024.
-
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance
Authors:
Younghyun Kim,
Geunmin Hwang,
Junyu Zhang,
Eunbyung Park
Abstract:
The recent surge in large-scale generative models has spurred the development of vast fields in computer vision. In particular, text-to-image diffusion models have garnered widespread adoption across diverse domains due to their potential for high-fidelity image generation. Nonetheless, existing large-scale diffusion models are confined to generating images of up to 1K resolution, which is far from meeting the demands of contemporary commercial applications. Directly sampling higher-resolution images often yields results marred by artifacts such as object repetition and distorted shapes. Addressing the aforementioned issues typically necessitates training or fine-tuning models on higher-resolution datasets. However, this undertaking poses a formidable challenge due to the difficulty of collecting large-scale high-resolution content and the substantial computational resources required. While several preceding works have proposed alternatives, they often fail to produce convincing results. In this work, we probe the generative ability of diffusion models at resolutions beyond their original capability and propose a novel progressive approach that fully utilizes a generated low-resolution image to guide the generation of a higher-resolution image. Our method obviates the need for additional training or fine-tuning, which significantly lowers the burden of computational costs. Extensive experiments and results validate the efficiency and efficacy of our method. Project page: https://yhyun225.github.io/DiffuseHigh/
Submitted 11 July, 2024; v1 submitted 26 June, 2024;
originally announced June 2024.
-
Categorification of quantum Borcherds-Bozec algebras
Authors:
Seok-Jin Kang,
Young Rock Kim,
Bolun Tong
Abstract:
We categorify the quantum Borcherds-Bozec algebras by constructing their associated Khovanov-Lauda-Rouquier algebras.
Submitted 26 June, 2024;
originally announced June 2024.
-
Scalp Diagnostic System With Label-Free Segmentation and Training-Free Image Translation
Authors:
Youngmin Kim,
Saejin Kim,
Hoyeon Moon,
Youngjae Yu,
Junhyug Noh
Abstract:
Scalp diseases and alopecia affect millions of people around the world, underscoring the urgent need for early diagnosis and management of the disease. However, the development of a comprehensive AI-based diagnosis system encompassing these conditions remains an underexplored domain due to the challenges associated with data imbalance and the costly nature of labeling. To address these issues, we propose ScalpVision, an AI-driven system for the holistic diagnosis of scalp diseases and alopecia. In ScalpVision, effective hair segmentation is achieved using pseudo image-label pairs and an innovative prompting method in the absence of traditional hair masking labels. This approach is crucial for extracting key features such as hair thickness and count, which are then used to assess alopecia severity. Additionally, ScalpVision introduces DiffuseIT-M, a generative model adept at dataset augmentation while maintaining hair information, facilitating improved predictions of scalp disease severity. Our experimental results affirm ScalpVision's efficiency in diagnosing a variety of scalp conditions and alopecia, showcasing its potential as a valuable tool in dermatological care.
Submitted 25 June, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Achieving Fairness Across Local and Global Models in Federated Learning
Authors:
Disha Makhija,
Xing Han,
Joydeep Ghosh,
Yejin Kim
Abstract:
Achieving fairness across diverse clients in Federated Learning (FL) remains a significant challenge due to the heterogeneity of the data and the inaccessibility of sensitive attributes from clients' private datasets. This study addresses this issue by introducing \texttt{EquiFL}, a novel approach designed to enhance both local and global fairness in federated learning environments. \texttt{EquiFL} incorporates a fairness term into the local optimization objective, effectively balancing local performance and fairness. The proposed coordination mechanism also prevents bias from propagating across clients during the collaboration phase. Through extensive experiments across multiple benchmarks, we demonstrate that \texttt{EquiFL} not only strikes a better balance between accuracy and fairness locally at each client but also achieves global fairness. The results also indicate that \texttt{EquiFL} ensures uniform performance distribution among clients, thus contributing to performance fairness. Furthermore, we showcase the benefits of \texttt{EquiFL} in a real-world distributed dataset from a healthcare application, specifically in predicting the effects of treatments on patients across various hospital locations.
Submitted 24 June, 2024;
originally announced June 2024.
-
Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection
Authors:
Choonghyun Park,
Hyuhng Joon Kim,
Junyeob Kim,
Youna Kim,
Taeuk Kim,
Hyunsoo Cho,
Hwiyeol Jo,
Sang-goo Lee,
Kang Min Yoo
Abstract:
AI Generated Text (AIGT) detectors are developed with texts from humans and LLMs on common tasks. Despite the diversity of plausible prompt choices, these datasets are generally constructed with a limited number of prompts. The lack of prompt variation can introduce prompt-specific shortcut features that exist in data collected with the chosen prompt, but do not generalize to others. In this paper, we analyze the impact of such shortcuts in AIGT detection. We propose Feedback-based Adversarial Instruction List Optimization (FAILOpt), an attack that searches for instructions deceptive to AIGT detectors by exploiting prompt-specific shortcuts. FAILOpt effectively drops the detection performance of the target detector, comparable to other attacks based on adversarial in-context examples. We also utilize our method to enhance the robustness of the detector by mitigating the shortcuts. Based on the findings, we further train the classifier with a dataset augmented by FAILOpt prompts. The augmented classifier exhibits improvements across generation models, tasks, and attacks. Our code will be available at https://github.com/zxcvvxcz/FAILOpt.
Submitted 23 June, 2024;
originally announced June 2024.
-
Search for charmed baryons in the $Λ_c^+η$ system and measurement of the branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ and $pD^0$ relative to $Σ_c(2455)π$
Authors:
Belle Collaboration,
S. X. Li,
C. P. Shen,
I. Adachi,
J. K. Ahn,
H. Aihara,
D. M. Asner,
H. Atmacan,
T. Aushev,
R. Ayad,
Sw. Banerjee,
K. Belous,
J. Bennett,
M. Bessner,
T. Bilka,
D. Biswas,
D. Bodrov,
A. Bozek,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
M. Campajola,
M. -C. Chang,
B. G. Cheon
, et al. (102 additional authors not shown)
Abstract:
We search for excited charmed baryons in the $Λ_c^+η$ system using a data sample corresponding to an integrated luminosity of 980 $\rm fb^{-1}$. The data were collected by the Belle detector at the KEKB $e^{+}$$e^{-}$ asymmetric-energy collider. No significant signals are found in the $Λ_c^+η$ mass spectrum, including the known $Λ_c(2880)^+$ and $Λ_c(2940)^+$. Clear $Λ_c(2880)^+$ and $Λ_c(2940)^+$ signals are observed in the $pD^0$ mass spectrum. We set upper limits at 90\% credibility level on ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ relative to $Σ_c(2455)π$ of $<0.13$ for the $Λ_c(2880)^+$ and $<1.11$ for the $Λ_c(2940)^+$. We measure ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $pD^0$ relative to $Σ_c(2455)π$ of $0.75 \pm 0.03(\text{stat.}) \pm 0.07(\text{syst.})$ for the $Λ_c(2880)^+$ and $3.59 \pm 0.21(\text{stat.}) \pm 0.56(\text{syst.})$ for the $Λ_c(2940)^+$.
Submitted 22 June, 2024;
originally announced June 2024.
-
Observation of a non-Hermitian supersonic mode
Authors:
Yuxuan Zhang,
Juan Carrasquilla,
Yong Baek Kim
Abstract:
Quantum computers have long been anticipated to excel in simulating quantum many-body physics. While most previous work has focused on Hermitian physics, we demonstrate the power of variational quantum circuits for resource-efficient simulations of dynamical and equilibrium physics in non-Hermitian systems, revealing new phenomena beyond standard Hermitian quantum machines. Using a variational quantum compilation scheme for fermionic systems, we reduce gate count, save qubits, and eliminate the need for postselection, a major challenge in simulating non-Hermitian dynamics via standard Trotterization. Experimentally, we observed a supersonic mode in the connected density-density correlation function on an $ n = 18 $ fermionic chain after a non-Hermitian, locally interacting quench, which would otherwise be forbidden by the Lieb-Robinson bound in a Hermitian system. Additionally, we investigate sequential quantum circuits generated by tensor networks for ground state preparation, here defined as the eigenstate with the lowest real part eigenvalue, using a variance minimization scheme. Through a trapped-ion implementation on the Quantinuum H1 quantum processor, we accurately capture correlation functions and energies across an exceptional point on a dissipative spin chain up to length $ n = 20 $ using only 3 qubits. Motivated by these advancements, we provide an analytical example demonstrating that simulating single-qubit non-Hermitian dynamics for $Θ(\log(n))$ time from certain initial states is exponentially hard on a quantum computer, offering insights into the opportunities and limitations of using quantum computation for simulating non-Hermitian physics.
Submitted 21 June, 2024;
originally announced June 2024.
-
Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction
Authors:
Jinge Wu,
Zhaolong Wu,
Abul Hasan,
Yunsoo Kim,
Jason P. Y. Cheung,
Teng Zhang,
Honghan Wu
Abstract:
This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques. The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources. A three-stage inference process is introduced, decomposing the task into error detection, localization, and correction subtasks, which enhances the explainability and performance of the system. The effectiveness of the approach is evaluated using a benchmark dataset created by corrupting real-world radiology reports with realistic errors, guided by domain experts. Experimental results demonstrate the benefits of the proposed methods, with the combination of internal and external retrieval significantly improving the accuracy of error detection, localization, and correction across various state-of-the-art LLMs. The findings contribute to the development of more robust and reliable error correction systems for clinical documentation.
Submitted 21 June, 2024;
originally announced June 2024.
-
High-Tc superconductor candidates proposed by machine learning
Authors:
Siwoo Lee,
Jason Hattrick-Simpers,
Young-June Kim,
O. Anatole von Lilienfeld
Abstract:
We cast the relation between chemical compositions of solid-state materials and their superconducting critical temperature (Tc) in terms of a statistical learning problem with reduced complexity. Training of query-aware similarity-based ridge regression models on experimental SuperCon data with (implicit) and without (ambient) high pressure entries achieves average Tc prediction errors of ~10 K for unseen out-of-sample materials. Subsequent utilization of the approach to scan ~153k materials in the Materials Project enables the ranking of candidates by Tc while taking into account thermodynamic stability and small band gap. Stable top three high-Tc candidate materials with large band gaps for implicit and ambient pressures are predicted to be Cs2Sn(H2N)6 (324 K), CsH5N2 (315 K), Rb2Sn(H2N)6 (305 K), and H15IrBr3N5 (189 K), H12OsN5Cl3O (161 K), B10H13I (151 K), respectively. Stable top three high-Tc candidate materials with small band gaps for implicit and ambient pressures are predicted to be RbLiH12Se3N4 (255 K), CeH14Cl3O7 (246 K), Li(H3N)4 (234 K), and ReH30Ru2(NCl)10 (127 K), AlH18Ru(NF)6 (120 K), Sr(Li2P)2 (117 K), respectively.
Submitted 20 June, 2024;
originally announced June 2024.
-
12C+12C Reaction Rates and the Evolution of a Massive Star
Authors:
Gwangeon Seong,
Yubin Kim,
Kyujin Kwak,
Sunghoon Ahn,
Chaeyeon Park,
Kevin Insik Hahn,
Chunglee Kim
Abstract:
Carbon fusion is important for understanding the late stages in the evolution of a massive star. Astronomically interesting energy ranges for the 12C+12C reactions have, however, been poorly constrained by experiments. Theoretical studies on stellar evolution have relied on reaction rates that are extrapolated from those measured at higher energies. In this work, we update the carbon fusion reaction rates by fitting the astrophysical S-factor data obtained from direct measurements based on the Fowler, Caughlan, & Zimmerman (1975) formula. We examine the evolution of a 20 M_sun star with the updated 12C+12C reaction rates by performing simulations with the MESA (Modules for Experiments in Stellar Astrophysics) code. Between 0.5 and 1 GK, the updated reaction rates are 0.35 to 0.5 times less than the rates suggested by Caughlan and Fowler (1988). The updated rates result in an increase of the core temperature by about 7% and of the neutrino cooling by about a factor of three. Moreover, the carbon-burning lifetime is reduced by a factor of 2.7. Although the updated carbon fusion reaction rates lead to some changes in the details of the stellar evolution model, their impact seems relatively minor compared to other uncertain physical factors like convection, overshooting, rotation, and mass-loss history. The astrophysical S-factor measurements at lower energies have large errors below the Coulomb barrier. More precise measurements at lower energies for carbon burning would be useful to improve our study and to understand the evolution of a massive star.
Submitted 19 June, 2024;
originally announced June 2024.
-
Mode Coupling and Breathing Oscillation in Partially Magnetized Cross-Field Plasmas
Authors:
Jong Yoon Park,
June Young Kim
Abstract:
We report on investigations of mode coupling between rotating spokes during the onset of the breathing oscillation. Demonstrating the existence of nonlinear coupling between the sporadic spokes and the breathing oscillation, we suggest the oscillating azimuthal electric field as the energy source for additional ionization within the plasma. Our results indicate that intermittent three-wave coupling is a possible mechanism for triggering low-frequency breathing oscillations in partially magnetized cross-field plasma.
Submitted 18 June, 2024;
originally announced June 2024.
-
Meent: Differentiable Electromagnetic Simulator for Machine Learning
Authors:
Yongha Kim,
Anthony W. Jung,
Sanmun Kim,
Kevin Octavian,
Doyoung Heo,
Chaejin Park,
Jeongmin Shin,
Sunghyun Nam,
Chanhyung Park,
Juho Park,
Sangjun Han,
Jinmyoung Lee,
Seolho Kim,
Min Seok Jang,
Chan Y. Park
Abstract:
Electromagnetic (EM) simulation plays a crucial role in analyzing and designing devices with sub-wavelength scale structures such as solar cells, semiconductor devices, image sensors, future displays and integrated photonic devices. Specifically, optics problems such as estimating semiconductor device structures and designing nanophotonic devices provide intriguing research topics with far-reaching real-world impact. Traditional algorithms for such tasks require iteratively refining parameters through simulations, which often yield sub-optimal results due to the high computational cost of both the algorithms and EM simulations. Machine learning (ML) has emerged as a promising candidate to mitigate these challenges, and the optics research community has increasingly adopted ML algorithms to obtain results surpassing classical methods across various tasks. To foster a synergistic collaboration between the optics and ML communities, it is essential to have EM simulation software that is user-friendly for both research communities. To this end, we present Meent, an EM simulation software package that employs rigorous coupled-wave analysis (RCWA). Developed in Python and equipped with automatic differentiation (AD) capabilities, Meent serves as a versatile platform for integrating ML into optics research and vice versa. To demonstrate its utility as a research platform, we present three applications of Meent: 1) generating a dataset for training a neural operator, 2) serving as an environment for the reinforcement learning of nanophotonic device optimization, and 3) providing a solution for inverse problems with gradient-based optimizers. These applications highlight Meent's potential to advance both EM simulation and ML methodologies. The code is available at https://github.com/kc-ml2/meent with the MIT license to promote the cross-pollination of ideas among academic researchers and industry practitioners.
Submitted 11 June, 2024;
originally announced June 2024.
-
The Design, Implementation, and Performance of the LZ Calibration Systems
Authors:
J. Aalbers,
D. S. Akerib,
A. K. Al Musalhi,
F. Alder,
C. S. Amarasinghe,
A. Ames,
T. J. Anderson,
N. Angelides,
H. M. Araújo,
J. E. Armstrong,
M. Arthurs,
A. Baker,
S. Balashov,
J. Bang,
E. E. Barillier,
J. W. Bargemann,
K. Beattie,
T. Benson,
A. Bhatti,
A. Biekert,
T. P. Biesiadzinski,
H. J. Birch,
E. Bishop,
G. M. Blockinger,
B. Boxer
, et al. (179 additional authors not shown)
Abstract:
LUX-ZEPLIN (LZ) is a tonne-scale experiment searching for direct dark matter interactions and other rare events. It is located at the Sanford Underground Research Facility (SURF) in Lead, South Dakota, USA. The core of the LZ detector is a dual-phase xenon time projection chamber (TPC), designed with the primary goal of detecting Weakly Interacting Massive Particles (WIMPs) via their induced low energy nuclear recoils. Surrounding the TPC, two veto detectors immersed in an ultra-pure water tank enable reducing background events to enhance the discovery potential. Intricate calibration systems are purposely designed to precisely understand the responses of these three detector volumes to various types of particle interactions and to demonstrate LZ's ability to discriminate between signals and backgrounds. In this paper, we present a comprehensive discussion of the key features, requirements, and performance of the LZ calibration systems, which play a crucial role in enabling LZ's WIMP-search and its broad science program. The thorough description of these calibration systems, with an emphasis on their novel aspects, is valuable for future calibration efforts in direct dark matter and other rare-event search experiments.
Submitted 20 June, 2024; v1 submitted 2 May, 2024;
originally announced June 2024.
-
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
Authors:
Dongwon Jo,
Taesu Kim,
Yulhwa Kim,
Jae-Joon Kim
Abstract:
Binarization, which converts weight parameters to binary values, has emerged as an effective strategy to reduce the size of large language models (LLMs). However, typical binarization techniques significantly diminish the linguistic effectiveness of LLMs. To address this issue, we introduce a novel binarization technique called Mixture of Scales (BinaryMoS). Unlike conventional methods, BinaryMoS employs multiple scaling experts for binary weights, dynamically merging these experts for each token to adaptively generate scaling factors. This token-adaptive approach boosts the representational power of binarized LLMs by enabling contextual adjustments to the values of binary weights. Moreover, because this adaptive process only involves the scaling factors rather than the entire weight matrix, BinaryMoS maintains compression efficiency similar to traditional static binarization methods. Our experimental results reveal that BinaryMoS surpasses conventional binarization techniques in various natural language processing tasks and even outperforms 2-bit quantization methods, all while maintaining a model size similar to that of static binarization techniques.
Submitted 18 June, 2024;
originally announced June 2024.
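Illustrative note on the Mixture of Scales entry above: the abstract describes merging several scaling experts per token to produce the scaling factors applied to binary weights. The sketch below is only one plausible reading of that sentence, not the authors' implementation. The softmax router, the per-output-channel scale vectors, and all function and parameter names (`sign_binarize`, `token_adaptive_linear`, `router_w`, `scale_experts`) are assumptions introduced here for illustration.

```python
import math


def sign_binarize(weights):
    """Map every weight to +1.0 or -1.0 (zero is mapped to +1.0)."""
    return [[1.0 if w >= 0 else -1.0 for w in row] for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_adaptive_linear(x, binary_w, router_w, scale_experts):
    """Binary linear layer with token-dependent scales (hypothetical sketch).

    x:             input vector for one token, length d_in
    binary_w:      d_out x d_in matrix of +1/-1 values
    router_w:      n_experts x d_in router weights (assumed softmax gating)
    scale_experts: n_experts x d_out, one scale vector per expert
    """
    # Router: one score per expert, turned into mixing weights for this token.
    gates = softmax([sum(r * xi for r, xi in zip(row, x)) for row in router_w])
    d_out = len(binary_w)
    # Merge the experts into a single scale per output channel.
    scales = [
        sum(g * expert[j] for g, expert in zip(gates, scale_experts))
        for j in range(d_out)
    ]
    # Binary matrix-vector product, then apply the token-specific scales.
    return [
        scales[j] * sum(b * xi for b, xi in zip(binary_w[j], x))
        for j in range(d_out)
    ]
```

Only the small router and the expert scale vectors are stored in addition to the one-bit weights, which is consistent with the abstract's remark that the adaptive process involves the scaling factors rather than the entire weight matrix.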
-
Transversal CNOT gate with multi-cycle error correction
Authors:
Younghun Kim,
Martin Sevior,
Muhammad Usman
Abstract:
A scalable and programmable quantum computer holds the potential to solve computationally intensive tasks that classical computers cannot accomplish within a reasonable time frame, achieving quantum advantage. However, the vulnerability of the current generation of quantum processors to errors poses a significant challenge towards executing the complex and deep quantum circuits required for practical problems. Quantum error correction codes such as Stabilizer codes offer a promising path forward for fault-tolerant quantum computing; however, their realisation on quantum hardware is an ongoing area of research. In particular, fault-tolerant quantum processing must employ logical gates on logical qubits with error suppression using realistically large codes. This work has implemented a transversal CNOT gate between two logical qubits constructed using the Repetition code with flag qubits, and demonstrated error suppression with increasing code size under multiple rounds of error detection. By performing experiments on IBM quantum devices through cloud access, our results show that despite the potential for error propagation among logical qubits during the transversal CNOT gate operation, increasing the number of physical qubits from 21 to 39 and 57 can suppress errors, which persists over 10 rounds of error detection. Our work establishes the feasibility of employing logical CNOT gates alongside error detection on a superconductor-based processor using current generation quantum hardware.
Submitted 18 June, 2024;
originally announced June 2024.
-
MDCR: A Dataset for Multi-Document Conditional Reasoning
Authors:
Peter Baile Chen,
Yi Zhang,
Chunwei Liu,
Sejal Gupta,
Yoon Kim,
Michael Cafarella
Abstract:
The same real-life questions posed to different individuals may lead to different answers based on their unique situations. For instance, whether a student is eligible for a scholarship depends on eligibility conditions, such as major or degree required. ConditionalQA was proposed to evaluate models' capability of reading a document and answering eligibility questions, considering unmentioned conditions. However, it is limited to questions on single documents, neglecting harder cases that may require cross-document reasoning and optimization, for example, "What is the maximum number of scholarships attainable?" Such questions over multiple documents are more challenging not only because there is more context to understand, but also because the model has to (1) explore all possible combinations of unmentioned conditions and (2) understand the relationship between conditions across documents, to reason about the optimal outcome. To evaluate models' capability of answering such questions, we propose a new dataset MDCR, which can reflect real-world challenges and serve as a new test bed for complex conditional reasoning that requires optimization. We evaluate this dataset using the most recent LLMs and demonstrate their limitations in solving this task. We believe this dataset will facilitate future research in answering optimization questions with unknown conditions.
Submitted 17 June, 2024;
originally announced June 2024.
-
ZTF SN Ia DR2: Exploring SN Ia properties in the vicinity of under-dense environments
Authors:
M. Aubert,
P. Rosnet,
B. Popovic,
F. Ruppin,
M. Smith,
M. Rigault,
G. Dimitriadis,
A. Goobar,
J. Johansson,
C. Barjou-Delayre,
U. Burgaz,
B. Carreres,
F. Feinstein,
D. Fouchez,
L. Galbany,
M. Ginolin,
T. de Jaeger,
M. M. Kasliwal,
Y. -L. Kim,
L. Lacroix,
F. J. Masci,
T. E. Müller-Bravo,
B. Racine,
C. Ravoux,
N. Regnault
, et al. (7 additional authors not shown)
Abstract:
The unprecedented statistics of detected Type Ia supernovae (SNe Ia) brought by the Zwicky Transient Facility enables us to probe the impact of the Large-Scale Structure on the properties of these objects. The goal of this paper is to explore the possible impact of the under-dense part of the large-scale structure on the intrinsic SALT2 light curve properties of SNe Ia and uncover possible biases in SN Ia analyses. With a volume-limited selection of ZTF-Cosmo-DR2 Type Ia supernovae overlapping with the SDSS-DR7 survey footprint, we investigate the distribution of their properties with regard to voids detected in the SDSS-DR7 galaxy sample. We further use Voronoi volumes as proxy for local density environments within the large-scale structure. We find a moderate dependency of the stretch toward the localisation around the void centre and none when considering colour. The local Voronoi volumes mostly affect the fraction of low/high stretch supernovae. With the current statistics available, we consider that the impact of high or low local density environment can be considered as a proxy for the colour of the host galaxy. Under-dense environments should not cause any biases in supernova analyses.
Submitted 17 June, 2024;
originally announced June 2024.