-
Kinetic Typography Diffusion Model
Authors:
Seonmi Park,
Inhwan Bae,
Seunghyun Shin,
Hae-Gon Jeon
Abstract:
This paper introduces a method for realistic kinetic typography that generates user-preferred animatable 'text content'. We draw on recent advances in guided video diffusion models to achieve visually pleasing text appearances. To do this, we first construct a kinetic typography dataset comprising about 600K videos. Our dataset is built from a variety of combinations of 584 templates designed by professional motion graphics designers and involves changing each letter's position, glyph, and size (e.g., flying, glitches, chromatic aberration, and reflecting effects). Next, we propose a video diffusion model for kinetic typography, which must meet three requirements: aesthetic appearances, motion effects, and readable letters. To address them, we present static and dynamic captions used as spatial and temporal guidance of a video diffusion model, respectively. The static caption describes the overall appearance of the video, such as colors, texture, and glyph, which represents the shape of each letter. The dynamic caption accounts for the movements of letters and backgrounds. We add one more form of guidance with zero convolution to determine which text content should be visible in the video: we apply the zero convolution to the text content and impose it on the diffusion model. Lastly, our glyph loss, which only minimizes the difference between the predicted word and its ground truth, is proposed to make the predicted letters readable. Experiments show that our model generates kinetic typography videos with legible and artistic letter motions based on text prompts.
Submitted 15 July, 2024;
originally announced July 2024.
-
Supernova Pointing Capabilities of DUNE
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.
Submitted 14 July, 2024;
originally announced July 2024.
-
Isotropy of cosmic rays beyond $10^{20}$ eV favors their heavy mass composition
Authors:
Telescope Array Collaboration,
R. U. Abbasi,
Y. Abe,
T. Abu-Zayyad,
M. Allen,
Y. Arai,
R. Arimura,
E. Barcikowski,
J. W. Belz,
D. R. Bergman,
S. A. Blake,
I. Buckland,
B. G. Cheon,
M. Chikawa,
T. Fujii,
K. Fujisue,
K. Fujita,
R. Fujiwara,
M. Fukushima,
G. Furlich,
N. Globus,
R. Gonzalez,
W. Hanlon,
N. Hayashida,
H. He
, et al. (118 additional authors not shown)
Abstract:
We report an estimation of the injected mass composition of ultra-high energy cosmic rays (UHECRs) at energies higher than 10 EeV. The composition is inferred from an energy-dependent sky distribution of UHECR events observed by the Telescope Array surface detector by comparing it to the Large Scale Structure of the local Universe. In the case of negligible extra-galactic magnetic fields the results are consistent with a relatively heavy injected composition at E ~ 10 EeV that becomes lighter up to E ~ 100 EeV, while the composition at E > 100 EeV is very heavy. The latter is true even in the presence of highest experimentally allowed extra-galactic magnetic fields, while the composition at lower energies can be light if a strong EGMF is present. The effect of the uncertainty in the galactic magnetic field on these results is subdominant.
Submitted 3 July, 2024; v1 submitted 27 June, 2024;
originally announced June 2024.
-
Mass composition of ultra-high energy cosmic rays from distribution of their arrival directions with the Telescope Array
Authors:
Telescope Array Collaboration,
R. U. Abbasi,
Y. Abe,
T. Abu-Zayyad,
M. Allen,
Y. Arai,
R. Arimura,
E. Barcikowski,
J. W. Belz,
D. R. Bergman,
S. A. Blake,
I. Buckland,
B. G. Cheon,
M. Chikawa,
T. Fujii,
K. Fujisue,
K. Fujita,
R. Fujiwara,
M. Fukushima,
G. Furlich,
N. Globus,
R. Gonzalez,
W. Hanlon,
N. Hayashida,
H. He
, et al. (118 additional authors not shown)
Abstract:
We use a new method to estimate the injected mass composition of ultrahigh energy cosmic rays (UHECRs) at energies higher than 10 EeV. The method is based on a comparison of the energy-dependent distribution of cosmic ray arrival directions as measured by the Telescope Array experiment (TA) with that calculated in a given putative model of UHECR under the assumption that sources trace the large-scale structure (LSS) of the Universe. As we report in the companion letter, the TA data show large deflections with respect to the LSS which can be explained, assuming small extra-galactic magnetic fields (EGMF), by an intermediate composition changing to a heavy one (iron) in the highest energy bin. Here we show that these results are robust to uncertainties in UHECR injection spectra, the energy scale of the experiment and galactic magnetic fields (GMF). The assumption of weak EGMF, however, strongly affects this interpretation at all but the highest energies E > 100 EeV, where the remarkable isotropy of the data implies a heavy injected composition even in the case of strong EGMF. This result also holds if UHECR sources are as rare as $2 \times 10^{-5}$ Mpc$^{-3}$, which is the conservative lower limit for the source number density.
Submitted 3 July, 2024; v1 submitted 27 June, 2024;
originally announced June 2024.
-
Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
Authors:
Sungbin Shin,
Wonpyo Park,
Jaeho Lee,
Namhoon Lee
Abstract:
This work suggests fundamentally rethinking the current practice of pruning large language models (LLMs). The way it is done is by divide and conquer: split the model into submodels, sequentially prune them, and reconstruct predictions of the dense counterparts on small calibration data one at a time; the final model is obtained simply by putting the resulting sparse submodels together. While this approach enables pruning under memory constraints, it generates high reconstruction errors. In this work, we first present an array of reconstruction techniques that can significantly reduce this error by more than $90\%$. Unwittingly, however, we discover that minimizing reconstruction error is not always ideal and can overfit the given calibration data, resulting in rather increased language perplexity and poor performance at downstream tasks. We find out that a strategy of self-generating calibration data can mitigate this trade-off between reconstruction and generalization, suggesting new directions in the presence of both benefits and pitfalls of reconstruction for pruning LLMs.
Submitted 21 June, 2024;
originally announced June 2024.
-
Anomalous Fermi pockets on Hund's metal surface of Sr2RuO4 induced by the correlation-enhanced spin-orbit coupling
Authors:
Takeshi Kondo,
Masayuki Ochi,
Shuntaro Akebi,
Yuyang Dong,
Haruka Taniguchi,
Yoshiteru Maeno,
Shik Shin
Abstract:
The electronic structure of the topmost layer in Sr2RuO4 in the close vicinity of the Fermi level is investigated by angle-resolved photoemission spectroscopy (ARPES) with a 7-eV laser. We find that the spin-orbit coupling (SOC) predicted as 100 meV by the density functional theory (DFT) calculations is enormously enhanced in a real material up to 250 meV, even more than that of bulk state (200 meV), by the electron-correlation effect increased by the octahedral rotation in the crystal structure. This causes the formation of highly orbital-mixing small Fermi pockets and reasonably explains why the orbital-selective Mott transition (OSMT) is not realized in perovskite oxides with crystal distortion. Interestingly, Hund's metal feature allows the quasiparticle generation only near EF, restricting the spectral gap opening derived by band hybridization within an extremely small binding energy (< 10 meV). Furthermore, it causes coherent-incoherent crossover, making the Fermi pockets disappear at elevated temperatures. The anomalous Fermi pockets are characterized by the dichotomy of the orbital-isolating Hund's coupling and the orbital-mixing SOC, which is key to understanding the nature of Sr2RuO4.
Submitted 19 June, 2024;
originally announced June 2024.
-
Coherence Length of Electronic Nematicity in Iron-Based Superconductors
Authors:
Yoichi Kageyama,
Asato Onishi,
Cédric Bareille,
Kousuke Ishida,
Yuta Mizukami,
Shigeyuki Ishida,
Hiroshi Eisaki,
Kenichiro Hashimoto,
Toshiyuki Taniuchi,
Shik Shin,
Hiroshi Kontani,
Takasada Shibauchi
Abstract:
Recent developments in laser-excited photoemission electron microscopy (laser-PEEM) advance the visualization of electronic nematicity and nematic domain structures in iron-based superconductors. In FeSe and BaFe$_2$(As$_{0.87}$P$_{0.13}$)$_2$ superconductors, it has been reported that the thickness of the electronic nematic domain walls is unexpectedly long, leading to the formation of mesoscopic nematicity wave [T. Shimojima $\textit{et al.}$, Science $\textbf{373}$ (2021) 1122]. This finding demonstrates that the nematic coherence length $ξ_{\rm nem}$ can be decoupled from the lattice domain wall. Here, we report that the electronic domain wall thickness shows a distinct variation in related materials: it is similarly long in FeSe$_{0.9}$S$_{0.1}$ whereas it is much shorter in undoped BaFe$_2$As$_2$. We find a correlation between the thick domain walls and the non-Fermi liquid properties of normal-state resistivity above the nematic transition temperature. This suggests that the nematic coherence length can be enhanced by underlying spin-orbital fluctuations responsible for the anomalous transport properties.
Submitted 18 June, 2024;
originally announced June 2024.
-
SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field
Authors:
Yuhang He,
Shitong Xu,
Jia-Xing Zhong,
Sangyun Shin,
Niki Trigoni,
Andrew Markham
Abstract:
We present SPEAR, a continuous receiver-to-receiver acoustic neural warping field for spatial acoustic effects prediction in an acoustic 3D space with a single stationary audio source. Unlike traditional source-to-receiver modelling methods that require prior knowledge of the space's acoustic properties to rigorously model audio propagation from source to receiver, we propose to predict by warping the spatial acoustic effects from one reference receiver position to another target receiver position, so that the warped audio essentially accommodates all spatial acoustic effects belonging to the target position. SPEAR can be trained on data that are much more readily accessible: we simply ask two robots to independently record spatial audio at different positions. We further theoretically prove the universal existence of the warping field if and only if one audio source is present. Three physical principles are incorporated to guide SPEAR network design, making the learned warping field physically meaningful. We demonstrate SPEAR's superiority on synthetic, photo-realistic, and real-world datasets, showing the huge potential of SPEAR for various downstream robotic tasks.
Submitted 16 June, 2024;
originally announced June 2024.
-
Ad Auctions for LLMs via Retrieval Augmented Generation
Authors:
MohammadTaghi Hajiaghayi,
Sébastien Lahaie,
Keivan Rezaei,
Suho Shin
Abstract:
In the field of computational advertising, the integration of ads into the outputs of large language models (LLMs) presents an opportunity to support these services without compromising content integrity. This paper introduces novel auction mechanisms for ad allocation and pricing within the textual outputs of LLMs, leveraging retrieval-augmented generation (RAG). We propose a segment auction where an ad is probabilistically retrieved for each discourse segment (paragraph, section, or entire output) according to its bid and relevance, following the RAG framework, and priced according to competing bids. We show that our auction maximizes logarithmic social welfare, a new notion of welfare that balances allocation efficiency and fairness, and we characterize the associated incentive-compatible pricing rule. These results are extended to multi-ad allocation per segment. An empirical evaluation validates the feasibility and effectiveness of our approach over several ad auction scenarios, and exhibits inherent tradeoffs in metrics as we allow the LLM more flexibility to allocate ads.
Submitted 12 June, 2024;
originally announced June 2024.
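The segment auction in the abstract above retrieves an ad for each segment "according to its bid and relevance". A minimal sketch of one way to read that follows. It assumes the retrieval probability is proportional to bid times relevance, which the abstract does not state; the function names `segment_allocation` and `sample_ad` are hypothetical, and the pricing rule is left out because the abstract does not specify it.

```python
import random


def segment_allocation(bids, relevances):
    """Return per-ad retrieval probabilities for one discourse segment.

    Assumption (not taken from the abstract): the probability of ad i is
    proportional to bids[i] * relevances[i].
    """
    weights = [b * q for b, q in zip(bids, relevances)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one ad needs a positive bid and relevance")
    return [w / total for w in weights]


def sample_ad(bids, relevances, rng=random):
    """Draw the index of the ad shown in this segment."""
    probs = segment_allocation(bids, relevances)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For weights w_i = bid_i * relevance_i, this proportional rule is the maximizer of the weighted sum of log-probabilities, sum_i w_i log x_i, over probability vectors x. Whether that objective coincides with the paper's logarithmic social welfare is an assumption of this sketch.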
-
Observation of Declination Dependence in the Cosmic Ray Energy Spectrum
Authors:
The Telescope Array Collaboration,
R. U. Abbasi,
T. Abu-Zayyad,
M. Allen,
J. W. Belz,
D. R. Bergman,
I. Buckland,
W. Campbell,
B. G. Cheon,
K. Endo,
A. Fedynitch,
T. Fujii,
K. Fujisue,
K. Fujita,
M. Fukushima,
G. Furlich,
Z. Gerber,
N. Globus,
W. Hanlon,
N. Hayashida,
H. He,
K. Hibino,
R. Higuchi,
D. Ikeda,
T. Ishii
, et al. (101 additional authors not shown)
Abstract:
We report on an observation of the difference between northern and southern skies of the ultrahigh energy cosmic ray energy spectrum with a significance of ${\sim}8σ$. We use measurements from the two largest experiments: the Telescope Array, observing the northern hemisphere, and the Pierre Auger Observatory, viewing the southern hemisphere. Since the comparison of two measurements from different observatories introduces the issue of possible systematic differences between detectors and analyses, we validate the methodology of the comparison by examining the region of the sky where the apertures of the two observatories overlap. Although the spectra differ in this region, we find that there is only a $1.8σ$ difference between the spectrum measurements when anisotropic regions are removed and a fiducial cut in the aperture is applied.
Submitted 12 June, 2024;
originally announced June 2024.
-
In-operando microwave scattering-parameter calibrated measurement of a Josephson travelling wave parametric amplifier
Authors:
S. H. Shin,
M. Stanley,
W. N. Wong,
T. Sweetnam,
A. Elarabi,
T. Lindström,
N. M. Ridler,
S. E. de Graaf
Abstract:
Superconducting travelling wave parametric amplifiers (TWPAs) are broadband near-quantum limited microwave amplifiers commonly used for qubit readout and a wide range of other applications in quantum technologies. The performance of these amplifiers depends on achieving impedance matching to minimise reflected signals. Here we apply a microwave calibration technique to extract the S-parameters of a Josephson junction based TWPA in-operando. This enables reflections occurring at the TWPA and its extended network of components to be quantified, and we find that the in-operation performance can be well described by the off-state measured S-parameters.
Submitted 5 June, 2024;
originally announced June 2024.
-
Double-sided van der Waals epitaxy of topological insulators across an atomically thin membrane
Authors:
Joon Young Park,
Young Jae Shin,
Jeacheol Shin,
Jehyun Kim,
Janghyun Jo,
Hyobin Yoo,
Danial Haei,
Chohee Hyun,
Jiyoung Yun,
Robert M. Huber,
Arijit Gupta,
Kenji Watanabe,
Takashi Taniguchi,
Wan Kyu Park,
Hyeon Suk Shin,
Miyoung Kim,
Dohun Kim,
Gyu-Chul Yi,
Philip Kim
Abstract:
Atomically thin van der Waals (vdW) films provide a novel material platform for epitaxial growth of quantum heterostructures. However, unlike the remote epitaxial growth of three-dimensional bulk crystals, the growth of two-dimensional (2D) material heterostructures across atomic layers has been limited due to the weak vdW interaction. Here, we report the double-sided epitaxy of vdW layered materials through atomic membranes. We grow vdW topological insulators (TIs) Sb$_2$Te$_3$ and Bi$_2$Se$_3$ by molecular beam epitaxy on both surfaces of atomically thin graphene or hBN, which serve as suspended 2D vdW "$\textit{substrate}$" layers. Both homo- and hetero- double-sided vdW TI tunnel junctions are fabricated, with the atomically thin hBN acting as a crystal-momentum-conserving tunnelling barrier with abrupt and epitaxial interface. By performing field-angle dependent magneto-tunnelling spectroscopy on these devices, we reveal the energy-momentum-spin resonant tunnelling of massless Dirac electrons between helical Landau levels developed in the topological surface states at the interface.
Submitted 30 May, 2024;
originally announced May 2024.
-
Condensed-space methods for nonlinear programming on GPUs
Authors:
François Pacaud,
Sungho Shin,
Alexis Montoison,
Michel Schanen,
Mihai Anitescu
Abstract:
This paper explores two condensed-space interior-point methods to efficiently solve large-scale nonlinear programs on graphics processing units (GPUs). The interior-point method solves a sequence of symmetric indefinite linear systems, or Karush-Kuhn-Tucker (KKT) systems, which become increasingly ill-conditioned as we approach the solution. Solving a KKT system with traditional sparse factorization methods involves numerical pivoting, making parallelization difficult. A solution is to condense the KKT system into a symmetric positive-definite matrix and solve it with a Cholesky factorization, stable without pivoting. Although condensed KKT systems are more prone to ill-conditioning than the original ones, they exhibit structured ill-conditioning that mitigates the loss of accuracy. This paper compares the benefits of two recent condensed-space interior-point methods, HyKKT and LiftedKKT. We implement the two methods on GPUs using MadNLP.jl, an optimization solver interfaced with the NVIDIA sparse linear solver cuDSS and with the GPU-accelerated modeler ExaModels.jl. Our experiments on the PGLIB and the COPS benchmarks reveal that GPUs can attain up to a tenfold speed increase compared to CPUs when solving large-scale instances.
Submitted 23 May, 2024;
originally announced May 2024.
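The condensation idea in the abstract above can be illustrated with a generic Schur-complement sketch on a small dense system. This is neither HyKKT nor LiftedKKT and does not use the MadNLP.jl API; the block structure, the function name `solve_condensed`, and the assumptions that `W` is symmetric positive definite and `d` is strictly positive are illustrative choices made here.

```python
import numpy as np


def solve_condensed(W, J, d, r1, r2):
    """Solve [[W, J^T], [J, -D^{-1}]] [dx; dy] = [r1; r2], with D = diag(d).

    Assumes W is symmetric positive definite and every d[i] > 0, so the
    condensed matrix K = W + J^T D J is symmetric positive definite.
    """
    JtD = J.T * d                  # J^T D: scales column i of J^T by d[i]
    K = W + JtD @ J                # condensed matrix
    rhs = r1 + JtD @ r2            # condensed right-hand side
    L = np.linalg.cholesky(K)      # K = L L^T, no pivoting involved
    # A generic solver is used for brevity; a triangular solver is the
    # natural choice in practice.
    z = np.linalg.solve(L, rhs)
    dx = np.linalg.solve(L.T, z)
    dy = d * (J @ dx - r2)         # recover the eliminated block
    return dx, dy
```

Eliminating `dy` from the second block row gives `dy = D (J dx - r2)`, and substituting into the first row yields `(W + J^T D J) dx = r1 + J^T D r2`, which is the system the sketch factorizes.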
-
Scalable Multi-Period AC Optimal Power Flow Utilizing GPUs with High Memory Capacities
Authors:
Sungho Shin,
Vishwas Rao,
Michel Schanen,
D. Adrian Maldonado,
Mihai Anitescu
Abstract:
This paper demonstrates the scalability of open-source GPU-accelerated nonlinear programming (NLP) frameworks -- ExaModels.jl and MadNLP.jl -- for solving multi-period alternating current (AC) optimal power flow (OPF) problems on GPUs with high memory capacities (e.g., NVIDIA GH200 with 480 GB of unified memory). There has been a growing interest in solving multi-period AC OPF problems, as the increasingly fluctuating electricity market requires operation planning over multiple periods. These problems, formerly deemed intractable, are now becoming technologically feasible to solve thanks to the advent of high-memory GPU hardware and accelerated NLP tools. This study evaluates the capability of these tools to tackle previously unsolvable multi-period AC OPF instances. Our numerical experiments, run on an NVIDIA GH200, demonstrate that we can solve a multi-period OPF instance with more than 10 million variables up to $10^{-4}$ precision in less than 10 minutes. These results demonstrate the efficacy of the GPU-accelerated NLP frameworks for the solution of extreme-scale multi-period OPF. We provide ExaModelsPower.jl, an open-source modeling tool for multi-period AC OPF models for GPUs.
Submitted 22 May, 2024;
originally announced May 2024.
-
Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models
Authors:
Madhu Vankadari,
Samuel Hodgson,
Sangyun Shin,
Kaichen Zhou,
Andrew Markham,
Niki Trigoni
Abstract:
Self-supervised depth estimation algorithms rely heavily on frame-warping relationships, exhibiting substantial performance degradation when applied in challenging circumstances, such as low-visibility and nighttime scenarios with varying illumination conditions. Addressing this challenge, we introduce an algorithm designed to achieve accurate self-supervised stereo depth estimation focusing on nighttime conditions. Specifically, we use pretrained visual foundation models to extract generalised features across challenging scenes and present an efficient method for matching and integrating these features from stereo frames. Moreover, to prevent pixels violating photometric consistency assumption from negatively affecting the depth predictions, we propose a novel masking approach designed to filter out such pixels. Lastly, addressing weaknesses in the evaluation of current depth estimation algorithms, we present novel evaluation metrics. Our experiments, conducted on challenging datasets including Oxford RobotCar and Multi-Spectral Stereo, demonstrate the robust improvements realized by our approach. Code is available at: https://github.com/madhubabuv/dtd
Submitted 17 May, 2024;
originally announced May 2024.
-
Revisiting Reactor Anti-Neutrino 5 MeV Bump with $^{13}$C Neutral-Current Interaction
Authors:
Pouya Bakhti,
Min-Gwa Park,
Meshkat Rajaee,
Chang Sub Shin,
Seodong Shin
Abstract:
For the first time, we systematically investigate the potential of neutrino-nucleus neutral current interactions with $^{13}$C to identify the origin of the 5 MeV bump observed in reactor anti-neutrino spectra in the inverse beta decay process. The distinctive signal is obtained from the de-excitation of $^{13}$C$^*$ into the ground state, emitting a 3.685 MeV photon, in various liquid scintillator detectors. Such an interaction predominantly occurs for reactor anti-neutrinos within the energy range coinciding with the 5 MeV bump. A detector capable of photon and electron separation at the 95% level, with thorium contamination below $5 \times 10^{-17}$ g/g, and located at a site with an overburden of a few hundred m.w.e., such as the locations of the near detectors of RENO and Daya Bay, will have great sensitivity to resolve the 5 MeV bump. In addition, we propose a novel approach to track the time evolution of reactor isotopes by analyzing our $^{13}$C signal, shedding light on the contributions from $^{235}$U or $^{239}$Pu to the observed bump. This provides an additional powerful tool both for discriminating the flux models and for testing any new physics possibilities for the 5 MeV bump at the 3$σ$ to 5$σ$ level, with much smaller systematic uncertainties, assuming 10 kt·year of data collection. Our detector requirements are realistic, aligning well with recent studies conducted for existing or forthcoming experiments.
Submitted 14 May, 2024;
originally announced May 2024.
-
Twisted MoSe2 Homobilayer Behaving as a Heterobilayer
Authors:
Arka Karmakar,
Abdullah Al-Mahboob,
Natalia Zawadzka,
Mateusz Raczyński,
Weiguang Yang,
Mehdi Arfaoui,
Gayatri,
Julia Kucharek,
Jerzy T. Sadowski,
Hyeon Suk Shin,
Adam Babiński,
Wojciech Pacuski,
Tomasz Kazimierczuk,
Maciej R Molas
Abstract:
Heterostructures (HSs) formed by the transition-metal dichalcogenides (TMDCs) materials have shown great promise in next-generation optoelectronic and photonic applications. An artificially twisted HS, allows us to manipulate the optical, and electronic properties. With this work, we introduce the understanding of the complex energy transfer (ET) process governed by the dipolar interaction in a tw…
▽ More
Heterostructures (HSs) formed by the transition-metal dichalcogenides (TMDCs) materials have shown great promise in next-generation optoelectronic and photonic applications. An artificially twisted HS, allows us to manipulate the optical, and electronic properties. With this work, we introduce the understanding of the complex energy transfer (ET) process governed by the dipolar interaction in a twisted molybdenum diselenide (MoSe2) homobilayer without any charge-blocking interlayer. We fabricated an unconventional homobilayer (i.e., HS) with a large twist angle by combining the chemical vapor deposition (CVD) and mechanical exfoliation (Exf.) techniques to fully exploit the lattice parameters mismatch and indirect/direct (CVD/Exf.) bandgap nature. This effectively weaken the charge transfer (CT) process and allows the ET process to take over the carrier recombination channels. We utilize a series of optical and electron spectroscopy techniques complementing by the density functional theory calculations, to describe a massive photoluminescence enhancement from the HS area due to an efficient ET process. Our results show that the electronically decoupled MoSe2 homobilayer is coupled by the ET process, mimicking a 'true' heterobilayer nature.
Submitted 7 June, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following
Authors:
Suyeon Shin,
Sujin Jeon,
Junghyun Kim,
Gi-Cheon Kang,
Byoung-Tak Zhang
Abstract:
Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in 3D environments. One of the primary challenges in EIF is compositional task planning, which is often addressed with supervised or in-context learning with labeled data. To this end, we introduce the Socratic Planner, the first zero-shot planning method that infers without the need for any training data. Socratic Planner first decomposes the instructions into substructural information of the task through self-questioning and answering, translating it into a high-level plan, i.e., a sequence of subgoals. Subgoals are executed sequentially, with our visually grounded re-planning mechanism adjusting plans dynamically through dense visual feedback. We also introduce an evaluation metric of high-level plans, RelaxedHLP, for a more comprehensive evaluation. Experiments demonstrate the effectiveness of the Socratic Planner, achieving competitive performance on both zero-shot and few-shot task planning in the ALFRED benchmark, particularly excelling in tasks requiring higher-dimensional inference. Additionally, precise adjustments to the plan were achieved by incorporating environmental visual information.
Submitted 21 April, 2024;
originally announced April 2024.
-
Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation
Authors:
Yeonguk Yu,
Sungho Shin,
Seunghyeok Back,
Minhwan Ko,
Sangjun Noh,
Kyoobin Lee
Abstract:
Test-time adaptation (TTA) aims to adapt a pre-trained model to a new test domain without access to source data after deployment. Existing approaches typically rely on self-training with pseudo-labels since ground-truth cannot be obtained from test data. Although the quality of pseudo-labels is important for stable and accurate long-term adaptation, it has not been previously addressed. In this work, we propose DPLOT, a simple yet effective TTA framework that consists of two components: (1) domain-specific block selection and (2) pseudo-label generation using paired-view images. Specifically, we select blocks that involve domain-specific feature extraction and train these blocks by entropy minimization. After the blocks are adjusted for the current test domain, we generate pseudo-labels by averaging given test images and corresponding flipped counterparts. By simply using flip augmentation, we prevent a decrease in the quality of the pseudo-labels, which can be caused by the domain gap resulting from strong augmentation. Our experimental results demonstrate that DPLOT outperforms previous TTA methods on the CIFAR10-C, CIFAR100-C, and ImageNet-C benchmarks, reducing error by up to 5.4%, 9.1%, and 2.9%, respectively. We also provide an extensive analysis to demonstrate the effectiveness of our framework. Code is available at https://github.com/gist-ailab/domain-specific-block-selection-and-paired-view-pseudo-labeling-for-online-TTA.
Submitted 7 May, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
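The paired-view pseudo-labeling step in the DPLOT abstract above can be illustrated with a minimal sketch. It assumes that "averaging" means averaging the class probabilities predicted for an image and its horizontally flipped copy; the abstract does not spell this out. The names `softmax`, `hflip`, `paired_view_pseudo_label`, and the toy model are hypothetical and are not taken from the authors' repository.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hflip(image):
    """Horizontally flip an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]


def paired_view_pseudo_label(model, image):
    """Average the predicted class probabilities of an image and its flip.

    `model` is any callable mapping an image to a list of logits
    (a hypothetical stand-in for the adapted network).
    """
    p_original = softmax(model(image))
    p_flipped = softmax(model(hflip(image)))
    return [(a + b) / 2 for a, b in zip(p_original, p_flipped)]


if __name__ == "__main__":
    # Toy model (assumption): logit 0 sums the left column, logit 1 the right.
    toy_model = lambda img: [sum(r[0] for r in img), sum(r[-1] for r in img)]
    print(paired_view_pseudo_label(toy_model, [[1.0, 0.0], [1.0, 0.0]]))
```

Because a flip is a weak augmentation, the two views stay close to the test distribution, which is the motivation the abstract gives for preferring it over strong augmentation.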
-
1/2$^-$ $α$ cluster resonances of $^{13}$C studied by the analytic continuation in the coupling constant
Authors:
Seungheon Shin,
Masaaki Kimura,
Bo Zhou,
Qing Zhao
Abstract:
The 1/2$^-$ resonant states in $^{13}{\rm C}$ are investigated to search for the Hoyle-analog state. In order to treat the resonance states located around the 3$α+n$ threshold, the analytic continuation in the coupling constant (ACCC) has been combined with the real-time evolution method (REM). The properties of the 1/2$^-$ resonance states, such as the radii and monopole transition probabilities, are calculated. We show that the 1/2$^-_3$ and 1/2$^-_4$ states are well-developed $α$ cluster states, and that the 1/2$^-_4$ state is a candidate for the Hoyle-analog state.
Submitted 15 April, 2024;
originally announced April 2024.
-
Surfactant-laden bubble bursting: dynamics of capillary waves and Worthington jet at large Bond number
Authors:
Paula Pico,
Lyes Kahouadji,
Seungwon Shin,
Jalel Chergui,
Damir Juric,
Omar K. Matar
Abstract:
We present a numerical study of the main sub-stages preceding aerosol formation via bursting bubbles: capillary wave propagation along the bubble, convergence at the bubble's apex, the ascent of a Worthington jet and its break-up to release liquid drops. We focus on two crucial yet overlooked aspects of the system: the presence of surface-active agents and dynamics driven by non-negligible gravitational effects, quantified by the Bond number. Our results propose, for the first time, a mechanism explaining capillary wave retardation in the presence of surfactants, involving the transition from bi- to uni-directional Marangoni stresses, which pull the interface upwards, countering the motion of the waves. We also quantitatively elucidate the variable nature of the waves' velocity with various surfactant parameters, including surfactant solubility and elasticity, a departure from the constant behaviour well-documented in clean interfaces.
Submitted 14 April, 2024;
originally announced April 2024.
-
(C$_5$H$_9$NH$_3$)$_2$CuBr$_4$: a metal-organic two-ladder quantum magnet
Authors:
J. Philippe,
F. Elson,
M. P. N. Casati,
S. Sanz,
M. Metzelaars,
O. Shliakhtun,
O. K. Forslund,
J. Lass,
T. Shiroka,
A. Linden,
D. G. Mazzone,
J. Ollivier,
S. Shin,
M. Medarde,
B. Lake,
M. Mansson,
M. Bartkowiak,
B. Normand,
P. Kögerler,
Y. Sassa,
M. Janoschek,
G. Simutis
Abstract:
Low-dimensional quantum magnets are a versatile materials platform for studying the emergent many-body physics and collective excitations that can arise even in systems with only short-range interactions. Understanding their low-temperature structure and spin Hamiltonian is key to explaining their magnetic properties, including unconventional quantum phases, phase transitions, and excited states. We study the metal-organic coordination compound (C$_5$H$_9$NH$_3$)$_2$CuBr$_4$ and its deuterated counterpart, which upon its discovery was identified as a candidate two-leg quantum ($S = 1/2$) spin ladder in the strong-leg coupling regime. By growing large single crystals and probing them with both bulk and microscopic techniques, we deduce that two previously unknown structural phase transitions take place between 136 K and 113 K. The low-temperature structure has a monoclinic unit cell giving rise to two inequivalent spin ladders. We further confirm the absence of long-range magnetic order down to 30 mK and discuss the implications of this two-ladder structure for the magnetic properties of (C$_5$H$_9$NH$_3$)$_2$CuBr$_4$.
Submitted 12 April, 2024;
originally announced April 2024.
-
Harder, Better, Faster, Stronger: Interactive Visualization for Human-Centered AI Tools
Authors:
Md Naimul Hoque,
Sungbok Shin,
Niklas Elmqvist
Abstract:
Human-centered AI (HCAI), rather than replacing the human, puts the human user in the driver's seat of so-called human-centered AI-infused tools (HCAI tools): interactive software tools that amplify, augment, empower, and enhance human performance using AI models; often novel generative or foundation AI ones. In this paper, we discuss how interactive visualization can be a key enabling technology for creating such human-centered AI tools. Visualization has already been shown to be a fundamental component in explainable AI models, and coupling this with data-driven, semantic, and unified interaction feedback loops will enable a human-centered approach to integrating AI models in the loop with human users. We present several examples of our past and current work on such HCAI tools, including for creative writing, temporal prediction, and user experience analysis. We then draw parallels between these tools to suggest common themes on how interactive visualization can support the design of future HCAI tools.
Submitted 2 April, 2024;
originally announced April 2024.
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
GPU-accelerated nonlinear model predictive control with ExaModels and MadNLP
Authors:
François Pacaud,
Sungho Shin
Abstract:
We investigate the potential of Graphics Processing Units (GPUs) to solve large-scale nonlinear model predictive control (NMPC) problems. We accelerate the solution of the constrained nonlinear programs in the NMPC algorithm using the GPU-accelerated automatic differentiation tool ExaModels with the interior-point solver MadNLP. The sparse linear systems formulated in the interior-point method are solved on the GPU using a hybrid solver combining an iterative method with a sparse Cholesky factorization, which harnesses the newly released NVIDIA cuDSS solver. Our results on the classical distillation column instance show that, despite a significant pre-processing time, the hybrid solver reduces the time per iteration by a factor of 25 for the largest instance.
Submitted 23 March, 2024;
originally announced March 2024.
-
Does AI help humans make better decisions? A methodological framework for experimental evaluation
Authors:
Eli Ben-Michael,
D. James Greiner,
Melody Huang,
Kosuke Imai,
Zhichao Jiang,
Sooahn Shin
Abstract:
The use of Artificial Intelligence (AI) based on data-driven algorithms has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions as compared to a human alone or AI alone. We introduce a new methodological framework that can be used to answer this question experimentally with no additional assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded experimental design, in which the provision of AI-generated recommendations is randomized across cases with a human making final decisions. Under this experimental design, we show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone. We apply the proposed methodology to the data from our own randomized controlled trial of a pretrial risk assessment instrument. We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail. Our analysis also shows that AI-alone decisions generally perform worse than human decisions with or without AI assistance. Finally, AI recommendations tend to impose cash bail on non-white arrestees more often than necessary when compared to white arrestees.
Submitted 17 March, 2024;
originally announced March 2024.
-
Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain
Authors:
Eugene Jang,
Jian Cui,
Dayeon Yim,
Youngjin Jin,
Jin-Woo Chung,
Seungwon Shin,
Yongjae Lee
Abstract:
Cybersecurity information is often technically complex and relayed through unstructured text, making automation of cyber threat intelligence highly challenging. For such text domains that involve high levels of expertise, pretraining on in-domain corpora has been a popular method for language models to obtain domain expertise. However, cybersecurity texts often contain non-linguistic elements (such as URLs and hash values) that could be unsuitable for the established pretraining methodologies. Previous work in other domains has removed or filtered such text as noise, but the effectiveness of these methods has not been investigated, especially in the cybersecurity domain. We propose different pretraining methodologies and evaluate their effectiveness through downstream tasks and probing tasks. Our proposed strategy (selective MLM and jointly training NLE token classification) outperforms the commonly taken approach of replacing non-linguistic elements (NLEs). We use our domain-customized methodology to train CyBERTuned, a cybersecurity domain language model that outperforms other cybersecurity PLMs on most tasks.
Submitted 2 April, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK)
Authors:
Jeongeun Park,
Taemoon Jeong,
Hyeonseong Kim,
Taehyun Byun,
Seungyoon Shin,
Keunjun Choi,
Jaewoon Kwon,
Taeyoon Lee,
Matthew Pan,
Sungjoon Choi
Abstract:
This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas. Built upon the foundations of persona-driven dialog agents, this work extends the agent application to the physical realm, employing robots to provide a more immersive and interactive experience. The proposed system, named the Masquerading Animated Social Kinematics (MASK), leverages an anthropomorphic robot which interacts with guests using non-verbal interactions, including facial expressions and gestures. A behavior generation system based upon a finite-state machine structure effectively conditions robotic behavior to convey distinct personas. The MASK framework integrates a perception engine, a behavior selection engine, and a comprehensive action library to enable real-time, dynamic interactions with minimal human intervention in behavior design. Throughout the user subject studies, we examined whether the users could recognize the intended character in film-character-based persona conditions. We conclude by discussing the role of personas in interactive agents and the factors to consider for creating an engaging user experience.
Submitted 15 March, 2024;
originally announced March 2024.
-
Unknown Domain Inconsistency Minimization for Domain Generalization
Authors:
Seungjae Shin,
HeeSun Bae,
Byeonghu Na,
Yoon-Yeong Kim,
Il-Chul Moon
Abstract:
The objective of domain generalization (DG) is to enhance the transferability of the model learned from a source domain to unobserved domains. To prevent overfitting to a specific domain, Sharpness-Aware Minimization (SAM) reduces the source domain's loss sharpness. Although SAM variants have delivered significant improvements in DG, we highlight that there is still potential for improvement in generalizing to unknown domains through exploration of the data space. This paper introduces an objective rooted in both parameter and data perturbed regions for domain generalization, coined Unknown Domain Inconsistency Minimization (UDIM). UDIM reduces the loss landscape inconsistency between the source domain and unknown domains. As unknown domains are inaccessible, these domains are empirically crafted by perturbing instances from the source domain dataset. In particular, by aligning the loss landscape acquired in the source domain to the loss landscape of perturbed domains, we expect to achieve generalization grounded on these flat minima for the unknown domains. Theoretically, we validate that merging SAM optimization with the UDIM objective establishes an upper bound for the true objective of the DG task. Empirically, UDIM consistently outperforms SAM variants across multiple DG benchmark datasets. Notably, UDIM shows statistically significant improvements in scenarios with more restrictive domain information, underscoring UDIM's generalization capability in unseen domains. Our code is available at https://github.com/SJShin-AI/UDIM.
Submitted 12 March, 2024;
originally announced March 2024.
-
Performance of a modular ton-scale pixel-readout liquid argon time projection chamber
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade, et al. (1340 additional authors not shown)
Abstract:
The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.
Submitted 5 March, 2024;
originally announced March 2024.
-
Dirichlet-based Per-Sample Weighting by Transition Matrix for Noisy Label Learning
Authors:
HeeSun Bae,
Seungjae Shin,
Byeonghu Na,
Il-Chul Moon
Abstract:
For learning with noisy labels, the transition matrix, which explicitly models the relation between the noisy label distribution and the clean label distribution, has been utilized to achieve the statistical consistency of either the classifier or the risk. Previous research has focused more on how to estimate this transition matrix well than on how to utilize it. We argue that good utilization of the transition matrix is crucial and suggest a new utilization method based on resampling, coined RENT. Specifically, we first demonstrate that current utilizations can have potential limitations for implementation. As an extension to reweighting, we suggest the Dirichlet distribution-based per-sample Weight Sampling (DWS) framework, and compare reweighting and resampling under the DWS framework. With the analyses from DWS, we propose RENT, a REsampling method with Noise Transition matrix. Empirically, RENT consistently outperforms existing transition matrix utilization methods, including reweighting, on various benchmark datasets. Our code is available at https://github.com/BaeHeeSun/RENT.
Submitted 5 March, 2024;
originally announced March 2024.
-
Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
Authors:
Taeheon Kim,
Sebin Shin,
Youngjoon Yu,
Hak Gu Kim,
Yong Man Ro
Abstract:
RGBT multispectral pedestrian detection has emerged as a promising solution for safety-critical applications that require day/night operations. However, the modality bias problem remains unsolved as multispectral pedestrian detectors learn the statistical bias in datasets. Specifically, datasets in multispectral pedestrian detection are mainly distributed between ROTO (day) and RXTO (night) data; the majority of the pedestrian labels statistically co-occur with their thermal features. As a result, multispectral pedestrian detectors show poor generalization ability on examples beyond this statistical correlation, such as ROTX data. To address this problem, we propose a novel Causal Mode Multiplexer (CMM) framework that effectively learns the causalities between multispectral inputs and predictions. Moreover, we construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection. ROTX-MP mainly includes ROTX examples not present in previous datasets. Extensive experiments demonstrate that our proposed CMM framework generalizes well on existing datasets (KAIST, CVC-14, FLIR) and the new ROTX-MP. We will release our new dataset to the public for future research.
Submitted 5 April, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
-
Distributed Sequential Quadratic Programming with Overlapping Graph Decomposition and Exact Augmented Lagrangian
Authors:
Runxin Ni,
Sen Na,
Sungho Shin,
Mihai Anitescu
Abstract:
In this paper, we address the challenge of solving large-scale graph-structured nonlinear programs (gsNLPs) in a scalable manner. GsNLPs are problems in which the objective and constraint functions are associated with nodes on a graph and depend on the variables of adjacent nodes. This graph-structured formulation encompasses various specific instances, such as dynamic optimization, PDE-constrained optimization, multistage stochastic optimization, and general network optimization. By leveraging the sequential quadratic programming (SQP) framework, we propose a globally convergent overlapping graph decomposition method to solve large-scale gsNLPs under standard mild regularity conditions on the graph topology. In each iteration, we perform an overlapping graph decomposition to compute an approximate Newton direction in a parallel environment. Then, we select a suitable stepsize and update the primal-dual iterate by performing a backtracking line search on an exact augmented Lagrangian merit function. Built on the exponential decay of sensitivity of gsNLPs, we show that the approximate Newton direction is a descent direction of the augmented Lagrangian, which leads to global convergence with a local linear convergence rate. In particular, global convergence is achieved for sufficiently large overlaps, and the local linear convergence rate improves exponentially in terms of the overlap size. Our results match existing state-of-the-art guarantees established for dynamic programs (which simply correspond to linear graphs). We validate the theory on a semilinear elliptic PDE-constrained problem.
Submitted 11 June, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
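The stepsize selection mentioned in the abstract above (a backtracking line search on a merit function) can be sketched generically. This is a textbook Armijo backtracking rule under stated assumptions, not the authors' method: the merit function here is an arbitrary callable standing in for the exact augmented Lagrangian, and `backtracking_line_search`, `shrink`, and `c1` are hypothetical names and parameters.

```python
def backtracking_line_search(merit, x, direction, slope,
                             alpha0=1.0, shrink=0.5, c1=1e-4, max_iter=50):
    """Return a stepsize satisfying the Armijo sufficient-decrease condition.

    merit     -- callable mapping a point (list of floats) to a scalar
    x         -- current iterate
    direction -- search direction, assumed to be a descent direction
    slope     -- directional derivative of `merit` at x along `direction`
                 (must be negative for a descent direction)
    """
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    base_value = merit(x)
    alpha = alpha0
    for _ in range(max_iter):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        # Armijo test: accept when the merit decreases enough.
        if merit(trial) <= base_value + c1 * alpha * slope:
            return alpha
        alpha *= shrink  # otherwise shrink the step and try again
    raise RuntimeError("no acceptable stepsize found")


if __name__ == "__main__":
    # Toy merit (assumption): f(x) = x^2, with an overshooting direction.
    merit = lambda point: point[0] ** 2
    step = backtracking_line_search(merit, [1.0], [-4.0], slope=-8.0)
    print(step)
```

In the setting of the abstract, the search direction would be the approximate Newton direction computed from the overlapping graph decomposition; the sketch only shows how a stepsize is accepted once such a direction is available.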
-
Charge orders with distinct magnetic response in a prototypical kagome superconductor LaRu$_{3}$Si$_{2}$
Authors:
C. Mielke III,
V. Sazgari,
I. Plokhikh,
S. Shin,
H. Nakamura,
J. N. Graham,
J. Küspert,
I. Bialo,
G. Garbarino,
D. Das,
M. Medarde,
M. Bartkowiak,
S. S. Islam,
R. Khasanov,
H. Luetkens,
M. Z. Hasan,
E. Pomjakushina,
J. -X. Yin,
M. H. Fischer,
J. Chang,
T. Neupert,
S. Nakatsuji,
B. Wehinger,
D. J. Gawryluk,
Z. Guguchia
Abstract:
The kagome lattice has emerged as a promising platform for hosting unconventional chiral charge order at high temperatures. Notably, in LaRu$_{3}$Si$_{2}$, a room-temperature charge-ordered state with a propagation vector of ($\frac{1}{4}$,~0,~0) has been recently identified. However, understanding the interplay between this charge order and superconductivity, particularly with respect to time-reversal-symmetry breaking, remains elusive. In this study, we employ single crystal X-ray diffraction, magnetotransport, and muon-spin rotation experiments to investigate the charge order and its electronic and magnetic responses in LaRu$_{3}$Si$_{2}$ across a wide temperature range down to the superconducting state. Our findings reveal the emergence of a charge order with a propagation vector of ($\frac{1}{6}$,~0,~0) below $T_{\rm CO,2}$ ${\simeq}$ 80 K, coexisting with the previously identified room-temperature primary charge order ($\frac{1}{4}$,~0,~0). The primary charge-ordered state exhibits zero magnetoresistance. In contrast, the appearance of the secondary charge order at $T_{\rm CO,2}$ is accompanied by a notable magnetoresistance response and a pronounced temperature-dependent Hall effect, which experiences a sign reversal, switching from positive to negative below $T^{*}$ ${\simeq}$ 35 K. Intriguingly, we observe an enhancement in the internal field width sensed by the muon ensemble below $T^{*}$ ${\simeq}$ 35 K. Moreover, the muon spin relaxation rate exhibits a substantial increase upon the application of an external magnetic field below $T_{\rm CO,2}$ ${\simeq}$ 80 K. Our results highlight the coexistence of two distinct types of charge order in LaRu$_{3}$Si$_{2}$ within the correlated kagome lattice, namely a non-magnetic charge order ($\frac{1}{4}$,~0,~0) below $T_{\rm CO,1}$ ${\simeq}$ 400 K and a time-reversal-symmetry-breaking charge order below $T_{\rm CO,2}$.
Submitted 28 February, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
Dueling Over Dessert, Mastering the Art of Repeated Cake Cutting
Authors:
Simina Brânzei,
MohammadTaghi Hajiaghayi,
Reed Phillips,
Suho Shin,
Kun Wang
Abstract:
We consider the setting of repeated fair division between two players, denoted Alice and Bob, with private valuations over a cake. In each round, a new cake arrives, which is identical to the ones in previous rounds. Alice cuts the cake at a point of her choice, while Bob chooses the left piece or the right piece, leaving the remainder for Alice. We consider two versions: sequential, where Bob observes Alice's cut point before choosing left/right, and simultaneous, where he only observes her cut point after making his choice. The simultaneous version was first considered by Aumann and Maschler (1995).
We observe that if Bob is almost myopic and chooses his favorite piece too often, then he can be systematically exploited by Alice through a strategy akin to a binary search. This strategy allows Alice to approximate Bob's preferences with increasing precision, thereby securing a disproportionate share of the resource over time.
We analyze the limits of how much a player can exploit the other one and show that fair utility profiles are in fact achievable. Specifically, the players can enforce the equitable utility profile of $(1/2, 1/2)$ in the limit on every trajectory of play, by keeping the other player's utility to approximately $1/2$ on average while guaranteeing they themselves get at least approximately $1/2$ on average. We show this theorem using a connection with Blackwell approachability.
Finally, we analyze a natural dynamic known as fictitious play, where players best respond to the empirical distribution of the other player. We show that fictitious play converges to the equitable utility profile of $(1/2, 1/2)$ at a rate of $O(1/\sqrt{T})$.
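The binary-search exploitation mentioned in the abstract can be sketched as follows. This is a minimal illustration for the sequential version, assuming a fully myopic Bob who always takes the piece he values more; the function names and the example valuation $F(x) = x^2$ are hypothetical and not taken from the paper.

```python
def myopic_bob_picks_left(cut, bob_cdf):
    """Bob (assumed fully myopic) takes the left piece iff it is worth at least 1/2 to him."""
    return bob_cdf(cut) >= 0.5


def alice_binary_search(bob_cdf, rounds):
    """Narrow down the point where Bob values both pieces equally.

    Each round Alice cuts at the middle of her current interval and
    observes Bob's choice; the interval halves every round.
    """
    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        cut = (lo + hi) / 2
        if myopic_bob_picks_left(cut, bob_cdf):
            hi = cut   # Bob's midpoint is at or to the left of the cut
        else:
            lo = cut   # Bob's midpoint is to the right of the cut
    return (lo + hi) / 2


# Hypothetical valuation: Bob's value for [0, x] is x**2, so his midpoint is sqrt(1/2).
estimate = alice_binary_search(lambda x: x * x, rounds=20)
```

After `rounds` cuts the interval containing Bob's midpoint has length `2 ** -rounds`, which is the sense in which Alice approximates his preferences with increasing precision.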
Submitted 18 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
Authors:
Sangwoo Shin,
Daehee Lee,
Minjong Yoo,
Woo Kyung Kim,
Honguk Woo
Abstract:
One-shot imitation is to learn a new task from a single demonstration, yet it is a challenging problem to adopt it for complex tasks with the high domain diversity inherent in a non-stationary environment. To tackle the problem, we explore the compositionality of complex tasks, and present a novel skill-based imitation learning framework enabling one-shot imitation and zero-shot adaptation; from a single demonstration for a complex unseen task, a semantic skill sequence is inferred and then each skill in the sequence is converted into an action sequence optimized for environmental hidden dynamics that can vary over time. Specifically, we leverage a vision-language model to learn a semantic skill set from offline video datasets, where each skill is represented on the vision-language embedding space, and adapt meta-learning with dynamics inference to enable zero-shot skill adaptation. We evaluate our framework with various one-shot imitation scenarios for extended multi-stage Meta-world tasks, showing its superiority in learning complex tasks, generalizing to dynamics changes, and extending to different demonstration conditions and modalities, compared to other baselines.
Submitted 13 February, 2024;
originally announced February 2024.
-
SemTra: A Semantic Skill Translator for Cross-Domain Zero-Shot Policy Adaptation
Authors:
Sangwoo Shin,
Minjong Yoo,
Jeongwoo Lee,
Honguk Woo
Abstract:
This work explores the zero-shot adaptation capability of semantic skills, semantically interpretable experts' behavior patterns, in cross-domain settings, where a user input in interleaved multi-modal snippets can prompt a new long-horizon task for different domains. In these cross-domain settings, we present a semantic skill translator framework SemTra which utilizes a set of multi-modal models to extract skills from the snippets, and leverages the reasoning capabilities of a pretrained language model to adapt these extracted skills to the target domain. The framework employs a two-level hierarchy for adaptation: task adaptation and skill adaptation. During task adaptation, seq-to-seq translation by the language model transforms the extracted skills into a semantic skill sequence, which is tailored to fit the cross-domain contexts. Skill adaptation focuses on optimizing each semantic skill for the target domain context, through parametric instantiations that are facilitated by language prompting and contrastive learning-based context inferences. This hierarchical adaptation empowers the framework to not only infer a complex task specification in one-shot from the interleaved multi-modal snippets, but also adapt it to new domains with zero-shot learning abilities. We evaluate our framework with Meta-World, Franka Kitchen, RLBench, and CARLA environments. The results clarify the framework's superiority in performing long-horizon tasks and adapting to different domains, showing its broad applicability in practical use cases, such as cognitive robots interpreting abstract instructions and autonomous vehicles operating under varied configurations.
Submitted 12 February, 2024;
originally announced February 2024.
-
Doping Liquid Argon with Xenon in ProtoDUNE Single-Phase: Effects on Scintillation Light
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
H. Amar Es-sghir,
P. Amedo,
J. Anderson,
D. A. Andrade,
C. Andreopoulos
, et al. (1300 additional authors not shown)
Abstract:
Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUNE-SP) at CERN, featuring 770 t of total liquid argon mass with 410 t of fiducial mass. The goal of the run was to measure the light and charge response of the detector to the addition of xenon, up to a concentration of 18.8 ppm. The main purpose was to test the possibility for reduction of non-uniformities in light collection, caused by deployment of photon detectors only within the anode planes. Light collection was analysed as a function of the xenon concentration, by using the pre-existing photon detection system (PDS) of ProtoDUNE-SP and an additional smaller set-up installed specifically for this run. In this paper we first summarize our current understanding of the argon-xenon energy transfer process and the impact of the presence of nitrogen in argon with and without xenon dopant. We then describe the key elements of ProtoDUNE-SP and the injection method deployed. Two dedicated photon detectors were able to collect the light produced by xenon and the total light. The ratio of these components was measured to be about 0.65 as 18.8 ppm of xenon were injected. We performed studies of the collection efficiency as a function of the distance between tracks and light detectors, demonstrating enhanced uniformity of response for the anode-mounted PDS. We also show that xenon doping can substantially recover light losses due to contamination of the liquid argon by nitrogen.
Submitted 9 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Amorphous Boron Nitride as a Diffusion Barrier to Cu Atoms
Authors:
Onurcan Kaya,
Hyeongjoon Kim,
Byeongkyu Kim,
Luigi Colombo,
Hyeon-Jin Shin,
Ivan Cole,
Hyeon Suk Shin,
Stephan Roche
Abstract:
This study focuses on amorphous boron nitride ($\rm α$-BN) as a novel diffusion barrier for advanced semiconductor technology, particularly addressing the critical challenge of copper diffusion in back-end-of-line (BEOL) interconnects. Owing to its ultralow dielectric constant and robust barrier properties, $\rm α$-BN is examined as an alternative to conventional low-k dielectrics. The investigation primarily employs theoretical modeling, using a Gaussian Approximation Potential, to simulate and understand the atomic-level interactions and barrier mechanisms of $\rm α$-BN. This machine learning-based approach allows for realistic simulations of its amorphous structure, enabling the exploration of the impact of different film morphologies on barrier efficacy. Complementing the theoretical study, experimental analyses are conducted on Plasma-Enhanced Chemical Vapor Deposition (PECVD) grown $\rm α$-BN films, evaluating their effectiveness in preventing copper diffusion in silicon-based substrates. The results from both the theoretical and experimental investigations highlight the potential of $\rm α$-BN as a highly effective diffusion barrier, suitable for integration in nanoelectronics. This research not only proposes $\rm α$-BN as a promising candidate for BEOL interconnects but also demonstrates the synergy of advanced computational models and experimental methods in material innovation for semiconductor applications.
Submitted 2 February, 2024;
originally announced February 2024.
-
Towards Generating Executable Metamorphic Relations Using Large Language Models
Authors:
Seung Yeob Shin,
Fabrizio Pastore,
Domenico Bianculli,
Alexandra Baicoianu
Abstract:
Metamorphic testing (MT) has proven to be a successful solution to automating testing and addressing the oracle problem. However, it entails manually deriving metamorphic relations (MRs) and converting them into an executable form; these steps are time-consuming and may prevent the adoption of MT. In this paper, we propose an approach for automatically deriving executable MRs (EMRs) from requirements using large language models (LLMs). Instead of merely asking the LLM to produce EMRs, our approach relies on a few-shot prompting strategy to instruct the LLM to perform activities in the MT process, by providing requirements and API specifications, as one would do with software engineers. To assess the feasibility of our approach, we conducted a questionnaire-based survey in collaboration with Siemens Industry Software, a worldwide leader in providing industry software and services, focusing on four of their software applications. Additionally, we evaluated the accuracy of the generated EMRs for a Web application. The outcomes of our study are highly promising, as they demonstrate the capability of our approach to generate MRs and EMRs that are both comprehensible and pertinent for testing purposes.
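To make the notion of an executable metamorphic relation concrete, here is a minimal hand-written sketch (not an output of the approach described in the abstract). It assumes a hypothetical system under test, `sort_records`, and checks the well-known relation that permuting the input must not change the sorted output, so no expected-output oracle is needed.

```python
import random


def sort_records(records):
    """Hypothetical system under test: returns the records in ascending order."""
    return sorted(records)


def mr_permutation_invariance(sut, source_input, rng):
    """Metamorphic relation: sut(permute(x)) must equal sut(x)."""
    follow_up_input = list(source_input)
    rng.shuffle(follow_up_input)  # derive the follow-up input from the source input
    return sut(source_input) == sut(follow_up_input)


rng = random.Random(0)  # fixed seed so the run is reproducible
holds = all(
    mr_permutation_invariance(
        sort_records, [rng.randint(-50, 50) for _ in range(10)], rng
    )
    for _ in range(100)
)
```

The relation only compares two executions of the same program against each other, which is how metamorphic testing sidesteps the oracle problem.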
Submitted 7 June, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
Dynamical Generation of the Baryon Asymmetry from a Scale Hierarchy
Authors:
Jae Hyeok Chang,
Kwang Sik Jeong,
Chang Hyeon Lee,
Chang Sub Shin
Abstract:
We propose a novel baryogenesis scenario where the baryon asymmetry originates directly from a hierarchy between two fundamental mass scales: the electroweak scale and the Planck scale. Our model is based on the neutrino-portal Affleck-Dine (AD) mechanism, which generates the asymmetry of the AD sector during the radiation-dominated era and subsequently transfers it to the baryon number before the electroweak phase transition. The observed baryon asymmetry is then a natural outcome of this scenario. The model is testable as it predicts the existence of a Majoron with a keV mass and an electroweak scale decay constant. The impact of the relic Majoron on $ΔN_{\rm eff}$ can be measured through near-future CMB observations.
Submitted 5 February, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
Deep Learning in Physical Layer: Review on Data Driven End-to-End Communication Systems and their Enabling Semantic Applications
Authors:
Nazmul Islam,
Seokjoo Shin
Abstract:
Deep learning (DL) has revolutionized wireless communication systems by introducing data-driven end-to-end (E2E) learning, where the physical layer (PHY) is transformed into DL architectures to achieve peak optimization. Leveraging DL for E2E optimization in PHY significantly enhances its adaptability and performance in complex wireless environments, meeting the demands of advanced network systems such as 5G and beyond. Furthermore, this evolution of data-driven PHY optimization has also enabled advanced semantic applications across various modalities, including text, image, audio, video, and multimodal transmissions. These applications elevate communication from bit-level to semantic-level intelligence, making it capable of discerning context and intent. Although the PHY, as a DL architecture, plays a crucial role in enabling semantic communication (SemCom) systems, comprehensive studies that integrate both E2E communication and SemCom systems remain scarce. This highlights the novelty and potential of these integrative fields, marking them as a promising research domain. Therefore, this article provides a comprehensive review of the emerging field of data-driven PHY for E2E communication systems, emphasizing their role in enabling semantic applications across various modalities. It also identifies key challenges and potential research directions, serving as a crucial guide for future advancements in DL for E2E communication and SemCom systems.
Submitted 8 July, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
New Beam Dynamics Code for Cyclotron Analysis
Authors:
G-H. Kim,
H-J. Cho,
B-H. Oh,
G-R. Hahn,
M. Chung,
S. Park,
S. Shin
Abstract:
This paper describes beam dynamics simulation with the transfer matrix method for cyclotrons. Starting from a description of the equation of motion in the cyclotron, lattice functions were determined with the transfer matrix method, and the solutions for the 2nd-order nonlinear Hamiltonian were introduced and used in phase-space particle tracking. Based on this description of beam dynamics in the cyclotron, a simulation code was also developed for cyclotron design.
Submitted 19 January, 2024;
originally announced January 2024.
-
Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt Learning with Data-Dependent Prior
Authors:
Youngjae Cho,
HeeSun Bae,
Seungjae Shin,
Yeo Dong Youn,
Weonyoung Joo,
Il-Chul Moon
Abstract:
Recent Vision-Language Pretrained (VLP) models have become the backbone for many downstream tasks, but they are utilized as frozen models without learning. Prompt learning is a method to improve the pre-trained VLP model by adding a learnable context vector to the inputs of the text encoder. In a few-shot learning scenario of the downstream task, MLE training can lead the context vector to over-fit dominant image features in the training data. This overfitting can potentially harm the generalization ability, especially in the presence of a distribution shift between the training and test datasets. This paper presents a Bayesian framework for prompt learning, which could alleviate the overfitting issues in few-shot learning applications and increase the adaptability of prompts on unseen instances. Specifically, modeling a data-dependent prior enhances the adaptability of text features for both seen and unseen image features without a performance trade-off between them. Based on the Bayesian framework, we utilize the Wasserstein Gradient Flow in the estimation of our target posterior distribution, which enables our prompt to be flexible in capturing the complex modes of image features. We demonstrate the effectiveness of our method on benchmark datasets for several experiments by showing statistically significant improvements in performance compared to existing methods. The code is available at https://github.com/youngjae-cho/APP.
Submitted 9 January, 2024;
originally announced January 2024.
-
Limitations of Data-Driven Spectral Reconstruction -- Optics-Aware Analysis and Mitigation
Authors:
Qiang Fu,
Matheus Souza,
Eunsue Choi,
Suhyun Shin,
Seung-Hwan Baek,
Wolfgang Heidrich
Abstract:
Hyperspectral imaging empowers machine vision systems with the distinct capability of identifying materials through recording their spectral signatures. Recent efforts in data-driven spectral reconstruction aim at extracting spectral information from RGB images captured by cost-effective RGB cameras, instead of dedicated hardware.
In this paper we systematically analyze the performance of such methods, evaluating both the practical limitations with respect to current datasets and overfitting, as well as fundamental limitations with respect to the nature of the information encoded in the RGB images, and the dependency of this information on the optical system of the camera.
We find that the current models are not robust under slight variations, e.g., in noise level or compression of the RGB file. Without modeling underrepresented spectral content, existing datasets and the models trained on them are limited in their ability to cope with challenging metameric colors. To mitigate this issue, we propose to exploit the combination of metameric data augmentation and optical lens aberrations to improve the encoding of the metameric information into the RGB image, which paves the road towards higher-performing spectral imaging and reconstruction approaches.
Submitted 2 April, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
A least distance estimator for a multivariate regression model using deep neural networks
Authors:
Jungmin Shin,
Seung Jun Shin,
Sungwan Bang
Abstract:
We propose a deep neural network (DNN) based least distance (LD) estimator (DNN-LD) for a multivariate regression problem, addressing the limitations of the conventional methods. Due to the flexibility of a DNN structure, both linear and nonlinear conditional mean functions can be easily modeled, and a multivariate regression model can be realized by simply adding extra nodes at the output layer. The proposed method is more efficient in capturing the dependency structure among responses than the least squares loss, and robust to outliers. In addition, we consider $L_1$-type penalization for variable selection, crucial in analyzing high-dimensional data. Namely, we propose what we call (A)GDNN-LD estimator that enjoys variable selection and model estimation simultaneously, by applying the (adaptive) group Lasso penalty to weight parameters in the DNN structure. For the computation, we propose a quadratic smoothing approximation method to facilitate optimizing the non-smooth objective function based on the least distance loss. The simulation studies and a real data analysis demonstrate the promising performance of the proposed method.
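The difference between the least squares loss and the least distance loss can be sketched in a few lines. The LD loss sums the unsquared Euclidean norms of the residual vectors, so each multivariate response is treated as a whole. The smoothed variant below uses a Huber-style quadratic patch near zero; this particular form and the parameter name `delta` are assumptions for illustration, since the abstract does not state the exact smoothing used.

```python
import math


def residual_norm(y, y_hat):
    """Euclidean norm of one multivariate residual vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))


def least_squares_loss(Y, Y_hat):
    """Sum of squared residual norms."""
    return sum(residual_norm(y, yh) ** 2 for y, yh in zip(Y, Y_hat))


def least_distance_loss(Y, Y_hat):
    """Sum of (unsquared) residual norms; non-smooth where a residual is zero."""
    return sum(residual_norm(y, yh) for y, yh in zip(Y, Y_hat))


def smoothed_ld_loss(Y, Y_hat, delta=1e-2):
    """Assumed smoothing: use r^2 / (2 * delta) + delta / 2 instead of r when r < delta."""
    total = 0.0
    for y, yh in zip(Y, Y_hat):
        r = residual_norm(y, yh)
        total += r if r >= delta else r * r / (2 * delta) + delta / 2
    return total


# Toy data (hypothetical): one residual of norm 5 and one exact fit.
Y = [[3.0, 4.0], [0.0, 0.0]]
Y_hat = [[0.0, 0.0], [0.0, 0.0]]
ld = least_distance_loss(Y, Y_hat)
ls = least_squares_loss(Y, Y_hat)
```

A large residual enters the LD loss linearly but the least squares loss quadratically, which is the usual intuition for the robustness to outliers claimed in the abstract.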
Submitted 5 January, 2024;
originally announced January 2024.
-
Hunting for Hypercharge Anapole Dark Matter in All Spin Scenarios
Authors:
Seong Youl Choi,
Jaehoon Jeong,
Dong Woo Kang,
Seodong Shin
Abstract:
We conduct a combined analysis to investigate dark matter (DM) with hypercharge anapole moments, focusing on scenarios where Majorana DM particles with spin 1/2, 1, 3/2, and 2 interact exclusively with Standard Model particles through U(1)$_{Y}$ hypercharge anapole terms for the first time. For completeness, we construct general effective U(1) gauge-invariant three-point vertices. These enable the generation of hypercharge gauge-invariant interaction vertices for both a virtual photon $γ$ and a virtual $Z$ boson with two identical massive Majorana particles of any non-zero spin $s$, after the spontaneous breaking of electroweak gauge symmetry. For complementarity, we adopt effective operators tailored to each dark matter spin allowing crossing symmetry. We calculate the relic abundance, analyze current constraints and future sensitivities from dark matter direct detection and collider experiments, and apply the conceptual naive perturbativity bound. Our estimations based on a generalized vertex calculation demonstrate that the scenario with a higher-spin DM is more stringently constrained than a lower-spin DM, primarily due to the reduced annihilation cross-section and/or the enhanced rate of LHC mono-jet events. As a remarkable outcome, the spin-2 anapole DM scenario is almost entirely excluded, while the high-luminosity LHC exhibits high sensitivities in probing spin-1 and 3/2 scenarios, except for a tiny parameter range of DM mass around 1 TeV. A significant portion of the remaining parameter space in the spin-1/2 DM scenario can be explored through upcoming Xenon experiments, with more than 20 ton-year exposure equivalent to approximately 5 years of running the XENONnT experiment.
Submitted 27 March, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Taming the Beast: Fully Automated Unit Testing with Coyote C++
Authors:
Sanghoon Rho,
Philipp Martens,
Seungcheol Shin,
Yeoneo Kim
Abstract:
In this paper, we present Coyote C++, a fully automated white-box unit testing tool for C and C++. Whereas existing tools have struggled to realize unit test generation for C++, Coyote C++ is able to produce high coverage results from unit test generation at a testing speed of over 10,000 statements per hour. This impressive feat is made possible by the combination of a powerful concolic execution engine with sophisticated automated test harness generation. Additionally, the GUI of Coyote C++ displays detailed code coverage visualizations and provides various configuration features for users seeking to manually optimize their coverage results. Combining potent one-click automated testing with rich support for manual tweaking, Coyote C++ is the first automated testing tool that is practical enough to make automated testing of C++ code truly viable in industrial applications.
Submitted 4 January, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
Constraining MeV to 10 GeV majoron by Big Bang Nucleosynthesis
Authors:
Sanghyeon Chang,
Sougata Ganguly,
Tae Hyun Jung,
Tae-Sun Park,
Chang Sub Shin
Abstract:
We estimate the Big Bang nucleosynthesis (BBN) constraint on the majoron in the mass range between $1\,{\rm MeV}$ and $10\,{\rm GeV}$, which dominantly decays into the standard model neutrinos. When the majoron lifetime is shorter than $1\,{\rm sec}$, the injected neutrinos mainly heat up the background plasma, which alters the relation between the photon temperature and the background neutrino temperature. For a lifetime longer than $1\,{\rm sec}$, most of the injected neutrinos directly contribute to the proton-to-neutron conversion. In both cases, deuterium and helium abundances are enhanced, while the constraint from the deuterium is stronger than that from the helium. The $^7{\rm Li}$ abundance decreases as a consequence of additional neutrons, but the parameter range that fits the observed $^7{\rm Li}$ abundance is excluded by the deuterium constraint. We also estimate other cosmological constraints and compare them with the BBN bound.
Submitted 1 July, 2024; v1 submitted 1 January, 2024;
originally announced January 2024.
-
Replication-proof Bandit Mechanism Design
Authors:
Seyed Esmaeili,
MohammadTaghi Hajiaghayi,
Suho Shin
Abstract:
We study the problem of designing replication-proof bandit mechanisms when agents strategically register or replicate their own arms to maximize their payoff. We consider Bayesian agents who are unaware of the ex-post realization of their own arms' mean rewards; this is the first study of a Bayesian extension of Shin et al. (2022). This extension presents significant challenges in analyzing equilibrium, in contrast to the fully-informed setting of Shin et al. (2022), under which the problem simply reduces to a case where each agent has only a single arm. With Bayesian agents, even in a single-agent setting, analyzing the replication-proofness of an algorithm becomes complicated. Remarkably, we first show that the algorithm proposed by Shin et al. (2022), called H-UCB, is no longer replication-proof for any exploration parameters. Then, we provide sufficient and necessary conditions for an algorithm to be replication-proof in the single-agent setting. These results center around several analytical results comparing the expected regret of multiple bandit instances, which might be of independent interest. We further prove that the exploration-then-commit (ETC) algorithm satisfies these properties, whereas UCB does not, which in fact leads to its failure to be replication-proof. We extend this result to the multi-agent setting, and provide a replication-proof algorithm for any problem instance. The proof mainly relies on the single-agent result, as well as some structural properties of ETC and the novel introduction of a restarting round, which largely simplifies the analysis while keeping the regret unchanged (up to a polylogarithmic factor). We conclude by proving a sublinear regret upper bound, which matches that of H-UCB.
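For readers unfamiliar with exploration-then-commit, the following is a minimal sketch of the textbook single-agent version: pull every arm a fixed number of times, then commit to the empirically best arm for the remaining rounds. It does not include the paper's replication-proof construction or its restarting round; the function and parameter names are illustrative assumptions.

```python
def explore_then_commit(pull, n_arms, explore_per_arm, horizon):
    """Textbook ETC. `pull(arm)` returns a reward; returns (committed_arm, total_reward)."""
    if explore_per_arm < 1 or n_arms * explore_per_arm > horizon:
        raise ValueError("need at least one exploration pull per arm within the horizon")
    sums = [0.0] * n_arms
    total = 0.0
    # Exploration phase: round-robin over the arms.
    for _ in range(explore_per_arm):
        for arm in range(n_arms):
            reward = pull(arm)
            sums[arm] += reward
            total += reward
    # Commit phase: play the arm with the highest empirical mean.
    committed = max(range(n_arms), key=lambda a: sums[a] / explore_per_arm)
    for _ in range(horizon - n_arms * explore_per_arm):
        total += pull(committed)
    return committed, total


# Deterministic toy rewards (hypothetical) so the outcome is easy to check by hand.
means = [0.2, 0.8, 0.5]
arm, reward = explore_then_commit(
    lambda a: means[a], n_arms=3, explore_per_arm=2, horizon=20
)
```

Because the commit decision depends only on the fixed exploration budget per arm, ETC has a simpler structure than UCB, whose pulls adapt round by round.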
Submitted 28 December, 2023;
originally announced December 2023.