-
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Authors:
Jaehyeok Kim,
Dongyoon Wee,
Dan Xu
Abstract:
This paper introduces Motion-oriented Compositional Neural Radiance Fields (MoCo-NeRF), a framework designed to perform free-viewpoint rendering of monocular human videos via a novel non-rigid motion modeling approach. In the context of dynamic clothed humans, complex cloth dynamics generate non-rigid motions that are intrinsically distinct from skeletal articulations and critically important for the rendering quality. The conventional approach models non-rigid motions as spatial (3D) deviations in addition to skeletal transformations. However, it is either time-consuming or challenging to achieve optimal quality due to its high learning complexity without direct supervision. To target this problem, we propose a novel approach of modeling non-rigid motions as radiance residual fields to benefit from more direct color supervision in the rendering and utilize the rigid radiance fields as a prior to reduce the complexity of the learning process. Our approach utilizes a single multiresolution hash encoding (MHE) to concurrently learn the canonical T-pose representation from rigid skeletal motions and the radiance residual field for non-rigid motions. Additionally, to further improve both training efficiency and usability, we extend MoCo-NeRF to support simultaneous training of multiple subjects within a single framework, thanks to our effective design for modeling non-rigid motions. This scalability is achieved through the integration of a global MHE and learnable identity codes in addition to multiple local MHEs. We present extensive results on ZJU-MoCap and MonoCap, clearly demonstrating state-of-the-art performance in both single- and multi-subject settings. The code and model will be made publicly available at the project page: https://stevejaehyeok.github.io/publications/moco-nerf.
Submitted 16 July, 2024;
originally announced July 2024.
-
Calibration and simulation of ionization signal and electronics noise in the ICARUS liquid argon time projection chamber
Authors:
ICARUS collaboration,
P. Abratenko,
N. Abrego-Martinez,
A. Aduszkiewicz,
F. Akbar,
L. Aliaga Soplin,
M. Artero Pons,
J. Asaadi,
W. F. Badgett,
B. Baibussinov,
B. Behera,
V. Bellini,
R. Benocci,
S. Berkman,
S. Bertolucci,
M. Betancourt,
M. Bonesini,
T. Boone,
B. Bottino,
A. Braggiotti,
D. Brailsford,
S. J. Brice,
V. Brio,
C. Brizzolari,
H. S. Budd,
A. Campani
, et al. (153 additional authors not shown)
Abstract:
The ICARUS liquid argon time projection chamber (LArTPC) neutrino detector has been taking physics data since 2022 as part of the Short-Baseline Neutrino (SBN) Program. This paper details the equalization of the response to charge in the ICARUS time projection chamber (TPC), as well as data-driven tuning of the simulation of ionization charge signals and electronics noise. The equalization procedure removes non-uniformities in the ICARUS TPC response to charge in space and time. This work leverages the copious number of cosmic ray muons available to ICARUS at the surface. The ionization signal shape simulation applies a novel procedure that tunes the simulation to match what is measured in data. The end result of the equalization procedure and simulation tuning allows for a comparison of charge measurements in ICARUS between Monte Carlo simulation and data, showing good performance with minimal residual bias between the two.
Submitted 16 July, 2024;
originally announced July 2024.
-
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Authors:
Seokhun Choi,
Hyeonseop Song,
Jaechul Kim,
Taehyeong Kim,
Hoseok Do
Abstract:
Interactive segmentation of 3D Gaussians opens a great opportunity for real-time manipulation of 3D scenes thanks to the real-time rendering capability of 3D Gaussian Splatting. However, the current methods suffer from time-consuming post-processing to deal with noisy segmentation output. Also, they struggle to provide detailed segmentation, which is important for fine-grained manipulation of 3D scenes. In this study, we propose Click-Gaussian, which learns distinguishable feature fields of two-level granularity, facilitating segmentation without time-consuming post-processing. We delve into challenges stemming from inconsistently learned feature fields resulting from 2D segmentation obtained independently from a 3D scene. 3D segmentation accuracy deteriorates when 2D segmentation results across the views, the primary cues for 3D segmentation, are in conflict. To overcome these issues, we propose Global Feature-guided Learning (GFL). GFL constructs the clusters of global feature candidates from noisy 2D segments across the views, which smooths out noise when training the features of 3D Gaussians. Our method runs in 10 ms per click, 15 to 130 times as fast as the previous methods, while also significantly improving segmentation accuracy. Our project page is available at https://seokhunchoi.github.io/Click-Gaussian
Submitted 16 July, 2024;
originally announced July 2024.
-
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
Authors:
Jung Hyun Lee,
Jeonghoon Kim,
June Yong Yang,
Se Jung Kwon,
Eunho Yang,
Kang Min Yoo,
Dongsoo Lee
Abstract:
With the commercialization of large language models (LLMs), weight-activation quantization has emerged to compress and accelerate LLMs, achieving high throughput while reducing inference costs. However, existing post-training quantization (PTQ) techniques for quantizing weights and activations of LLMs still suffer from non-negligible accuracy drops, especially on massive multitask language understanding. To address this issue, we propose Low-Rank Quantization (LRQ), a simple yet effective post-training weight quantization method for LLMs that reconstructs the outputs of an intermediate Transformer block by leveraging low-rank weight-scaling matrices, replacing the conventional full weight-scaling matrices that entail as many learnable scales as their associated weights. Thanks to parameter sharing via low-rank structure, LRQ only needs to learn significantly fewer parameters while enabling the individual scaling of weights, thus boosting the generalization capability of quantized LLMs. We show the superiority of LRQ over prior LLM PTQ works under (i) 8-bit weight and per-tensor activation quantization, (ii) 4-bit weight and 8-bit per-token activation quantization, and (iii) low-bit weight-only quantization schemes. Our code is available at https://github.com/onliwad101/FlexRound_LRQ to inspire LLM researchers and engineers.
Submitted 16 July, 2024;
originally announced July 2024.
-
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Authors:
Jiwook Kim,
Seonho Lee,
Jaeyo Shin,
Jiho Choi,
Hyunjung Shim
Abstract:
Score distillation sampling (SDS) has emerged as an effective framework in text-driven 3D editing tasks due to its inherent 3D consistency. However, existing SDS-based 3D editing methods suffer from extensive training time and lead to low-quality results, primarily because these methods deviate from the sampling dynamics of diffusion models. In this paper, we propose DreamCatalyst, a novel framework that interprets SDS-based editing as a diffusion reverse process. Our objective function considers the sampling dynamics, thereby making the optimization process of DreamCatalyst an approximation of the diffusion reverse process in editing tasks. DreamCatalyst aims to reduce training time and improve editing quality. DreamCatalyst presents two modes: (1) a faster mode, which edits the NeRF scene in only about 25 minutes, and (2) a high-quality mode, which produces superior results in less than 70 minutes. Specifically, our high-quality mode outperforms current state-of-the-art NeRF editing methods both in terms of speed and quality. See more extensive results on our project page: https://dream-catalyst.github.io.
Submitted 16 July, 2024;
originally announced July 2024.
-
A practical approach to calculating magnetic Johnson noise for precision measurements
Authors:
N. S. Phan,
S. M. Clayton,
Y. J. Kim,
T. M. Ito
Abstract:
Magnetic Johnson noise is an important consideration for many applications involving precision magnetometry, and its significance will only increase in the future with improvements in measurement sensitivity. The fluctuation-dissipation theorem can be utilized to derive analytic expressions for magnetic Johnson noise in certain situations. But when used in conjunction with commercially available finite element analysis tools, the combined approach is particularly powerful as it provides a practical means to calculate the magnetic Johnson noise arising from conductors of arbitrary geometry and permeability. In this paper, we demonstrate this method to be one of the most comprehensive approaches presently available to calculate thermal magnetic noise. In particular, its applicability is shown to not be limited to cases where the noise is evaluated at a point in space but also can be expanded to include cases where the magnetic field detector has a more general shape, such as a finite size loop, a gradiometer, or a detector that consists of a polarized atomic species trapped in a volume. Furthermore, some physics insights gained through studies made using this method are discussed.
Submitted 15 July, 2024;
originally announced July 2024.
-
Competition between group interactions and nonlinearity in voter dynamics on hypergraphs
Authors:
Jihye Kim,
Deok-Sun Lee,
Byungjoon Min,
Mason A. Porter,
Maxi San Miguel,
K. -I. Goh
Abstract:
Social dynamics are often driven by both pairwise (i.e., dyadic) relationships and higher-order (i.e., polyadic) group relationships, which one can describe using hypergraphs. To gain insight into the impact of polyadic relationships on dynamical processes on networks, we formulate and study a polyadic voter process, which we call the group-driven voter model (GVM), in which we incorporate the effect of group interactions by nonlinear interactions that are subject to a group (i.e., hyperedge) constraint. By examining the competition between nonlinearity and group sizes, we show that the GVM achieves consensus faster than standard voter-model dynamics, with an optimum minimizing exit time τ. We substantiate this finding by using mean-field theory on annealed uniform hypergraphs with N nodes, for which τ scales as A ln N, where the prefactor A depends both on the nonlinearity and on group-constraint factors. Our results reveal how competition between group interactions and nonlinearity shapes GVM dynamics. We thereby highlight the importance of such competing effects in complex systems with polyadic interactions.
Submitted 15 July, 2024;
originally announced July 2024.
-
DataDream: Few-shot Guided Dataset Generation
Authors:
Jae Myung Kim,
Jessica Bader,
Stephan Alaniz,
Cordelia Schmid,
Zeynep Akata
Abstract:
While text-to-image diffusion models have been shown to achieve state-of-the-art results in image synthesis, they have yet to prove their effectiveness in downstream applications. Previous work has proposed to generate data for image classifier training given limited real data access. However, these methods struggle to generate in-distribution images or depict fine-grained features, thereby hindering the generalization of classification models trained on synthetic datasets. We propose DataDream, a framework for synthesizing classification datasets that more faithfully represents the real data distribution when guided by few-shot examples of the target classes. DataDream fine-tunes LoRA weights for the image generation model on the few real images before generating the training data using the adapted model. We then fine-tune LoRA weights for CLIP using the synthetic data to improve downstream image classification over previous approaches on a large variety of datasets. We demonstrate the efficacy of DataDream through extensive experiments, surpassing state-of-the-art classification accuracy with few-shot data across 7 out of 10 datasets, while being competitive on the other 3. Additionally, we provide insights into the impact of various factors, such as the number of real-shot and generated images as well as the fine-tuning compute on model performance. The code is available at https://github.com/ExplainableML/DataDream.
Submitted 16 July, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
Joint-Embedding Predictive Architecture for Self-Supervised Learning of Mask Classification Architecture
Authors:
Dong-Hee Kim,
Sungduk Cho,
Hyeonwoo Cho,
Chanmin Park,
Jinyoung Kim,
Won Hwa Kim
Abstract:
In this work, we introduce Mask-JEPA, a self-supervised learning framework tailored for mask classification architectures (MCA), to overcome the traditional constraints associated with training segmentation models. Mask-JEPA combines a Joint Embedding Predictive Architecture with MCA to adeptly capture intricate semantics and precise object boundaries. Our approach addresses two critical challenges in self-supervised learning: 1) extracting comprehensive representations for universal image segmentation from a pixel decoder, and 2) effectively training the transformer decoder. The use of the transformer decoder as a predictor within the JEPA framework allows proficient training in universal image segmentation tasks. Through rigorous evaluations on datasets such as ADE20K, Cityscapes and COCO, Mask-JEPA demonstrates not only competitive results but also exceptional adaptability and robustness across various training scenarios. The architecture-agnostic nature of Mask-JEPA further underscores its versatility, allowing seamless adaptation to various architectures in the mask classification family.
Submitted 15 July, 2024;
originally announced July 2024.
-
Geodesics on the Kähler cone of the Heisenberg group
Authors:
Joonhyung Kim,
Ioannis D. Platis,
Li-Jie Sun
Abstract:
In this paper, we describe the geodesics on the Kähler cone of the Heisenberg group. Furthermore, we prove that this is not a complete manifold.
Submitted 14 July, 2024;
originally announced July 2024.
-
Supernova Pointing Capabilities of DUNE
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called "brems flipping", as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.
Submitted 14 July, 2024;
originally announced July 2024.
-
Dominant Design Prediction with Phylogenetic Networks
Authors:
Youwei He,
Jeong-Dong Lee,
Dawoon Jeong,
Sungjun Choi,
Jiyong Kim
Abstract:
This study proposes an effective method to predict technology development from an evolutionary perspective. Product evolution is the result of technological evolution and market selection. A phylogenetic network is the main method to study product evolution. The formation of the dominant design determines the trajectory of technology development. How to predict the future dominant design has become a key issue in technology forecasting and new product development. We define the dominant product and use machine learning methods, combined with product evolutionary theory, to construct a Fully Connected Phylogenetic Network dataset to effectively predict the future dominant design.
Submitted 14 July, 2024;
originally announced July 2024.
-
Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
Authors:
Ge Gao,
Jongin Kim,
Sejin Paik,
Ekaterina Novozhilova,
Yi Liu,
Sarah T. Bonna,
Margrit Betke,
Derry Tanti Wijaya
Abstract:
Predicting emotions elicited by news headlines can be challenging as the task is largely influenced by the varying nature of people's interpretations and backgrounds. Previous works have explored classifying discrete emotions directly from news headlines. We provide a different approach to tackling this problem by utilizing people's explanations of their emotion, written in free text, on how they feel after reading a news headline. Using the dataset BU-NEmo+ (Gao et al., 2022), we found that for emotion classification, the free-text explanations have a strong correlation with the dominant emotion elicited by the headlines. The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). We then used the generated emotion explanations for emotion classification. In addition, we experimented with training the pretrained T5 model for the intermediate task of explanation generation before fine-tuning it for emotion classification. Using McNemar's significance test, methods that incorporate GPT-generated free-text emotion explanations demonstrated significant improvement (P-value < 0.05) in emotion classification from headlines, compared to methods that only use headlines. This underscores the value of using intermediate free-text explanations for emotion prediction tasks with headlines.
Submitted 14 July, 2024;
originally announced July 2024.
-
Automated detection of gibbon calls from passive acoustic monitoring data using convolutional neural networks in the "torch for R" ecosystem
Authors:
Dena J. Clink,
Jinsung Kim,
Hope Cross-Jaya,
Abdul Hamid Ahmad,
Moeurk Hong,
Roeun Sala,
Hélène Birot,
Cain Agger,
Thinh Tien Vu,
Hoa Nguyen Thi,
Thanh Nguyen Chi,
Holger Klinck
Abstract:
Automated detection of acoustic signals is crucial for effective monitoring of vocal animals and their habitats across ecologically-relevant spatial and temporal scales. Recent advances in deep learning have made these approaches more accessible. However, there are few deep learning approaches that can be implemented natively in the R programming environment; approaches that run natively in R may be more accessible for ecologists. The "torch for R" ecosystem has made the use of transfer learning with convolutional neural networks accessible for R users. Here, we evaluate a workflow that uses transfer learning for the automated detection of acoustic signals from passive acoustic monitoring (PAM) data. Our specific goals include: 1) present a method for automated detection of gibbon calls from PAM data using the "torch for R" ecosystem; 2) compare the results of transfer learning for six pretrained CNN architectures; and 3) investigate how well the different architectures perform on datasets of the female calls from two different gibbon species: the northern grey gibbon (Hylobates funereus) and the southern yellow-cheeked crested gibbon (Nomascus gabriellae). We found that the highest performing architecture depended on the test dataset. We successfully deployed the top performing model for each gibbon species to investigate spatial variation in gibbon calling behavior across two grids of autonomous recording units in Danum Valley Conservation Area, Malaysia and Keo Seima Wildlife Sanctuary, Cambodia. The fields of deep learning and automated detection are rapidly evolving, and we provide the methods and datasets as benchmarks for future work.
Submitted 13 July, 2024;
originally announced July 2024.
-
Population Concentration in High-Complexity Regions within City during the heat wave
Authors:
Hyoji Choi,
Jonghyun Kim,
Donghyeon Yu,
Bogang Jun
Abstract:
This study investigates the impact of the 2018 summer heat wave on urban mobility in Seoul and the role of economic complexity in the region's resilience. Findings from subway and mobile phone data indicate a significant decrease in the floating population during the extreme heat wave, underscoring the thermal vulnerability of urban areas. However, urban regions with higher complexity demonstrate resilience, attracting more visitors despite high temperatures. Our results suggest the centrality of economic complexity in urban resilience against climate-induced stressors. Additionally, they imply that clusters of high-complexity small businesses can serve as focal points for sustaining urban vitality in the face of thermal shocks within the city. From a long-run perspective, our results imply the possibility that people will become more concentrated in high-complexity regions in the era of global warming.
Submitted 13 July, 2024;
originally announced July 2024.
-
Machine Learning Based Prediction of Proton Conductivity in Metal-Organic Frameworks
Authors:
Seunghee Han,
Byeong Gwan Lee,
Dae Woon Lim,
Jihan Kim
Abstract:
Recently, metal-organic frameworks (MOFs) have demonstrated their potential as solid-state electrolytes in proton exchange membrane fuel cells. However, the number of MOFs reported to exhibit proton conductivity remains limited, and the mechanisms underlying this phenomenon are not fully elucidated, complicating the design of proton-conductive MOFs. In response, we developed a comprehensive database of proton-conductive MOFs and applied machine learning techniques to predict their proton conductivity. Our approach included the construction of both descriptor-based and transformer-based models. Notably, the transformer-based transfer learning (Freeze) model performed the best with a mean absolute error (MAE) of 0.91, suggesting that the proton conductivity of MOFs can be estimated within one order of magnitude using this model. Additionally, we employed feature importance and principal component analysis to explore the factors influencing proton conductivity. The insights gained from our database and machine learning model are expected to facilitate the targeted design of proton-conductive MOFs.
Submitted 18 June, 2024;
originally announced July 2024.
-
MIXED-SENSE: A Mixed Reality Sensor Emulation Framework for Test and Evaluation of UAVs Against False Data Injection Attacks
Authors:
Kartik A. Pant,
Li-Yu Lin,
Jaehyeok Kim,
Worawis Sribunma,
James M. Goppert,
Inseok Hwang
Abstract:
We present a high-fidelity Mixed Reality sensor emulation framework for testing and evaluating the resilience of Unmanned Aerial Vehicles (UAVs) against false data injection (FDI) attacks. The proposed approach can be utilized to assess the impact of FDI attacks, benchmark attack detector performance, and validate the effectiveness of mitigation/reconfiguration strategies in single-UAV and UAV swarm operations. Our Mixed Reality framework leverages high-fidelity simulations of Gazebo and a Motion Capture system to emulate proprioceptive (e.g., GNSS) and exteroceptive (e.g., camera) sensor measurements in real-time. We propose an empirical approach to faithfully recreate signal characteristics such as latency and noise in these measurements. Finally, we illustrate the efficacy of our proposed framework through a Mixed Reality experiment consisting of an emulated GNSS attack on an actual UAV, which (i) demonstrates the impact of false data injection attacks on GNSS measurements and (ii) validates a mitigation strategy utilizing a distributed camera network developed in our previous work. Our open-source implementation is available at https://github.com/CogniPilot/mixed_sense.
Submitted 12 July, 2024;
originally announced July 2024.
-
ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion
Authors:
Sungmin Woo,
Wonjoon Lee,
Woo Jin Kim,
Dogyoon Lee,
Sangyoun Lee
Abstract:
Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene. However, the presence of moving objects in dynamic scenes introduces inevitable inconsistencies, causing misaligned multi-frame feature matching and misleading self-supervision during training. In this paper, we propose a novel framework called ProDepth, which effectively addresses the mismatch problem caused by dynamic objects using a probabilistic approach. We initially deduce the uncertainty associated with static scene assumption by adopting an auxiliary decoder. This decoder analyzes inconsistencies embedded in the cost volume, inferring the probability of areas being dynamic. We then directly rectify the erroneous cost volume for dynamic areas through a Probabilistic Cost Volume Modulation (PCVM) module. Specifically, we derive probability distributions of depth candidates from both single-frame and multi-frame cues, modulating the cost volume by adaptively fusing those distributions based on the inferred uncertainty. Additionally, we present a self-supervision loss reweighting strategy that not only masks out incorrect supervision with high uncertainty but also mitigates the risks in remaining possible dynamic areas in accordance with the probability. Our proposed method excels over state-of-the-art approaches in all metrics on both Cityscapes and KITTI datasets, and demonstrates superior generalization ability on the Waymo Open dataset.
Submitted 12 July, 2024;
originally announced July 2024.
-
Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers
Authors:
Jong Myoung Kim,
Young-Jun Lee,
Yong-jin Han,
Sangkeun Jung,
Ho-Jin Choi
Abstract:
Syntactic elements, such as word order and case markers, are fundamental in natural language processing. Recent studies show that syntactic information boosts language model performance and offers clues for people to understand their learning mechanisms. Unlike languages with a fixed word order such as English, Korean allows for varied word sequences, despite its canonical structure, due to case markers that indicate the functions of sentence components. This study explores whether Korean language models can accurately capture this flexibility. We note that incomplete word orders and omitted case markers frequently appear in ordinary Korean communication. To investigate this further, we introduce the Syntactically Incomplete Korean (SIKO) dataset. Through SIKO, we assessed Korean language models' flexibility with incomplete syntax and confirmed the dataset's training value. Results indicate these models reflect Korean's inherent flexibility, accurately handling incomplete inputs. Moreover, fine-tuning with SIKO enhances the ability to handle common incomplete Korean syntactic forms. The dataset's simple construction process, coupled with significant performance enhancements, solidifies its standing as an effective data augmentation technique.
Submitted 12 July, 2024;
originally announced July 2024.
-
Multistate ferroelectric diodes with high electroresistance based on van der Waals heterostructures
Authors:
Soumya Sarkar,
Zirun Han,
Maheera Abdul Ghani,
Nives Strkalj,
Jung Ho Kim,
Yan Wang,
Deep Jariwala,
Manish Chhowalla
Abstract:
Some van der Waals (vdW) materials exhibit ferroelectricity, making them promising for novel non-volatile memories (NVMs) such as ferroelectric diodes (FeDs). CuInP2S6 (CIPS) is a well-known vdW ferroelectric that has been integrated with graphene for memory devices. Here we demonstrate FeDs with self-rectifying, hysteretic current-voltage characteristics based on vertical heterostructures of 10-nm-thick CIPS and graphene. By using vdW indium-cobalt top electrodes and graphene bottom electrodes, we achieve high electroresistance (on- and off-state resistance ratios) of ~10^6, on-state rectification ratios of ~2500 for read/write voltages of 2 V/0.5 V and maximum output current densities of 100 A/cm^2. These metrics compare favourably with state-of-the-art FeDs. Piezoresponse force microscopy measurements show that stabilization of intermediate net polarization states in CIPS leads to stable multi-bit data retention at room temperature. The combination of two-terminal design, multi-bit memory, and low-power operation in CIPS-based FeDs is potentially interesting for compute-in-memory and neuromorphic computing applications.
Submitted 12 July, 2024;
originally announced July 2024.
-
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
Authors:
Jeongho Kim,
Min-Jung Kim,
Junsoo Lee,
Jaegul Choo
Abstract:
Pose-driven human-image animation diffusion models have shown remarkable capabilities in realistic human video synthesis. Despite the promising results achieved by previous approaches, challenges persist in achieving temporally consistent animation and ensuring robustness with off-the-shelf pose detectors. In this paper, we present TCAN, a pose-driven human image animation method that is robust to erroneous poses and consistent over time. In contrast to previous methods, we utilize the pre-trained ControlNet without fine-tuning to leverage its extensive pre-acquired knowledge from numerous pose-image-caption pairs. To keep the ControlNet frozen, we adapt LoRA to the UNet layers, enabling the network to align the latent space between the pose and appearance features. Additionally, by introducing an additional temporal layer to the ControlNet, we enhance robustness against outliers of the pose detector. Through the analysis of attention maps over the temporal axis, we also designed a novel temperature map leveraging pose information, allowing for a more static background. Extensive experiments demonstrate that the proposed method can achieve promising results in video synthesis tasks encompassing various poses, like chibi. Project Page: https://eccv2024tcan.github.io/
Submitted 12 July, 2024;
originally announced July 2024.
-
Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort
Authors:
Jeeyung Kim,
Ze Wang,
Qiang Qiu
Abstract:
Enhancing model interpretability can address spurious correlations by revealing how models draw their predictions. Concept Bottleneck Models (CBMs) can provide a principled way of disclosing and guiding model behaviors through human-understandable concepts, albeit at a high cost of human efforts in data annotation. In this paper, we leverage a synergy of multiple foundation models to construct CBMs with nearly no human effort. We discover undesirable biases in CBMs built on pre-trained models and propose a novel framework designed to exploit pre-trained models while being immune to these biases, thereby reducing vulnerability to spurious correlations. Specifically, our method offers a seamless pipeline that adopts foundation models for assessing potential spurious correlations in datasets, annotating concepts for images, and refining the annotations for improved robustness. We evaluate the proposed method on multiple datasets, and the results demonstrate its effectiveness in reducing model reliance on spurious correlations while preserving its interpretability.
Submitted 11 July, 2024;
originally announced July 2024.
-
Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Ta'ani,
J. Alexander,
A. Angerami,
K. Aoki,
N. Apadula,
Y. Aramaki,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
B. Bannier,
K. N. Barish,
B. Bassalleck,
S. Bathe
, et al. (377 additional authors not shown)
Abstract:
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.
Submitted 11 July, 2024;
originally announced July 2024.
-
Flow4D: Leveraging 4D Voxel Network for LiDAR Scene Flow Estimation
Authors:
Jaeyeul Kim,
Jungwan Woo,
Ukcheol Shin,
Jean Oh,
Sunghoon Im
Abstract:
Understanding the motion states of the surrounding environment is critical for safe autonomous driving. These motion states can be accurately derived from scene flow, which captures the three-dimensional motion field of points. Existing LiDAR scene flow methods extract spatial features from each point cloud and then fuse them channel-wise, resulting in the implicit extraction of spatio-temporal features. Furthermore, they utilize 2D Bird's Eye View and process only two frames, missing crucial spatial information along the Z-axis and the broader temporal context, leading to suboptimal performance. To address these limitations, we propose Flow4D, which temporally fuses multiple point clouds after the 3D intra-voxel feature encoder, enabling more explicit extraction of spatio-temporal features through a 4D voxel network. However, while using 4D convolution improves performance, it significantly increases the computational load. For further efficiency, we introduce the Spatio-Temporal Decomposition Block (STDB), which combines 3D and 1D convolutions instead of using heavy 4D convolution. In addition, Flow4D further improves performance by using five frames to take advantage of richer temporal information. As a result, the proposed method achieves a 45.9% higher performance compared to the state-of-the-art while running in real-time, and won 1st place in the 2024 Argoverse 2 Scene Flow Challenge. The code is available at https://github.com/dgist-cvlab/Flow4D.
Submitted 10 July, 2024;
originally announced July 2024.
-
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning
Authors:
Jongsuk Kim,
Jiwon Shin,
Junmo Kim
Abstract:
In recent years, advancements in representation learning and language models have propelled Automated Captioning (AC) to new heights, enabling the generation of human-level descriptions. Leveraging these advancements, we propose AVCap, an Audio-Visual Captioning framework that serves as a simple yet powerful baseline for audio-visual captioning. AVCap utilizes audio-visual features as text tokens, which offers advantages in performance as well as in the extensibility and scalability of the model. AVCap is designed around three pivotal dimensions: the exploration of optimal audio-visual encoder architectures, the adaptation of pre-trained models according to the characteristics of generated text, and the investigation into the efficacy of modality fusion in captioning. Our method outperforms existing audio-visual captioning methods across all metrics, and the code is available at https://github.com/JongSuk1/AVCap
Submitted 10 July, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
KpopMT: Translation Dataset with Terminology for Kpop Fandom
Authors:
JiWoo Kim,
Yunsu Kim,
JinYeong Bak
Abstract:
While machines learn from existing corpora, humans have the unique capability to establish and accept new language systems. This leads humans to form unique language systems within social groups. Accordingly, we focus on a remaining gap in addressing translation challenges within social groups, where in-group members use unique terminologies. We propose the KpopMT dataset, which aims to fill this gap by enabling precise terminology translation, choosing Kpop fandom as a starting point for social groups given its global popularity. Expert translators provide 1k English translations for Korean posts and comments, each annotated with specific terminology within social groups' language systems. We evaluate existing translation systems, including GPT models, on KpopMT to identify their failure cases. Results show overall low scores, underscoring the challenges of reflecting group-specific terminologies and styles in translation. We make KpopMT publicly available.
Submitted 10 July, 2024;
originally announced July 2024.
-
Flow-acoustic resonance in deep and inclined cavities
Authors:
You Wei Ho,
Jae Wook Kim
Abstract:
This paper presents numerical investigations of flow-acoustic resonances in deep and inclined cavities using wall-resolved large eddy simulations. The study focuses on cavity configurations with an aspect ratio of $D/L = 2.632$, subjected to two Mach numbers of $0.2$ and $0.3$ at three different inclination angles ($α=30^{\circ}$, $60^{\circ}$, and $90^{\circ}$). Fully turbulent boundary layers generated from independent precursor simulations are employed upstream of the cavities. Initial results highlight distinct aeroacoustic responses between inclined and orthogonal cavities, particularly at $M_{\infty}=0.3$, where inclined cavities exhibit stronger resonances at a lower peak frequency ($St\approx 0.27$) compared to the orthogonal cavity. Further analysis reveals that this lower Strouhal number corresponds to a reduced vortex convection speed linked to large shear-layer oscillations. Additionally, the acoustic input-output analysis indicates that the inclined cavities amplify acoustic responses more effectively and exhibit weaker source-sink cancellations compared to the orthogonal cavity. These mechanisms are identified as the primary contributors to the enhanced aeroacoustic responses in the inclined cavities. Finally, this paper proposes that the ratio between acoustic particle displacement and momentum thickness may be used as a criterion to predict the onset of the distinctive resonance at $St\approx 0.27$. It is suggested that the amplified resonances may be linked to a nonlinear mode shift of the first hydrodynamic mode through enhanced shear-layer oscillation taking place when the proposed criterion is met.
Submitted 9 July, 2024;
originally announced July 2024.
-
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization
Authors:
Jeongseok Hyun,
Su Ho Han,
Hyolim Kang,
Joon-Young Lee,
Seon Joo Kim
Abstract:
The vocabulary size in temporal action localization (TAL) is constrained by the scarcity of large-scale annotated datasets. To address this, recent works incorporate powerful pre-trained vision-language models (VLMs), such as CLIP, to perform open-vocabulary TAL (OV-TAL). However, unlike VLMs trained on extensive image/video-text pairs, existing OV-TAL methods still rely on small, fully labeled TAL datasets for training an action localizer. In this paper, we explore the scalability of self-training with unlabeled YouTube videos for OV-TAL. Our self-training approach consists of two stages. First, a class-agnostic action localizer is trained on a human-labeled TAL dataset and used to generate pseudo-labels for unlabeled videos. Second, the large-scale pseudo-labeled dataset is combined with the human-labeled dataset to train the localizer. Extensive experiments demonstrate that leveraging web-scale videos in self-training significantly enhances the generalizability of an action localizer. Additionally, we highlight issues with existing OV-TAL evaluation schemes and propose a new evaluation protocol. Code is released at https://github.com/HYUNJS/STOV-TAL
Submitted 9 July, 2024;
originally announced July 2024.
-
Safe-Embed: Unveiling the Safety-Critical Knowledge of Sentence Encoders
Authors:
Jinseok Kim,
Jaewon Jung,
Sangyeop Kim,
Sohyung Park,
Sungzoon Cho
Abstract:
Despite the impressive capabilities of Large Language Models (LLMs) in various tasks, their vulnerability to unsafe prompts remains a critical issue. These prompts can lead LLMs to generate responses on illegal or sensitive topics, posing a significant threat to their safe and ethical use. Existing approaches attempt to address this issue using classification models, but they have several drawbacks. With the increasing complexity of unsafe prompts, similarity search-based techniques that identify specific features of unsafe prompts provide a more robust and effective solution to this evolving problem. This paper investigates the potential of sentence encoders to distinguish safe from unsafe prompts, and the ability to classify various unsafe prompts according to a safety taxonomy. We introduce new pairwise datasets and the Categorical Purity (CP) metric to measure this capability. Our findings reveal both the effectiveness and limitations of existing sentence encoders, proposing directions to improve sentence encoders to operate as more robust safety detectors. Our code is available at https://github.com/JwdanielJung/Safe-Embed.
Submitted 9 July, 2024;
originally announced July 2024.
-
Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability
Authors:
Soyoung Yoon,
Jongyoon Kim,
Seung-won Hwang
Abstract:
Benchmarking the performance of information retrieval (IR) methods is mostly conducted within a fixed set of documents (static corpora). However, in real-world web search engine environments, the document set is continuously updated and expanded. Addressing these discrepancies and measuring the temporal persistence of IR systems is crucial. By investigating the LongEval benchmark, specifically designed for such dynamic environments, our findings demonstrate the effectiveness of a listwise reranking approach, which proficiently handles inaccuracies induced by temporal distribution shifts. Among listwise rerankers, our findings show that ListT5, which effectively mitigates the positional bias problem by adopting the Fusion-in-Decoder architecture, is especially effective, and increasingly so as temporal drift grows, as seen on the test-long subset.
Submitted 9 July, 2024;
originally announced July 2024.
-
Unveiling the Electronic, Transport, and Migration Properties of the Te-Defect Lattice in DyTe$_{1.8}$
Authors:
Jinwoong Kim,
Nicholas Kioussis
Abstract:
The rare-earth ditellurides are known to form a two-dimensional square lattice where the strong Fermi surface nesting leads to structural modulation. In contrast to charge density waves, the supercell modulation is accompanied by the formation of a periodic Te vacancy network, where the Te deficiency affects the nesting vector (i.e. the supercell size) via tuning the chemical potential. In this work, first principles electronic structure calculations for the $\sqrt{5}\times\sqrt{5}$ supercell, which commonly appears in this family of tellurides, unveil interesting electronic, transport, and migration properties of the Te defect lattice in DyTe$_{1.8}$. The reconstruction of the Te-deficient square lattice, consisting of a single Te-dimer and a pair of Te-trimers per unit cell, gives rise to an out-of-plane polarization, whose direction depends on the position of the dimer. This results in various close-in-energy parallel and antiparallel polarization configurations of successive Te layers depending on the dimer positions. We predict that the orientation of the Te dimers, and hence the corresponding structural motifs, can be reversibly switched between two in-plane perpendicular directions under tensile epitaxial strain via a piezoelectric substrate, resulting in a colossal conductivity switching. Furthermore, the Te-dimer orientations result in an asymmetric Fermi surface, which can be confirmed by quantum oscillation measurements. Finally, we present numerical results for the migration paths and energy landscape through various divacancy configurations in the presence or absence of epitaxial strain.
Submitted 8 July, 2024;
originally announced July 2024.
-
Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models
Authors:
Chani Jung,
Dongkwan Kim,
Jiho Jin,
Jiseon Kim,
Yeon Seonwoo,
Yejin Choi,
Alice Oh,
Hyunwoo Kim
Abstract:
While humans naturally develop theory of mind (ToM), the capability to understand other people's mental states and beliefs, state-of-the-art large language models (LLMs) underperform on simple ToM benchmarks. We posit that we can extend our understanding of LLMs' ToM abilities by evaluating key human ToM precursors -- perception inference and perception-to-belief inference -- in LLMs. We introduce two datasets, Percept-ToMi and Percept-FANToM, to evaluate these precursory inferences for ToM in LLMs by annotating characters' perceptions on ToMi and FANToM, respectively. Our evaluation of eight state-of-the-art LLMs reveals that the models generally perform well in perception inference while exhibiting limited capability in perception-to-belief inference (e.g., lack of inhibitory control). Based on these results, we present PercepToM, a novel ToM method leveraging LLMs' strong perception inference capability while supplementing their limited perception-to-belief inference. Experimental results demonstrate that PercepToM significantly enhances LLMs' performance, especially in false belief scenarios.
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
RadiomicsFill-Mammo: Synthetic Mammogram Mass Manipulation with Radiomics Features
Authors:
Inye Na,
Jonghun Kim,
Eun Sook Ko,
Hyunjin Park
Abstract:
Motivated by the question, "Can we generate tumors with desired attributes?" this study leverages radiomics features to explore the feasibility of generating synthetic tumor images. Characterized by its low-dimensional yet biologically meaningful markers, radiomics bridges the gap between complex medical imaging data and actionable clinical insights. We present RadiomicsFill-Mammo, the first of the RadiomicsFill series, an innovative technique that generates realistic mammogram mass images mirroring specific radiomics attributes using masked images and opposite breast images, leveraging a recent stable diffusion model. This approach also allows for the incorporation of essential clinical variables, such as BI-RADS and breast density, alongside radiomics features as conditions for mass generation. Results indicate that RadiomicsFill-Mammo effectively generates diverse and realistic tumor images based on various radiomics conditions. Results also demonstrate a significant improvement in mass detection capabilities, leveraging RadiomicsFill-Mammo as a strategy to generate simulated samples. Furthermore, RadiomicsFill-Mammo not only advances medical imaging research but also opens new avenues for enhancing treatment planning and tumor simulation. Our code is available at https://github.com/nainye/RadiomicsFill.
Submitted 8 July, 2024;
originally announced July 2024.
-
Improved limit on neutrinoless double beta decay of $^{100}$Mo from AMoRE-I
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (83 additional authors not shown)
Abstract:
AMoRE searches for the signature of neutrinoless double beta decay of $^{100}$Mo with a 100 kg sample of enriched $^{100}$Mo. Scintillating molybdate crystals coupled with a metallic magnetic calorimeter operate at milli-Kelvin temperatures to measure the energy of electrons emitted in the decay. As a demonstration of the full-scale AMoRE, we conducted AMoRE-I, a pre-experiment with 18 molybdate crystals, at the Yangyang Underground Laboratory for over two years. The exposure was 8.02 kg$\cdot$year (or 3.89 kg$_{\mathrm{^{100}Mo}}\cdot$year) and the total background rate near the Q-value was 0.025 $\pm$ 0.002 counts/keV/kg/year. We observed no indication of $0νββ$ decay and report a new lower limit of the half-life of $^{100}$Mo $0νββ$ decay as $T^{0ν}_{1/2}>3.0\times10^{24}~\mathrm{years}$ at 90% confidence level. The effective Majorana mass limit range is $m_{ββ}<$(210--610) meV using nuclear matrix elements estimated in the framework of different models, including the recent shell model calculations.
Submitted 8 July, 2024;
originally announced July 2024.
-
Can Machines Learn the True Probabilities?
Authors:
Jinsook Kim
Abstract:
When there exists uncertainty, AI machines are designed to make decisions so as to reach the best expected outcomes. Expectations are based on true facts about the objective environment the machines interact with, and those facts can be encoded into AI models in the form of true objective probability functions. Accordingly, AI models involve probabilistic machine learning in which the probabilities should be objectively interpreted. We prove under some basic assumptions when machines can learn the true objective probabilities, if any, and when machines cannot learn them.
Submitted 7 July, 2024;
originally announced July 2024.
-
A Theory of Machine Learning
Authors:
Jinsook Kim,
Jinho Kang
Abstract:
We critically review three major theories of machine learning and provide a new theory according to which machines learn a function when the machines successfully compute it. We show that this theory challenges common assumptions in the statistical and the computational learning theories, for it implies that learning true probabilities is equivalent neither to obtaining a correct calculation of the true probabilities nor to obtaining an almost-sure convergence to them. We also briefly discuss some case studies from natural language processing and macroeconomics from the perspective of the new theory.
Submitted 7 July, 2024;
originally announced July 2024.
-
Beyond Binary Gender Labels: Revealing Gender Biases in LLMs through Gender-Neutral Name Predictions
Authors:
Zhiwen You,
HaeJin Lee,
Shubhanshu Mishra,
Sullam Jeoung,
Apratim Mishra,
Jinseok Kim,
Jana Diesner
Abstract:
Name-based gender prediction has traditionally categorized individuals as either female or male based on their names, using a binary classification system. That binary approach can be problematic in the cases of gender-neutral names that do not align with any one gender, among other reasons. Relying solely on binary gender categories without recognizing gender-neutral names can reduce the inclusiveness of gender prediction tasks. We introduce an additional gender category, i.e., "neutral", to study and address potential gender biases in Large Language Models (LLMs). We evaluate the performance of several foundational and large language models in predicting gender based on first names only. Additionally, we investigate the impact of adding birth years to enhance the accuracy of gender prediction, accounting for shifting associations between names and genders over time. Our findings indicate that most LLMs identify male and female names with high accuracy (over 80%) but struggle with gender-neutral names (under 40%), and the accuracy of gender prediction is higher for English-based first names than non-English names. The experimental results show that incorporating the birth year does not improve the overall accuracy of gender prediction, especially for names with evolving gender associations. We recommend using caution when applying LLMs for gender identification in downstream tasks, particularly when dealing with non-binary gender labels.
Submitted 7 July, 2024;
originally announced July 2024.
-
Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien, et al. (349 additional authors not shown)
Abstract:
We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.
Submitted 6 July, 2024;
originally announced July 2024.
-
Cosmological constraints from the cross-correlation of DESI Luminous Red Galaxies with CMB lensing from Planck PR4 and ACT DR6
Authors:
Noah Sailer,
Joshua Kim,
Simone Ferraro,
Mathew S. Madhavacheril,
Martin White,
Irene Abril-Cabezas,
Jessica Nicole Aguilar,
Steven Ahlen,
J. Richard Bond,
David Brooks,
Etienne Burtin,
Erminia Calabrese,
Shi-Fan Chen,
Steve K. Choi,
Todd Claybaugh,
Kyle Dawson,
Axel de la Macorra,
Joseph DeRose,
Arjun Dey,
Biprateep Dey,
Peter Doel,
Jo Dunkley,
Carmen Embil-Villagra,
Gerrit S. Farren,
Andreu Font-Ribera, et al. (41 additional authors not shown)
Abstract:
We infer the growth of large scale structure over the redshift range $0.4\lesssim z \lesssim 1$ from the cross-correlation of spectroscopically calibrated Luminous Red Galaxies (LRGs) selected from the Dark Energy Spectroscopic Instrument (DESI) legacy imaging survey with CMB lensing maps reconstructed from the latest Planck and ACT data. We adopt a hybrid effective field theory (HEFT) model that robustly regulates the cosmological information obtainable from smaller scales, such that our cosmological constraints are reliably derived from the (predominantly) linear regime. We perform an extensive set of bandpower- and parameter-level systematics checks to ensure the robustness of our results and to characterize the uniformity of the LRG sample. We demonstrate that our results are stable to a wide range of modeling assumptions, finding excellent agreement with a linear theory analysis performed on a restricted range of scales. From a tomographic analysis of the four LRG photometric redshift bins we find that the rate of structure growth is consistent with $Λ$CDM with an overall amplitude that is $\simeq5-7\%$ lower than predicted by primary CMB measurements with modest $(\sim2σ)$ statistical significance. From the combined analysis of all four bins and their cross-correlations with Planck we obtain $S_8 = 0.765\pm0.023$, which is less discrepant with primary CMB measurements than previous DESI LRG cross Planck CMB lensing results. From the cross-correlation with ACT we obtain $S_8 = 0.790^{+0.024}_{-0.027}$, while when jointly analyzing Planck and ACT we find $S_8 = 0.775^{+0.019}_{-0.022}$ from our data alone and $σ_8 = 0.772^{+0.020}_{-0.023}$ with the addition of BAO data. These constraints are consistent with the latest Planck primary CMB analyses at the $\simeq 1.6-2.2σ$ level, and are in excellent agreement with galaxy lensing surveys.
Submitted 5 July, 2024;
originally announced July 2024.
-
The Atacama Cosmology Telescope DR6 and DESI: Structure formation over cosmic time with a measurement of the cross-correlation of CMB Lensing and Luminous Red Galaxies
Authors:
Joshua Kim,
Noah Sailer,
Mathew S. Madhavacheril,
Simone Ferraro,
Irene Abril-Cabezas,
Jessica Nicole Aguilar,
Steven Ahlen,
J. Richard Bond,
David Brooks,
Etienne Burtin,
Erminia Calabrese,
Shi-Fan Chen,
Steve K. Choi,
Todd Claybaugh,
Omar Darwish,
Axel de la Macorra,
Joseph DeRose,
Mark Devlin,
Arjun Dey,
Peter Doel,
Jo Dunkley,
Carmen Embil-Villagra,
Gerrit S. Farren,
Andreu Font-Ribera,
Jaime E. Forero-Romero, et al. (48 additional authors not shown)
Abstract:
We present a high-significance cross-correlation of CMB lensing maps from the Atacama Cosmology Telescope (ACT) Data Release 6 (DR6) with spectroscopically calibrated luminous red galaxies (LRGs) from the Dark Energy Spectroscopic Instrument (DESI). We detect this cross-correlation at a significance of 38$σ$; combining our measurement with the Planck Public Release 4 (PR4) lensing map, we detect the cross-correlation at 50$σ$. Fitting this jointly with the galaxy auto-correlation power spectrum to break the galaxy bias degeneracy with $σ_8$, we perform a tomographic analysis in four LRG redshift bins spanning $0.4 \le z \le 1.0$ to constrain the amplitude of matter density fluctuations through the parameter combination $S_8^\times = σ_8 \left(Ω_m / 0.3\right)^{0.4}$. Prior to unblinding, we confirm with extragalactic simulations that foreground biases are negligible and carry out a comprehensive suite of null and consistency tests. Using a hybrid effective field theory (HEFT) model that allows scales as small as $k_{\rm max}=0.6$ $h/{\rm Mpc}$, we obtain a 3.3% constraint on $S_8^\times = σ_8 \left(Ω_m / 0.3\right)^{0.4} = 0.792^{+0.024}_{-0.028}$ from ACT data, as well as constraints on $S_8^\times(z)$ that probe structure formation over cosmic time. Our result is consistent with the early-universe extrapolation from primary CMB anisotropies measured by Planck PR4 within 1.2$σ$. Jointly fitting ACT and Planck lensing cross-correlations we obtain a 2.7% constraint of $S_8^\times = 0.776^{+0.019}_{-0.021}$, which is consistent with the Planck early-universe extrapolation within 2.1$σ$, with the lowest redshift bin showing the largest difference in mean. The latter may motivate further CMB lensing tomography analyses at $z<0.6$ to assess the impact of potential systematics or the consistency of the $Λ$CDM model over cosmic time.
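The parameter combination quoted in this abstract, $S_8^\times = σ_8 (Ω_m / 0.3)^{0.4}$, can be evaluated with a few lines of Python. The sketch below is only an illustration of the formula: the function name `s8_cross` and the input values are hypothetical and are not taken from the paper's analysis.

```python
def s8_cross(sigma8: float, omega_m: float,
             pivot: float = 0.3, exponent: float = 0.4) -> float:
    """Evaluate S_8^x = sigma_8 * (Omega_m / pivot) ** exponent.

    The defaults (0.3 and 0.4) are the pivot and exponent stated in the
    abstract; everything else here is an illustrative assumption.
    """
    if omega_m <= 0 or pivot <= 0:
        raise ValueError("omega_m and pivot must be positive")
    return sigma8 * (omega_m / pivot) ** exponent


if __name__ == "__main__":
    # Illustrative inputs only, not fitted values from the paper.
    # At omega_m equal to the pivot, S_8^x reduces to sigma_8 itself.
    print(s8_cross(sigma8=0.8, omega_m=0.3))
    # A larger omega_m at fixed sigma_8 raises S_8^x.
    print(s8_cross(sigma8=0.8, omega_m=0.32))
```

Because the exponent is 0.4 rather than the 0.5 used in the conventional $S_8$, values of $S_8^\times$ and $S_8$ computed from the same $(σ_8, Ω_m)$ pair differ whenever $Ω_m \neq 0.3$.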
Submitted 5 July, 2024;
originally announced July 2024.
-
Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection
Authors:
YeongHyeon Park,
Sungho Kang,
Myung Jin Kim,
Hyeong Seok Kim,
Juneho Yi
Abstract:
In unsupervised anomaly detection (UAD) research, while state-of-the-art models have reached a saturation point with extensive studies on public benchmark datasets, they adopt large-scale tailor-made neural networks (NN) for detection performance or pursue unified models for various tasks. Towards edge computing, it is necessary to develop a computationally efficient and scalable solution that avoids large-scale complex NNs. Motivated by this, we aim to optimize the UAD performance with minimal changes to NN settings. Thus, we revisit the reconstruction-by-inpainting approach and seek to improve it by analyzing its strengths and weaknesses. The strength of the SOTA methods is a single deterministic masking approach that addresses the challenges of random multiple masking, namely inference latency and output inconsistency. Nevertheless, the failure to provide a mask that completely covers anomalous regions remains a weakness. To mitigate this issue, we propose Feature Attenuation of Defective Representation (FADeR), which employs only two MLP layers that attenuate feature information of anomaly reconstruction during decoding. By leveraging FADeR, features of unseen anomaly patterns are reconstructed into seen normal patterns, reducing false alarms. Experimental results demonstrate that FADeR achieves enhanced performance compared to similar-scale NNs. Furthermore, our approach exhibits scalability in performance enhancement when integrated with other single deterministic masking methods in a plug-and-play manner.
Submitted 5 July, 2024;
originally announced July 2024.
-
LearnerVoice: A Dataset of Non-Native English Learners' Spontaneous Speech
Authors:
Haechan Kim,
Junho Myung,
Seoyoung Kim,
Sungpah Lee,
Dongyeop Kang,
Juho Kim
Abstract:
Prevalent ungrammatical expressions and disfluencies in spontaneous speech from second language (L2) learners pose unique challenges to Automatic Speech Recognition (ASR) systems. However, few datasets are tailored to L2 learner speech. We publicly release LearnerVoice, a dataset consisting of 50.04 hours of audio and transcriptions of L2 learners' spontaneous speech. Our linguistic analysis reveals that transcriptions in our dataset contain L2S (L2 learner's Spontaneous speech) features, consisting of ungrammatical expressions and disfluencies (e.g., filler words, word repetitions, self-repairs, false starts), significantly more than native speech datasets. Fine-tuning whisper-small.en with LearnerVoice achieves a WER of 10.26%, 44.2% lower than vanilla whisper-small.en. Furthermore, our qualitative analysis indicates that 54.2% of errors from the vanilla model on LearnerVoice are attributable to L2S features, with 48.1% of them being reduced in the fine-tuned model.
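For readers unfamiliar with the metric reported here, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The sketch below shows that standard definition together with a relative-reduction helper. The function names and the example sentences are hypothetical, and the abstract does not state the exact scoring or text-normalization pipeline the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Made-up learner-style utterance: one substitution and one deletion.
    print(word_error_rate("i went to the store", "i go to store"))
    # Made-up WER values, only to show how a relative reduction is computed.
    print(relative_reduction(baseline=20.0, improved=10.0))
```

A statement such as "44.2% lower" is most naturally read as a relative reduction of this kind, though the abstract alone does not confirm that reading.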
Submitted 5 July, 2024;
originally announced July 2024.
-
Computer Vision for Clinical Gait Analysis: A Gait Abnormality Video Dataset
Authors:
Rahm Ranjan,
David Ahmedt-Aristizabal,
Mohammad Ali Armin,
Juno Kim
Abstract:
Clinical gait analysis (CGA) using computer vision is an emerging field in artificial intelligence that is held back by a lack of accessible, real-world data and clear task objectives. This paper lays the foundation for current developments in CGA as well as vision-based methods and datasets suitable for gait analysis. We introduce The Gait Abnormality in Video Dataset (GAVD) in response to our review of over 150 current gait-related computer vision datasets, which highlighted the need for a large and accessible gait dataset clinically annotated for CGA. GAVD stands out as the largest video gait dataset, comprising 1874 sequences of normal, abnormal and pathological gaits. Additionally, GAVD includes clinically annotated RGB data sourced from publicly available content on online platforms. It also encompasses over 400 subjects who have undergone clinical grade visual screening to represent a diverse range of abnormal gait patterns, captured in various settings, including hospital clinics and urban uncontrolled outdoor environments. We demonstrate the validity of the dataset and utility of action recognition models for CGA using pretrained models Temporal Segment Networks (TSN) and SlowFast network to achieve video abnormality detection of 94% and 92% respectively when tested on the GAVD dataset. A GitHub repository (https://github.com/Rahmyyy/GAVD) provides convenient URL links and clinically relevant annotations for CGA for over 450 online videos, featuring diverse subjects performing a range of normal, pathological, and abnormal gait patterns.
Submitted 4 July, 2024;
originally announced July 2024.
-
Unveiling the Unexplored Decay Mode of a Light Charged Higgs Boson to an Off-Shell Top Quark and a Bottom Quark
Authors:
Jinheung Kim,
Soojin Lee,
Prasenjit Sanyal,
Jeonghyeon Song,
Daohan Wang
Abstract:
The charged Higgs boson ($H^\pm$) with a mass below the top quark mass remains a viable possibility within the type-I two-Higgs-doublet model under current constraints. While previous LHC searches have primarily focused on the $H^\pm\toτν$ decay mode, the decay channel into an off-shell top quark and a bottom quark, $H^\pm \rightarrow t^*b$, is leading or subleading for $H^\pm$ masses between 130 and 170 GeV. This study investigates the discovery potential of future colliders for this off-shell decay mode through pair-produced charged Higgs bosons decaying via $H^+H^-\rightarrow t^*bτν\rightarrow bbjjτν$. We perform signal-to-background analyses at the HL-LHC and a prospective 100 TeV proton-proton collider, employing cut-flow strategies and the Boosted Decision Tree method. However, due to the softness of the $b$ jets, signal significances fall below detection thresholds at these facilities. Extending our study to a multi-TeV muon collider (MuC), we demonstrate that a 3 TeV MuC achieves high signal significance, surpassing the $5σ$ threshold with an integrated luminosity of 1 ab$^{-1}$, assuming a 10\% background uncertainty. Specifically, for $M_{H^\pm} = 130$, 150, and 170 GeV, the significances are 13.7, 13.5, and 6.06, respectively. In contrast, a 10 TeV MuC requires 10 ab$^{-1}$ to achieve similar results. Our findings highlight the critical role of the MuC in probing the new signal channel $H^\pm\rightarrow t^*b$, offering a promising avenue for future charged Higgs boson searches involving off-shell top quarks.
Submitted 2 July, 2024;
originally announced July 2024.
-
RISC-V R-Extension: Advancing Efficiency with Rented-Pipeline for Edge DNN Processing
Authors:
Won Hyeok Kim,
Hyeong Jin Kim,
Tae Hee Han
Abstract:
The proliferation of edge devices necessitates efficient computational architectures for lightweight tasks, particularly deep neural network (DNN) inference. Traditional NPUs, though effective for such operations, face challenges in power, cost, and area when integrated into lightweight edge devices. The RISC-V architecture, known for its modularity and open-source nature, offers a viable alternative. This paper introduces the RISC-V R-extension, a novel approach to enhancing DNN process efficiency on edge devices. The extension features rented-pipeline stages and architectural pipeline registers (APR), which optimize critical operation execution, thereby reducing latency and memory access frequency. Furthermore, this extension includes new custom instructions to support these architectural improvements. Through comprehensive analysis, this study demonstrates the boost of R-extension in edge device processing, setting the stage for more responsive and intelligent edge applications.
Submitted 2 July, 2024;
originally announced July 2024.
-
Vortex confinement through an unquantized magnetic flux
Authors:
Geunyong Kim,
Jinyoung Yun,
Jinho Yang,
Ilkyu Yang,
Dirk Wulferding,
Roman Movshovich,
Gil Young Cho,
Ki-Seok Kim,
Garam Hahn,
Jeehoon Kim
Abstract:
Geometrically confined superconductors often experience a breakdown in the quantization of magnetic flux owing to the incomplete screening of the supercurrent against the field penetration. In this study, we report that the confinement of a magnetic field occurs regardless of the dimensionality of the system, extending even to 1D linear potential systems. By utilizing a vector-field magnetic force microscope, we successfully create a vortex-antivortex pair connected by a 1D unquantized magnetic flux in ultra-thin superconducting films. Through an investigation of the manipulation and thermal behavior of the vortex pair, we uncover a long-range interaction mediated by the unquantized magnetic flux. These findings suggest a universal phenomenon of unquantized magnetic flux formation, independent of the geometry of the system. Our results present an experimental route for probing the impact of confinement on superconducting properties and order parameters in unconventional superconductors characterized by extremely low dimensionality.
Submitted 1 July, 2024;
originally announced July 2024.
-
Towards a Partial Computation offloading in In-networking Computing-Assisted MEC: A Digital Twin Approach
Authors:
Ibrahim Aliyu,
Awwal Arigi,
Seungmin Oh,
Tai-Won Um,
Jinsul Kim
Abstract:
This paper addresses the problem of minimizing latency with partial computation offloading within Industrial Internet-of-Things (IoT) systems in in-network computing (COIN)-assisted Multiaccess Edge Computing (C-MEC) via ultra-reliable and low latency communications (URLLC) links. We propose a digital twin (DT) scheme for a multiuser scenario, allowing collaborative partial task offloading from user equipment (UE) to COIN-aided nodes or MEC. Specifically, we formulate the problem as joint task offloading decision, ratio and resource allocation. We employ game theory to create a low-complexity distributed offloading scheme in which the task offloading decision problem is modelled as an exact potential game. Double Deep Q-Network (DDQN) is utilized within the game to proactively predict optimal offloading ratio and resource allocation. This approach optimizes resource allocation across the whole system and enhances the robustness of the computing framework, ensuring efficient execution of computation-intensive services. Additionally, it addresses centralized approaches and UE resource contention issues, thus ensuring faster and more reliable communication.
Submitted 8 April, 2024;
originally announced July 2024.
-
Shape Synthesis and 3D Ceramic Printing of Non-canonical MIMO Dielectric Resonator Antennas
Authors:
Binbin Yang,
Jaewoo Kim,
Trupti Bellundagi,
Jacob J. Adams
Abstract:
In this paper, we report a shape synthesis method for multi-mode dielectric resonator antennas (DRA) using characteristic mode theory (CMT) and a binary genetic algorithm (BGA). By including the antenna's characteristic modal responses (resonance frequencies and quality factors) in the cost function, the shape synthesis process is conducted without including excitation feeds. Through the optimization procedure, a non-canonical dielectric body is formed from tetrahedral elements to support the required modal properties. As a demonstration of the proposed design approach, two three-mode MIMO DRAs are synthesized from both a rectangular and a cylindrical volume to operate at 2.45 GHz. The synthesized MIMO DRA's complex shape (based on rectangle) is then fabricated using Nanoparticle jetted zirconia. A combination of probe and slot feeds are employed to excite the desired modes. Due to the orthogonality of the characteristic modes and the careful design of the feeding network, isolation $>20$ dB is achieved between all ports.
Submitted 1 July, 2024;
originally announced July 2024.
-
Revisiting Random Walks for Learning on Graphs
Authors:
Jinwoo Kim,
Olga Zaghen,
Ayhan Suleymanzade,
Youngmin Ryou,
Seunghoon Hong
Abstract:
We revisit a simple idea for machine learning on graphs, where a random walk on a graph produces a machine-readable record, and this record is processed by a deep neural network to directly make vertex-level or graph-level predictions. We refer to these stochastic machines as random walk neural networks, and show that we can design them to be isomorphism invariant while capable of universal approximation of graph functions in probability. A useful finding is that almost any kind of record of random walk guarantees probabilistic invariance as long as the vertices are anonymized. This enables us to record random walks in plain text and adopt a language model to read these text records to solve graph tasks. We further establish a parallelism to message passing neural networks using tools from Markov chain theory, and show that over-smoothing in message passing is alleviated by construction in random walk neural networks, while over-squashing manifests as probabilistic under-reaching. We show that random walk neural networks based on pre-trained language models can solve several hard problems on graphs, such as separating strongly regular graphs where the 3-WL test fails, counting substructures, and transductive classification on arXiv citation network without training. Code is available at https://github.com/jw9730/random-walk.
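The idea of an anonymized random-walk record can be illustrated with a short sketch: sample a walk, then replace each vertex by the order in which it was first visited, so that the record no longer depends on how the vertices happen to be labelled. The helper names and the dash-separated text format below are assumptions made for illustration. The abstract does not specify the record format, and the authors' implementation is in the linked repository.

```python
import random
from typing import Dict, Hashable, List


def random_walk(adj: Dict[Hashable, List[Hashable]], start: Hashable,
                length: int, rng: random.Random) -> List[Hashable]:
    """Uniform random walk taking up to `length` steps from `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def anonymize(walk: List[Hashable]) -> List[int]:
    """Rename vertices 1, 2, 3, ... in order of first appearance."""
    first_seen: Dict[Hashable, int] = {}
    for vertex in walk:
        if vertex not in first_seen:
            first_seen[vertex] = len(first_seen) + 1
    return [first_seen[vertex] for vertex in walk]


def record_text(walk: List[Hashable]) -> str:
    """Plain-text record of the anonymized walk (hypothetical format)."""
    return "-".join(str(i) for i in anonymize(walk))


if __name__ == "__main__":
    triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
    walk = random_walk(triangle, "a", length=6, rng=random.Random(0))
    print(record_text(walk))
    # Two walks that differ only by a relabelling give the same record.
    print(record_text(["a", "b", "a", "c"]) == record_text(["x", "y", "x", "z"]))
```

A text record of this kind is what could then be handed to a language model, since it carries the walk's structure without exposing arbitrary vertex identifiers.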
Submitted 1 July, 2024;
originally announced July 2024.
-
Swish-T : Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance
Authors:
Youngmin Seo,
Jinha Kim,
Unsang Park
Abstract:
We propose the Swish-T family, an enhancement of the existing non-monotonic activation function Swish. Swish-T is defined by adding a Tanh bias to the original Swish function. This modification creates a family of Swish-T variants, each designed to excel in different tasks, showcasing specific advantages depending on the application context. The Tanh bias allows for broader acceptance of negative values during initial training stages, offering a smoother non-monotonic curve than the original Swish. We ultimately propose the Swish-T$_{\textbf{C}}$ function, while Swish-T and Swish-T$_{\textbf{B}}$, byproducts of Swish-T$_{\textbf{C}}$, also demonstrate satisfactory performance. Furthermore, our ablation study shows that using Swish-T$_{\textbf{C}}$ as a non-parametric function can still achieve high performance. The superiority of the Swish-T family has been empirically demonstrated across various models and benchmark datasets, including MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. The code is publicly available at https://github.com/ictseoyoungmin/Swish-T-pytorch.
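A minimal sketch of "Swish plus a Tanh bias" is given below, assuming the form f(x) = x·sigmoid(βx) + α·tanh(x). The abstract says only that a Tanh bias is added to Swish, so this exact parameterization, the default values of `alpha` and `beta`, and the function names are assumptions. The Swish-T$_{\textbf{B}}$ and Swish-T$_{\textbf{C}}$ variants are not reproduced here, and the authors' definitions are in the linked repository.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Standard Swish: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def swish_t(x: float, beta: float = 1.0, alpha: float = 0.1) -> float:
    """Assumed Swish-T form: Swish plus a scaled tanh term (the 'Tanh bias')."""
    return swish(x, beta) + alpha * math.tanh(x)


if __name__ == "__main__":
    # Compare the two functions at a few points. For large negative x,
    # Swish tends to 0 while the assumed Swish-T tends to -alpha.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, swish(x), swish_t(x))
```

Under this assumed form, the tanh term shifts the negative tail from 0 toward -α, which is one way to read the abstract's remark about broader acceptance of negative values.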
Submitted 3 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.