-
Benchmarking Predictive Coding Networks -- Made Simple
Authors:
Luca Pinchetti,
Chang Qi,
Oleh Lokshyn,
Gaspard Olivers,
Cornelius Emde,
Mufeng Tang,
Amine M'Charrak,
Simon Frieder,
Bayar Menzat,
Rafal Bogacz,
Thomas Lukasiewicz,
Tommaso Salvatori
Abstract:
In this work, we tackle the problems of efficiency and scalability for predictive coding networks in machine learning. To do so, we first propose a library called PCX, whose focus lies on performance and simplicity, and provides a user-friendly, deep-learning oriented interface. Second, we use PCX to implement a large set of benchmarks for the community to use for their experiments. As most works propose their own tasks and architectures, do not compare one against each other, and focus on small-scale tasks, a simple and fast open-source library adopted by the whole community would address all of these concerns. Third, we perform extensive benchmarks using multiple algorithms, setting new state-of-the-art results in multiple tasks and datasets, as well as highlighting limitations inherent to PC that should be addressed. Thanks to the efficiency of PCX, we are able to analyze larger architectures than commonly used, providing baselines to galvanize community efforts towards one of the main open problems in the field: scalability. The code for PCX is available at https://github.com/liukidar/pcax.
Submitted 1 July, 2024;
originally announced July 2024.
-
Exploring the Complex Ionization Environment of the Turbulent DM Tau Disk
Authors:
Deryl E. Long,
L. Ilsedore Cleeves,
Fred C. Adams,
Sean Andrews,
Edwin A. Bergin,
Viviana V. Guzmán,
Jane Huang,
A. Meredith Hughes,
Chunhua Qi,
Kamber Schwarz,
Jacob B. Simon,
David Wilner
Abstract:
Ionization drives important chemical and dynamical processes within protoplanetary disks, including the formation of organics and water in the cold midplane and the transportation of material via accretion and magneto-hydrodynamic (MHD) flows. Understanding these ionization-driven processes is crucial for understanding disk evolution and planet formation. We use new and archival ALMA observations of HCO+, H13CO+, and N2H+ to produce the first forward-modeled 2D ionization constraints for the DM Tau protoplanetary disk. We include ionization from multiple sources and explore the disk chemistry under a range of ionizing conditions. Abundances from our 2D chemical models are post-processed using non-LTE radiative transfer, visibility sampling, and imaging, and are compared directly to the observed radial emission profiles. The observations are best fit by a modestly reduced CR ionization rate ($ζ_{CR}$ ~ 10$^{-18}$ s$^{-1}$) and a hard X-ray spectrum (hardness ratio [HR] = 0.3), which we associate with stellar flaring conditions. Our best-fit model under-produces emission in the inner disk, suggesting that there may be an additional mechanism enhancing ionization in DM Tau's inner disk. Overall, our findings highlight the complexity of ionization in protoplanetary disks and the need for high resolution multi-line studies.
Submitted 26 June, 2024;
originally announced June 2024.
-
Near-Field Multiuser Communications based on Sparse Arrays
Authors:
Kangjian Chen,
Chenhao Qi,
Geoffrey Ye Li,
Octavia A. Dobre
Abstract:
This paper considers near-field multiuser communications based on sparse arrays (SAs). First, for the uniform SAs (USAs), we analyze the beam gains of channel steering vectors, which shows that increasing the antenna spacings can effectively improve the spatial resolution of the antenna arrays to enhance the sum rate of multiuser communications. Then, we investigate nonuniform SAs (NSAs) to mitigate the high multiuser interference from the grating lobes of the USAs. To maximize the sum rate of near-field multiuser communications, we optimize the antenna positions of the NSAs, where a successive convex approximation-based antenna position optimization algorithm is proposed. Moreover, we find that the channels of both the USAs and the NSAs show uniform sparsity in the defined surrogate distance-angle (SD-A) domain. Based on the channel sparsity, an on-grid SD-A-domain orthogonal matching pursuit (SDA-OMP) algorithm is developed to estimate multiuser channels. To further improve the resolution of the SDA-OMP, we also design an off-grid SD-A-domain iterative super-resolution channel estimation algorithm. Simulation results demonstrate the superior performance of the proposed methods.
Submitted 13 June, 2024;
originally announced June 2024.
-
Micro-expression recognition based on depth map to point cloud
Authors:
Ren Zhang,
Jianqin Yin,
Chao Qi,
Zehao Wang,
Zhicheng Zhang,
Yonghao Dang
Abstract:
Micro-expressions are nonverbal facial expressions that reveal the covert emotions of individuals, which has drawn widespread attention to the micro-expression recognition task. However, the task is challenging due to the subtle facial motion and brevity in duration. Many 2D image-based methods have been developed in recent years to recognize MEs effectively, but these approaches are restricted by facial texture information and are susceptible to environmental factors, such as lighting. Conversely, depth information can effectively represent motion information related to facial structure changes and is not affected by lighting. Motion information derived from facial structures can describe motion features that pixel textures cannot delineate. We propose a network for micro-expression recognition based on facial depth information, and our experiments demonstrate the crucial role of depth maps in the micro-expression recognition task. Initially, we transform the depth map into a point cloud and obtain the motion information for each point by aligning the initiating frame with the apex frame and performing a differential operation. Subsequently, we adjust all point cloud motion feature input dimensions and use them as inputs for multiple point cloud networks to assess the efficacy of this representation. PointNet++ was chosen as the final network for micro-expression recognition due to its superior performance. Our experiments show that our proposed method significantly outperforms the existing deep learning methods, including the baseline, on the $CAS(ME)^3$ dataset, which includes depth information.
Submitted 12 June, 2024;
originally announced June 2024.
-
Evidence for Non-zero Turbulence in the Protoplanetary disc around IM Lup
Authors:
Kevin Flaherty,
A. Meredith Hughes,
Jacob B. Simon,
Alicia Smith Reina,
Chunhua Qi,
Xue-Ning Bai,
Sean M. Andrews,
David J. Wilner,
Agnes Kospal
Abstract:
The amount of turbulence in protoplanetary discs around young stars is critical for determining the efficiency, timeline, and outcomes of planet formation. It is also difficult to measure. Observations are still limited, but direct measurements of the non-thermal, turbulent gas motion are possible with the Atacama Large Millimeter/submillimeter Array (ALMA). Using CO(2-1)/$^{13}$CO(2-1)/C$^{18}$O(2-1) ALMA observations of the disc around IM Lup at ~0.4" (~60 au) resolution we find evidence of significant turbulence, at the level of $δv_{\rm turb}=(0.18-0.30)$c$_s$. This result is robust against systematic uncertainties (e.g., amplitude flux calibration, midplane gas temperature, disc self-gravity). We find that gravito-turbulence as the source of the gas motion is unlikely based on the lack of an imprint on the rotation curve from a massive disc, while magneto-rotational instabilities and hydrodynamic instabilities are still possible, depending on the unknown magnetic field strength and the cooling timescale in the outer disc.
Submitted 11 June, 2024;
originally announced June 2024.
-
PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement
Authors:
Nan Zhou,
Youhai Jiang,
Jialin Tan,
Chongmin Qi
Abstract:
Low-complexity speech enhancement on mobile phones is crucial in the era of 5G. Thus, focusing on the handheld mobile phone communication scenario, and building on the power level difference (PLD) algorithm and a lightweight U-Net, we propose the PLD-guided lightweight deep network (PLDNet), an extremely lightweight dual-microphone speech enhancement method that integrates the guidance of a signal processing algorithm with a lightweight attention-augmented U-Net. For the guidance information, we employ the PLD algorithm to pre-process the dual-microphone spectrum and feed the output into the subsequent deep neural network, which utilizes a lightweight U-Net with our proposed gated convolution augmented frequency attention (GCAFA) module to extract the desired clean speech. Experimental results demonstrate that our proposed method achieves competitive performance with recent top-performing models while reducing computational cost by over 90%, highlighting the potential for low-complexity speech enhancement on mobile phones.
Submitted 6 June, 2024;
originally announced June 2024.
-
Low CI/CO Abundance Ratio Revealed by HST UV Spectroscopy of CO-rich Debris Disks
Authors:
Aoife Brennan,
Luca Matrà,
Sebastián Marino,
David Wilner,
Chunhua Qi,
A. Meredith Hughes,
Aki Roberge,
Antonio S. Hales,
Seth Redfield
Abstract:
The origin and evolution of CO gas in debris disks has been debated since its initial detection. The gas could have a primordial origin, as a remnant of the protoplanetary disk or a secondary exocometary origin. This paper investigates the origin of gas in two debris disks, HD110058 and HD131488, using HST observations of CI and CO, which play critical roles in the gas evolution. We fitted several electronic transitions of CI and CO rovibronic bands to derive column densities and temperatures for each system, revealing high CO column densities ($\sim$3-4 orders of magnitude higher than $β$ Pictoris), and low CI/CO ratios in both. Using the exogas model, we simulated the radial evolution of the gas in the debris disk assuming a secondary gas origin. We explored a wide range of CO exocometary release rates and $α$ viscosities, which are the key parameters of the model. Additionally, we incorporated photodissociation due to stellar UV to the exogas model and found that it is negligible for typical CO-rich disks and host stars, even at a few au due to the high radial optical depths in the EUV. We find that the current steady-state secondary release model cannot simultaneously reproduce the CO and CI HST-derived column densities, as it predicts larger CI/CO ratios than observed. Our direct UV measurement of low CI/CO ratios agrees with results derived from recent ALMA findings and may point to vertical layering of CI, additional CI removal, CO shielding processes, or different gas origin scenarios.
Submitted 21 May, 2024;
originally announced May 2024.
-
Data quality control system and long-term performance monitor of the LHAASO-KM2A
Authors:
Zhen Cao,
F. Aharonian,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
W. Bian,
A. V. Bukevich,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
H. X. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. Chen
, et al. (263 additional authors not shown)
Abstract:
The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.
Submitted 13 June, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Martian seismic anisotropy underneath Elysium Planitia revealed by direct S wave splitting
Authors:
Jing Shi,
Cunrui Han,
Tao Wang,
Chao Qi,
Han Chen,
Zhihan Yu,
Jiaqi Geng,
Minghan Yang,
Xu Wang,
Ling Chen,
Hejiu Hui
Abstract:
Seismic anisotropy, arising from the crystallographic or lattice-preferred orientation of anisotropic minerals or the shape-preferred orientation of melts or cracks, can establish a critical link between Mars's past evolution and its current state. So far, although seismic anisotropy in Mars has been proposed due to different velocities of vertically and horizontally polarized shear waves in the Martian crust, obtained from crustal converted waves, multiples, and surface waves recorded by the InSight seismometer, the evidence is plausible. Notably, the shear wave splitting, which stands out as a straight indicator of seismic anisotropy, has not been reported using marsquake records. In this study, we employ low-frequency marsquakes detected by the InSight seismometer to reveal shear wave splitting in Mars. We find that the direct S waves of three marsquake recordings (S0173a, S0235b, and S1133c) with high signal-to-noise ratios exhibit the splitting phenomenon. We rule out the possibility of apparent anisotropy through synthetic tests, affirming the presence of seismic anisotropy in Mars. The delay time (about 1.33 s on average) measured from the direct S wave splitting is too large to be solely attributed to the seismic anisotropy in the upper crust (0 - 10 km) beneath the InSight. Thus, seismic anisotropy in the deeper region of Mars is indispensable. Combined with other geophysical evidence near the InSight landing site, the strong seismic anisotropy observed in this study implies the porous crust with aligned cracks being greater than 10 km beneath the InSight and/or the presence of an active mantle plume underneath the Elysium Planitia of Mars.
Submitted 15 May, 2024;
originally announced May 2024.
-
Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment
Authors:
L. T. Yang,
S. K. Liu,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
J. R. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar
, et al. (61 additional authors not shown)
Abstract:
We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China Jinping Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axions with mass up to 100 eV/$c^2$. Within the hadronic model of KSVZ, our results exclude axion mass $>5.3~\rm{eV}/c^2$ at 95\% C.L.
Submitted 12 May, 2024;
originally announced May 2024.
-
Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning
Authors:
Caleb Chuck,
Carl Qi,
Michael J. Munje,
Shuozhe Li,
Max Rudolph,
Chang Shi,
Siddhant Agarwal,
Harshit Sikchi,
Abhinav Peri,
Sarthak Dayal,
Evan Kuo,
Kavan Mehta,
Anthony Wang,
Peter Stone,
Amy Zhang,
Scott Niekum
Abstract:
Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy tasks like reaching, to challenging ones like pushing a block by hitting it with a puck, as well as goal-based and human-interactive tasks, our testbed allows a varied assessment of RL capabilities. The robot air hockey testbed also supports sim-to-real transfer with three domains: two simulators of increasing fidelity and a real robot system. Using a dataset of demonstration data gathered through two teleoperation systems: a virtualized control environment, and human shadowing, we assess the testbed with behavior cloning, offline RL, and RL from scratch.
Submitted 5 May, 2024;
originally announced May 2024.
-
Towards Improving Learning from Demonstration Algorithms via MCMC Methods
Authors:
Carl Qi,
Edward Sun,
Harry Zhang
Abstract:
Behavioral cloning, or more broadly, learning from demonstrations (LfD), is a promising direction for robot policy learning in complex scenarios. Albeit being straightforward to implement and data-efficient, behavioral cloning has its own drawbacks, limiting its efficacy in real robot setups. In this work, we take one step towards improving learning from demonstration algorithms by leveraging implicit energy-based policy models. Results suggest that in selected complex robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used neural network-based explicit models, especially in the cases of approximating potentially discontinuous and multimodal functions.
Submitted 23 May, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
-
MoST: Multi-modality Scene Tokenization for Motion Prediction
Authors:
Norman Mu,
Jingwei Ji,
Zhenpei Yang,
Nate Harada,
Haotian Tang,
Kan Chen,
Charles R. Qi,
Runzhou Ge,
Kratarth Goel,
Zoey Yang,
Scott Ettinger,
Rami Al-Rfou,
Dragomir Anguelov,
Yin Zhou
Abstract:
Many existing motion prediction approaches rely on symbolic perception outputs to generate agent trajectories, such as bounding boxes, road graph information and traffic lights. This symbolic representation is a high-level abstraction of the real world, which may render the motion prediction model vulnerable to perception errors (e.g., failures in detecting open-vocabulary obstacles) while missing salient information from the scene context (e.g., poor road conditions). An alternative paradigm is end-to-end learning from raw sensors. However, this approach suffers from the lack of interpretability and requires significantly more training resources. In this work, we propose tokenizing the visual world into a compact set of scene elements and then leveraging pre-trained image foundation models and LiDAR neural networks to encode all the scene elements in an open-vocabulary manner. The image foundation model enables our scene tokens to encode the general knowledge of the open world while the LiDAR neural network encodes geometry information. Our proposed representation can efficiently encode the multi-frame multi-modality observations with a few hundred tokens and is compatible with most transformer-based architectures. To evaluate our method, we have augmented Waymo Open Motion Dataset with camera embeddings. Experiments over Waymo Open Motion Dataset show that our approach leads to significant performance improvements over the state-of-the-art.
Submitted 30 April, 2024;
originally announced April 2024.
-
First Search for Light Fermionic Dark Matter Absorption on Electrons Using Germanium Detector in CDEX-10 Experiment
Authors:
J. X. Liu,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
J. R. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar
, et al. (61 additional authors not shown)
Abstract:
We present the first results of the search for sub-MeV fermionic dark matter absorbed by electron targets of Germanium using the 205.4~kg$\cdot$day data collected by the CDEX-10 experiment, with the analysis threshold of 160~eVee. No significant dark matter (DM) signals over the background are observed. Results are presented as limits on the cross section of DM--electron interaction. We present new constraints of cross section in the DM range of 0.1--10 keV/$c^2$ for vector and axial-vector interaction. The upper limit on the cross section is set to be $\rm 5.5\times10^{-46}~cm^2$ for vector interaction, and $\rm 1.8\times10^{-46}~cm^2$ for axial-vector interaction at DM mass of 5 keV/$c^2$.
Submitted 15 April, 2024;
originally announced April 2024.
-
Radial and vertical constraints on the icy origin of H$_{2}$CO in the HD 163296 Protoplanetary Disk
Authors:
Claudio Hernández-Vera,
Viviana V. Guzmán,
Elizabeth Artur de la Villarmois,
Karin I. Öberg,
L. Ilsedore Cleeves,
Michiel R. Hogerheijde,
Chunhua Qi,
John Carpenter,
Edith C. Fayolle
Abstract:
H$_2$CO is a small organic molecule widely detected in protoplanetary disks. As a precursor to grain-surface formation of CH$_3$OH, H$_2$CO is considered an important precursor of O-bearing organic molecules that are locked in ices. Still, since gas-phase reactions can also form H$_2$CO, there remains an open question on the channels by which organics form in disks, and how much the grain versus the gas pathways impact the overall organic reservoir. We present spectrally and spatially resolved Atacama Large Millimeter/submillimeter Array observations of several ortho- and para-H$_2$CO transitions toward the bright protoplanetary disk around the Herbig Ae star HD 163296. We derive column density, excitation temperature, and ortho-to-para ratio (OPR) radial profiles for H$_2$CO, as well as disk-averaged values of $N_{\mathrm{T}}\sim4\times 10^{12}$ cm$^{-2}$, $T_{\mathrm{ex}}\sim20$ K, and $\mathrm{OPR}\sim2.7$, respectively. We empirically determine the vertical structure of the emission, finding vertical heights of $z/r\sim0.1$. From the profiles, we find a relatively constant $\mathrm{OPR}\sim2.7$ with radius, but still consistent with $3.0$ among the uncertainties, a secondary increase of $N_{\mathrm{T}}$ in the outer disk, and low $T_{\mathrm{ex}}$ values that decrease with disk radius. Our resulting radial, vertical, and OPR constraints suggest an increased UV penetration beyond the dust millimeter edge, consistent with an icy origin but also with cold gas-phase chemistry. This Herbig disk contrasts previous results for the T Tauri disk, TW Hya, which had a larger contribution from cold gas-phase chemistry. More observations of other sources are needed to disentangle the dominant formation pathway of H$_2$CO in protoplanetary disks.
Submitted 24 May, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Constraints on the Blazar-Boosted Dark Matter from the CDEX-10 Experiment
Authors:
R. Xu,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar
, et al. (59 additional authors not shown)
Abstract:
We report new constraints on light dark matter (DM) boosted by blazars using the 205.4 kg day data from the CDEX-10 experiment located at the China Jinping Underground Laboratory. Two representative blazars, TXS 0506+56 and BL Lacertae are studied. The results derived from TXS 0506+56 exclude DM-nucleon elastic scattering cross sections from $4.6\times 10^{-33}\ \rm cm^2$ to $1\times10^{-26}\ \rm cm^2$ for DM masses between 10 keV and 1 GeV, and the results derived from BL Lacertae exclude DM-nucleon elastic scattering cross sections from $2.4\times 10^{-34}\ \rm cm^2$ to $1\times10^{-26}\ \rm cm^2$ for the same range of DM masses. The constraints correspond to the best sensitivities among solid-state detector experiments in the sub-MeV mass range.
Submitted 29 March, 2024;
originally announced March 2024.
-
Probing Dark Matter Particles from Evaporating Primordial Black Holes via Electron Scattering in the CDEX-10 Experiment
Authors:
Z. H. Zhang,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar
, et al. (59 additional authors not shown)
Abstract:
Dark matter (DM) is a major constituent of the Universe. However, no definite evidence of DM particles (denoted as ``$χ$") has been found in DM direct detection (DD) experiments to date. A novel concept is to detect $χ$ from evaporating primordial black holes (PBHs). We search for $χ$ emitted from PBHs by investigating their interaction with target electrons. The examined PBH masses range from 1$\times$10$^{15}$ to 7$\times$10$^{16}$ g under the current limits of PBH abundance $f_{PBH}$. Using 205.4 kg$\cdot$day data obtained from the CDEX-10 experiment conducted in the China Jinping Underground Laboratory, we exclude the $χ$--electron ($χ$--$e$) elastic-scattering cross section $σ_{χe} \sim 5\times10^{-29}$ cm$^2$ for $χ$ with a mass $m_χ\lesssim$ 0.1 keV from our results. If ($m_χ$, $σ_{χe}$) can be determined in the future, DD experiments are expected to impose strong constraints on $f_{PBH}$ for large $M_{PBH}$s.
Submitted 29 March, 2024;
originally announced March 2024.
-
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts
Authors:
Yue Ma,
Yingqing He,
Hongfa Wang,
Andong Wang,
Chenyang Qi,
Chengfei Cai,
Xiu Li,
Zhifeng Li,
Heung-Yeung Shum,
Wei Liu,
Qifeng Chen
Abstract:
Despite recent advances in image-to-video generation, better controllability and local animation are less explored. Most existing image-to-video methods are not locally aware and tend to move the entire scene. However, human artists may need to control the movement of different objects or regions. Additionally, current I2V methods require users not only to describe the target motion but also to provide redundant detailed descriptions of frame contents. These two issues hinder the practical utilization of current I2V tools. In this paper, we propose a practical framework, named Follow-Your-Click, to achieve image animation with a simple user click (for specifying what to move) and a short motion prompt (for specifying how to move). Technically, we propose the first-frame masking strategy, which significantly improves the video generation quality, and a motion-augmented module equipped with a short motion prompt dataset to improve the short prompt following abilities of our model. To further control the motion speed, we propose flow-based motion magnitude control to control the speed of target movement more precisely. Our framework has simpler yet precise user control and better generation performance than previous methods. Extensive experiments compared with 7 baselines, including both commercial tools and research methods on 8 metrics, suggest the superiority of our approach. Project Page: https://follow-your-click.github.io/
Submitted 13 March, 2024;
originally announced March 2024.
-
Molecular Gas Tracers in Young and Old Protoplanetary Disks
Authors:
Dana E. Anderson,
L. Ilsedore Cleeves,
Geoffrey A. Blake,
Chunhua Qi,
Edwin A. Bergin,
John M. Carpenter,
Kamber R. Schwarz,
Claire Thilenius,
Ke Zhang
Abstract:
Molecular emission is used to investigate both the physical and chemical properties of protoplanetary disks. Therefore, to accurately derive disk properties, we need a thorough understanding of the behavior of the molecular probes we rely on. Here we investigate how the molecular line emission of N$_2$H$^+$, HCO$^+$, HCN, and C$^{18}$O compares to other measured quantities in a set of 20 protoplanetary disks. Overall, we find positive correlations between multiple line fluxes and the disk dust mass and radius. We also generally find strong positive correlations between the line fluxes of different molecular species. However, some disks do show noticeable differences in the relative fluxes of N$_2$H$^+$, HCO$^+$, HCN, and C$^{18}$O. These differences occur even within a single star-forming region. This results in a potentially large range of different disk masses and chemical compositions for systems of similar age and birth environment. While we make preliminary comparisons of molecular fluxes across different star-forming regions, more complete and uniform samples are needed in the future to search for trends with birth environment or age.
Submitted 7 March, 2024;
originally announced March 2024.
-
Active Learning for Graphs with Noisy Structures
Authors:
Hongliang Chi,
Cong Qi,
Suhang Wang,
Yao Ma
Abstract:
Graph Neural Networks (GNNs) have seen significant success in tasks such as node classification, largely contingent upon the availability of sufficient labeled nodes. Yet, the excessive cost of labeling large-scale graphs has led to a focus on active learning on graphs, which aims for effective data selection to maximize downstream model performance. Notably, most existing methods assume reliable graph topology, while real-world scenarios often present noisy graphs. Given this, designing a successful active learning framework for noisy graphs is highly needed but challenging, as selecting data for labeling and obtaining a clean graph are two naturally interdependent tasks: selecting high-quality data requires a clean graph structure, while cleaning a noisy graph structure requires sufficient labeled data. Considering the complexity mentioned above, we propose an active learning framework, GALClean, which has been specifically designed to adopt an iterative approach for conducting both data selection and graph purification simultaneously with the best information learned from the prior iteration. Importantly, we summarize GALClean as an instance of the Expectation-Maximization algorithm, which provides a theoretical understanding of its design and mechanisms. This theory naturally leads to an enhanced version, GALClean+. Extensive experiments have demonstrated the effectiveness and robustness of our proposed method across various types and levels of noisy graphs.
Submitted 3 February, 2024;
originally announced February 2024.
-
JWST-MIRI Spectroscopy of Warm Molecular Emission and Variability in the AS 209 Disk
Authors:
Carlos E. Muñoz-Romero,
Karin I. Öberg,
Andrea Banzatti,
Klaus M. Pontoppidan,
Sean M. Andrews,
David J. Wilner,
Edwin A. Bergin,
Ian Czekala,
Charles J. Law,
Colette Salyk,
Richard Teague,
Chunhua Qi,
Jennifer B. Bergner,
Jane Huang,
Catherine Walsh,
Viviana V. Guzmán,
L. Ilsedore Cleeves,
Yuri Aikawa,
Jaehan Bae,
Alice S. Booth,
Gianni Cataldi,
John D. Ilee,
Romane Le Gal,
Feng Long,
Ryan A. Loomis, et al. (2 additional authors not shown)
Abstract:
We present MIRI MRS observations of the large, multi-gapped protoplanetary disk around the T-Tauri star AS 209. The observations reveal hundreds of water vapor lines from 4.9 to 25.5 $μ$m towards the inner $\sim1$ au in the disk, including the first detection of ro-vibrational water emission in this disk. The spectrum is dominated by hot ($\sim800$ K) water vapor and OH gas, with only marginal detections of CO$_2$, HCN, and a possible colder water vapor component. Using slab models with a detailed treatment of opacities and line overlap, we retrieve the column density, emitting area, and excitation temperature of water vapor and OH, and provide upper limits for the observable mass of other molecules. Compared to MIRI spectra of other T-Tauri disks, the inner disk of AS 209 does not appear to be atypically depleted in CO$_2$ or HCN. Based on \textit{Spitzer IRS} observations, we further find evidence for molecular emission variability over a 10-year baseline. Water, OH, and CO$_2$ line luminosities have decreased by factors of 2-4 in the new MIRI epoch, yet there are minimal continuum emission variations. The origin of this variability is yet to be understood.
Submitted 1 February, 2024;
originally announced February 2024.
-
Triple-Refined Hybrid-Field Beam Training for mmWave Extremely Large-Scale MIMO
Authors:
Kangjian Chen,
Chenhao Qi,
Octavia A. Dobre,
Geoffrey Ye Li
Abstract:
This paper investigates beam training for extremely large-scale multiple-input multiple-output systems. By considering both the near field and far field, a triple-refined hybrid-field beam training scheme is proposed, where high-accuracy estimates of channel parameters are obtained through three steps of progressive beam refinement. First, the hybrid-field beam gain (HFBG)-based first refinement method is developed. Based on the analysis of the HFBG, the first-refinement codebook is designed and the beam training is performed accordingly to narrow down the potential region of the channel path. Then, the maximum likelihood (ML)-based and principle of stationary phase (PSP)-based second refinement methods are developed. By exploiting the measurements of the beam training, the ML is used to estimate the channel parameters. To avoid the high computational complexity of ML, closed-form estimates of the channel parameters are derived according to the PSP. Moreover, the Gaussian approximation (GA)-based third refinement method is developed. The hybrid-field neighboring search is first performed to identify the potential region of the main lobe of the channel steering vector. Afterwards, by applying the GA, a least-squares estimator is developed to obtain the high-accuracy channel parameter estimation. Simulation results verify the effectiveness of the proposed scheme.
Submitted 20 January, 2024;
originally announced January 2024.
-
ACT-GAN: Radio map construction based on generative adversarial networks with ACT blocks
Authors:
Chen Qi,
Yang Jingjing,
Huang Ming,
Zhou Qiang
Abstract:
The radio map, serving as a visual representation of electromagnetic spatial characteristics, plays a pivotal role in the assessment of wireless communication networks and radio monitoring coverage. To address the low accuracy of current radio map construction, this paper presents a novel radio map construction method based on a generative adversarial network (GAN) in which the Aggregated Contextual-Transformation (AOT) block, Convolutional Block Attention Module (CBAM), and Transposed Convolution (T-Conv) block are applied to the generator; we name it ACT-GAN. It significantly improves the reconstruction accuracy and local texture of the radio maps. The performance of ACT-GAN across three different scenarios is demonstrated. Experimental results reveal that in the scenario without sparse discrete observations, the proposed method reduces the root mean square error (RMSE) by 14.6% in comparison to the state-of-the-art models. In the scenario with sparse discrete observations, the RMSE is reduced by 13.2%. Furthermore, the predictive results of the proposed model show a clearer representation of the electromagnetic spatial field distribution. To verify the universality of this model in radio map construction tasks, the scenario of an unknown radio emission source is investigated. The results indicate that the proposed model is robust in radio map construction and accurate in predicting the location of the emission source.
Submitted 17 January, 2024;
originally announced January 2024.
-
SPIRE: Semantic Prompt-Driven Image Restoration
Authors:
Chenyang Qi,
Zhengzhong Tu,
Keren Ye,
Mauricio Delbracio,
Peyman Milanfar,
Qifeng Chen,
Hossein Talebi
Abstract:
Text-driven diffusion models have become increasingly popular for various image editing tasks, including inpainting, stylization, and object replacement. However, it still remains an open research problem to adopt this language-vision paradigm for more fine-level image processing tasks, such as denoising, super-resolution, deblurring, and compression artifact removal. In this paper, we develop SPIRE, a Semantic and restoration Prompt-driven Image Restoration framework that leverages natural language as a user-friendly interface to control the image restoration process. We consider the capacity of prompt information in two dimensions. First, we use content-related prompts to enhance the semantic alignment, effectively alleviating identity ambiguity in the restoration outcomes. Second, our approach is the first framework that supports fine-level instruction through language-based quantitative specification of the restoration strength, without the need for explicit task-specific design. In addition, we introduce a novel fusion mechanism that augments the existing ControlNet architecture by learning to rescale the generative prior, thereby achieving better restoration fidelity. Our extensive experiments demonstrate the superior restoration performance of SPIRE compared to the state of the art, while also offering the flexibility of text-based control over the restoration effects.
Submitted 16 July, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Authors:
Jiwen Yu,
Xiaodong Cun,
Chenyang Qi,
Yong Zhang,
Xintao Wang,
Ying Shan,
Jian Zhang
Abstract:
Large-scale text-to-video (T2V) diffusion models have made great progress in recent years in terms of visual quality, motion, and temporal consistency. However, the generation process is still a black box, where all attributes (e.g., appearance, motion) are learned and generated jointly without precise control ability other than rough text descriptions. Inspired by image animation, which decouples the video into one specific appearance and the corresponding motion, we propose AnimateZero to unveil the pre-trained text-to-video diffusion model, i.e., AnimateDiff, and provide more precise appearance and motion control abilities for it. For appearance control, we borrow intermediate latents and their features from the text-to-image (T2I) generation to ensure that the generated first frame is equal to the given generated image. For temporal control, we replace the global temporal attention of the original T2V model with our proposed positional-corrected window attention to ensure that other frames align well with the first frame. Empowered by the proposed methods, AnimateZero can successfully control the generation process without further training. As a zero-shot image animator for given images, AnimateZero also enables multiple new applications, including interactive video generation and real image animation. The detailed experiments demonstrate the effectiveness of the proposed method in both T2V and related applications.
Submitted 6 December, 2023;
originally announced December 2023.
-
MagicStick: Controllable Video Editing via Control Handle Transformations
Authors:
Yue Ma,
Xiaodong Cun,
Yingqing He,
Chenyang Qi,
Xintao Wang,
Ying Shan,
Xiu Li,
Qifeng Chen
Abstract:
Text-based video editing has recently attracted considerable interest in changing the style or replacing the objects with a similar structure. Beyond this, we demonstrate that properties such as shape, size, location, motion, etc., can also be edited in videos. Our key insight is that the keyframe transformations of a specific internal feature (e.g., edge maps of objects or human pose) can easily propagate to other frames to provide generation guidance. We thus propose MagicStick, a controllable video editing method that edits the video properties by utilizing the transformation on the extracted internal control signals. In detail, to keep the appearance, we inflate both the pretrained image diffusion model and ControlNet to the temporal dimension and train low-rank adaptation (LoRA) layers to fit the specific scenes. Then, in editing, we adopt an inversion-and-editing framework. In contrast to prior approaches, the finetuned ControlNet is introduced in both inversion and generation for attention guidance, with the proposed attention remix between the spatial attention maps of inversion and editing. Although succinct, our method is the first to show the ability of video property editing from the pre-trained text-to-image model. We present experiments on numerous examples within our unified framework. We also compare with shape-aware text-based editing and handcrafted motion video generation, demonstrating superior temporal consistency and editing capability compared with previous works. The code and models will be made publicly available.
Submitted 5 December, 2023;
originally announced December 2023.
-
Unravelling DNS Performance: A Historical Examination of F-ROOT in Southeast Asia
Authors:
Jiajia Zhu,
Chao Qi
Abstract:
The DNS root server system uses Anycast technology to provide resolution through widely distributed root nodes. In recent years, the F-root has seen astonishing growth and now boasts the largest number of nodes among the 13 root servers. Based on RIPE Atlas measurement data, we examined the historical availability and query latency of the F-root within the Southeast Asian region. The collected data illustrates how latency varies with changes in the number of root nodes, shows how the geographic distribution of responding root nodes changes in different periods, and reveals the most recent differences between countries in terms of latency distribution. This study sheds light on the evolving landscape of DNS infrastructure in Southeast Asia.
Submitted 28 November, 2023;
originally announced November 2023.
-
Multiuser Beamforming for Partially-Connected Millimeter Wave Massive MIMO
Authors:
Chenhao Qi,
Jinlin Hu,
Yang Du,
Arumugam Nallanathan
Abstract:
Multiuser beamforming is considered for partially-connected millimeter wave massive MIMO systems. Based on perfect channel state information (CSI), a low-complexity hybrid beamforming scheme that decouples the analog beamformer and the digital beamformer is proposed to maximize the sum-rate. The analog beamformer design is modeled as a phase alignment problem to harvest the array gain. Given the analog beamformer, the digital beamformer is designed by solving a weighted minimum mean squared error problem. Then based on imperfect CSI, an analog-only beamformer design scheme is proposed, where the design problem aims at maximizing the desired signal power on the current user and minimizing the power on the other users to mitigate the multiuser interference. The original problem is then transformed into a series of independent beam nulling subproblems, where an efficient iterative algorithm using the majorization-minimization framework is proposed to solve the subproblems. Simulation results show that, under perfect CSI, the proposed scheme achieves almost the same sum-rate performance as the existing schemes but with lower computational complexity; and under imperfect CSI, the proposed analog-only beamforming design scheme can effectively mitigate the multiuser interference.
Submitted 25 November, 2023;
originally announced November 2023.
-
Beam Training and Tracking for Extremely Large-Scale MIMO Communications
Authors:
Kangjian Chen,
Chenhao Qi,
Cheng-Xiang Wang,
Geoffrey Ye Li
Abstract:
In this paper, beam training and beam tracking are investigated for extremely large-scale multiple-input-multiple-output communication systems with partially-connected hybrid combining structures. Firstly, we propose a two-stage hybrid-field beam training scheme for both the near field and the far field. In the first stage, each subarray independently uses multiple far-field channel steering vectors to approximate near-field ones for analog combining. To find the codeword best fitting for the channel, digital combiners in the second stage are designed to combine the outputs of the analog combiners from the first stage. Then, based on the principle of stationary phase and the time-frequency duality, the expressions of subarray signals after analog combining are analytically derived and a beam refinement based on phase shifts of subarrays (BRPSS) scheme with closed-form solutions is proposed for high-resolution channel parameter estimation. Moreover, a low-complexity near-field beam tracking scheme is developed, where the kinematic model is adopted to characterize the channel variations and the extended Kalman filter is exploited for beam tracking. Simulation results verify the effectiveness of the proposed schemes.
Submitted 25 November, 2023;
originally announced November 2023.
-
Simultaneous Beam Training and Target Sensing in ISAC Systems with RIS
Authors:
Kangjian Chen,
Chenhao Qi,
Octavia A. Dobre,
Geoffrey Ye Li
Abstract:
This paper investigates an integrated sensing and communication (ISAC) system with reconfigurable intelligent surface (RIS). Our simultaneous beam training and target sensing (SBTTS) scheme enables the base station to perform beam training with the user terminals (UTs) and the RIS, and simultaneously to sense the targets. Based on our findings, the energy of the echoes from the RIS is accumulated in the angle-delay domain while that from the targets is accumulated in the Doppler-delay domain. The SBTTS scheme can distinguish the RIS from the targets with the mixed echoes from the RIS and the targets. Then we propose a positioning and array orientation estimation (PAOE) scheme for both the line-of-sight channels and the non-line-of-sight channels based on the beam training results of SBTTS by developing a low-complexity two-dimensional fast search algorithm. Based on the SBTTS and PAOE schemes, we further compute the angle-of-arrival and angle-of-departure for the channels between the RIS and the UTs by exploiting the geometry relationship to accomplish the beam alignment of the ISAC system. Simulation results verify the effectiveness of the proposed schemes.
Submitted 25 November, 2023;
originally announced November 2023.
-
Key Issues in Wireless Transmission for NTN-Assisted Internet of Things
Authors:
Chenhao Qi,
Jing Wang,
Leyi Lyu,
Lei Tan,
Jinming Zhang,
Geoffrey Ye Li
Abstract:
Non-terrestrial networks (NTNs) have become appealing solutions for seamless coverage in next-generation wireless transmission, where a large number of diversely distributed Internet of Things (IoT) devices can be efficiently served. The explosively growing number of IoT devices brings a new challenge for massive connection. The long-distance wireless signal propagation in NTNs leads to severe path loss and large latency, so the accurate acquisition of channel state information (CSI) is another challenge, especially for fast-moving non-terrestrial base stations (NTBSs). Moreover, the scarcity of on-board resources of NTBSs is also a challenge for resource allocation. To this end, we investigate three key issues and comprehensively present the existing schemes and emerging solutions for each of them. The first issue is to enable the massive connection by designing random access to establish the wireless link and multiple access to transmit data streams. The second issue is to accurately acquire CSI in various channel conditions by channel estimation and beam training, where orthogonal time frequency space modulation and dynamic codebooks are in focus. The third issue is to efficiently allocate the wireless resources, including power allocation, spectrum sharing, beam hopping, and beamforming. At the end of this article, some future research topics are identified.
Submitted 25 November, 2023;
originally announced November 2023.
-
Fusion of the Power from Citations: Enhance your Influence by Integrating Information from References
Authors:
Cong Qi,
Qin Liu,
Kan Liu
Abstract:
Influence prediction plays a crucial role in the academic community. The extent of scholars' influence determines whether their work will be accepted by others. Most existing research focuses on predicting one paper's citation count after a period or identifying the most influential papers among massive numbers of candidates, without concentrating on an individual paper's negative or positive impact on its authors. Thus, this study aims to formulate the prediction problem to identify whether one paper can increase scholars' influence or not, which can provide feedback to the authors before they publish their papers. First, we present the self-adapted ACC (Average Annual Citation Counts) metric to measure authors' impact yearly based on their annual published papers, paper citation counts, and contributions in each paper. Then, we propose the RD-GAT (Reference-Depth Graph Attention Network) model to integrate heterogeneous graph information from different depths of references by assigning attention coefficients to them. Experiments on the AMiner dataset demonstrate that the proposed ACC metric can represent the authors' influence effectively, and that the RD-GAT model is more efficient on the academic citation network and more robust against overfitting than the baseline models. By applying the framework in this work, scholars can identify whether their papers can improve their influence in the future.
Submitted 25 June, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
An Investigation of LLMs' Inefficacy in Understanding Converse Relations
Authors:
Chengwen Qi,
Bowen Li,
Binyuan Hui,
Bailin Wang,
Jinyang Li,
Jinwang Wu,
Yuanjun Laili
Abstract:
Large Language Models (LLMs) have achieved remarkable success in many formal-language-oriented tasks, such as structural data-to-text and semantic parsing. However, current benchmarks mostly follow the data distribution of the pre-training data of LLMs. Therefore, a natural question arises: do LLMs really understand the structured semantics of formal languages? In this paper, we investigate this problem on a special case, converse binary relations. We introduce a new benchmark, ConvRe, focusing on converse relations, which contains 17 relations and 1240 triples extracted from popular knowledge graph completion datasets. ConvRe features two tasks, Re2Text and Text2Re, which are formulated as multi-choice question answering to evaluate LLMs' ability to determine the matching between relations and associated text. For the evaluation protocol, apart from different prompting methods, we further introduce variants to the test text and few-shot example text. We conduct experiments on three popular LLM families and have observed various scaling trends. The results suggest that LLMs often resort to shortcut learning and still face challenges on our proposed benchmark.
Submitted 13 November, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Learning Generalizable Tool-use Skills through Trajectory Generation
Authors:
Carl Qi,
Yilin Wu,
Lifan Yu,
Haoyue Liu,
Bowen Jiang,
Xingyu Lin,
David Held
Abstract:
Autonomous systems that efficiently utilize tools can assist humans in completing many common tasks such as cooking and cleaning. However, current systems fall short of matching human-level intelligence in terms of adapting to novel tools. Prior works based on affordance often make strong assumptions about the environments and cannot scale to more complex, contact-rich tasks. In this work, we tackle this challenge and explore how agents can learn to use previously unseen tools to manipulate deformable objects. We propose to learn a generative model of the tool-use trajectories as a sequence of tool point clouds, which generalizes to different tool shapes. Given any novel tool, we first generate a tool-use trajectory and then optimize the sequence of tool poses to align with the generated trajectory. We train a single model on four different challenging deformable object manipulation tasks, using demonstration data from only one tool per task. The model generalizes to various novel tools, significantly outperforming baselines. We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to that of humans. Additional materials can be found on our project website: https://sites.google.com/view/toolgen.
Submitted 23 April, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Experimental Limits on Solar Reflected Dark Matter with a New Approach on Accelerated-Dark-Matter-Electron Analysis in Semiconductors
Authors:
Z. Y. Zhang,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar,
et al. (59 additional authors not shown)
Abstract:
Recently, a dark matter-electron (DM-electron) paradigm has drawn much attention. Models beyond the standard halo model describing DM accelerated by high energy celestial bodies are under intense examination as well. In this Letter, a velocity components analysis (VCA) method dedicated to swift analysis of accelerated DM-electron interactions via semiconductor detectors is proposed and the first HPGe detector-based accelerated DM-electron analysis is realized. Utilizing the method, the first germanium based constraint on sub-GeV solar reflected DM-electron interaction is presented with the 205.4 kg$\cdot$day dataset from the CDEX-10 experiment. In the heavy mediator scenario, our result excels in the mass range of 5$-$15 keV/$c^2$, achieving an improvement of 3 orders of magnitude compared with previous semiconductor experiments. In the light mediator scenario, the strongest laboratory constraint for DM lighter than 0.1 MeV/$c^2$ is presented. The result proves the feasibility and demonstrates the vast potential of the VCA technique in future accelerated DM-electron analyses with semiconductor detectors.
Submitted 24 April, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving
Authors:
Mahyar Najibi,
Jingwei Ji,
Yin Zhou,
Charles R. Qi,
Xinchen Yan,
Scott Ettinger,
Dragomir Anguelov
Abstract:
Closed-set 3D perception models trained on only a pre-defined set of object categories can be inadequate for safety critical applications such as autonomous driving where new object types can be encountered after deployment. In this paper, we present a multi-modal auto labeling pipeline capable of generating amodal 3D bounding boxes and tracklets for training models on open-set categories without 3D human labels. Our pipeline exploits motion cues inherent in point cloud sequences in combination with the freely available 2D image-text pairs to identify and track all traffic participants. Compared to the recent studies in this domain, which can only provide class-agnostic auto labels limited to moving objects, our method can handle both static and moving objects in the unsupervised manner and is able to output open-vocabulary semantic labels thanks to the proposed vision-language knowledge distillation. Experiments on the Waymo Open Dataset show that our approach outperforms the prior work by significant margins on various unsupervised 3D perception tasks.
Submitted 25 September, 2023;
originally announced September 2023.
-
Evaluation Kidney Layer Segmentation on Whole Slide Imaging using Convolutional Neural Networks and Transformers
Authors:
Muhao Liu,
Chenyang Qi,
Shunxing Bao,
Quan Liu,
Ruining Deng,
Yu Wang,
Shilin Zhao,
Haichun Yang,
Yuankai Huo
Abstract:
The segmentation of kidney layer structures, including cortex, outer stripe, inner stripe, and inner medulla within human kidney whole slide images (WSI) plays an essential role in automated image analysis in renal pathology. However, the current manual segmentation process proves labor-intensive and infeasible for handling the extensive digital pathology images encountered at a large scale. In response, the realm of digital renal pathology has seen the emergence of deep learning-based methodologies. However, very few, if any, deep learning based approaches have been applied to kidney layer structure segmentation. Addressing this gap, this paper assesses the feasibility of performing deep learning based approaches on kidney layer structure segmentation. This study employs the representative convolutional neural network (CNN) and Transformer segmentation approaches, including Swin-Unet, Medical-Transformer, TransUNet, U-Net, PSPNet, and DeepLabv3+. We quantitatively evaluated six prevalent deep learning models on renal cortex layer segmentation using mouse kidney WSIs. The empirical results stemming from our approach exhibit compelling advancements, as evidenced by a decent Mean Intersection over Union (mIoU) index. The results demonstrate that Transformer models generally outperform CNN-based models. By enabling a quantitative evaluation of renal cortical structures, deep learning approaches are promising to empower medical professionals to make more informed kidney layer segmentation.
Submitted 5 September, 2023;
originally announced September 2023.
-
Projected WIMP sensitivity of the CDEX-50 dark matter experiment
Authors:
X. P. Geng,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
H. Gong,
Q. J. Guo,
T. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
L. Jiang,
S. Karmakar,
H. B. Li,
et al. (59 additional authors not shown)
Abstract:
CDEX-50 is a next-generation project of the China Dark Matter Experiment (CDEX) that aims to search for dark matter using a 50-kg germanium detector array. This paper comprises a thorough summary of the CDEX-50 dark matter experiment, including an investigation of potential background sources and the development of a background model. Based on the baseline model, the projected sensitivity of weakly interacting massive particle (WIMP) is also presented. The expected background level within the energy region of interest, set to 2--2.5 keVee, is $\sim$0.01 counts keVee$^{-1}$ kg$^{-1}$ day$^{-1}$. At 90\% confidence level, the expected sensitivity to spin-independent WIMP-nucleon couplings is estimated to reach a cross-section of 5.1 $\times$ 10$^{-45}$ cm$^{2}$ for a WIMP mass of 5 GeV/c$^{2}$ with an exposure objective of 150 kg$\cdot$year and an analysis threshold of 160 eVee. This science goal will correspond to the most sensitive results for WIMPs with a mass of 2.2--8 GeV/c$^{2}$.
Submitted 4 July, 2024; v1 submitted 4 September, 2023;
originally announced September 2023.
-
Learning Multiple Gaits within Latent Space for Quadruped Robots
Authors:
Jinze Wu,
Yufei Xue,
Chenkun Qi
Abstract:
Learning multiple gaits is non-trivial for legged robots, especially when encountering different terrains and velocity commands. In this work, we present an end-to-end training framework for learning multiple gaits for quadruped robots, tailored to the needs of robust locomotion, agile locomotion, and user's commands. A latent space is constructed concurrently by a gait encoder and a gait generator, which helps the agent to reuse multiple gait skills to achieve adaptive gait behaviors. To learn natural behaviors for multiple gaits, we design gait-dependent rewards that are constructed explicitly from gait parameters and implicitly from conditional adversarial motion priors (CAMP). We demonstrate such multiple gaits control on a quadruped robot Go1 with only proprioceptive sensors.
Submitted 6 August, 2023;
originally announced August 2023.
-
MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences
Authors:
Yingwei Li,
Charles R. Qi,
Yin Zhou,
Chenxi Liu,
Dragomir Anguelov
Abstract:
Occluded and long-range objects are ubiquitous and challenging for 3D object detection. Point cloud sequence data provide unique opportunities to improve such cases, as an occluded or distant object can be observed from different viewpoints or gain better visibility over time. However, the efficiency and effectiveness in encoding long-term sequence data can still be improved. In this work, we propose MoDAR, using motion forecasting outputs as a type of virtual modality, to augment LiDAR point clouds. The MoDAR modality propagates object information from temporal contexts to a target frame, represented as a set of virtual points, one for each object from a waypoint on a forecasted trajectory. A fused point cloud of both raw sensor points and the virtual points can then be fed to any off-the-shelf point-cloud based 3D object detector. Evaluated on the Waymo Open Dataset, our method significantly improves prior art detectors by using motion forecasting from extra-long sequences (e.g. 18 seconds), achieving a new state of the art, while not adding much computation overhead.
Submitted 5 June, 2023;
originally announced June 2023.
-
Inserting Anybody in Diffusion Models via Celeb Basis
Authors:
Ge Yuan,
Xiaodong Cun,
Yong Zhang,
Maomao Li,
Chenyang Qi,
Xintao Wang,
Ying Shan,
Huicheng Zheng
Abstract:
Exquisite demand exists for customizing the pretrained large text-to-image model, $\textit{e.g.}$, Stable Diffusion, to generate innovative concepts, such as the users themselves. However, the newly-added concept from previous customization methods often shows weaker combination abilities than the original ones even given several images during training. We thus propose a new personalization method that allows for the seamless integration of a unique individual into the pre-trained diffusion model using just $\textbf{one facial photograph}$ and only $\textbf{1024 learnable parameters}$ under $\textbf{3 minutes}$. So as we can effortlessly generate stunning images of this person in any pose or position, interacting with anyone and doing anything imaginable from text prompts. To achieve this, we first analyze and build a well-defined celeb basis from the embedding space of the pre-trained large text encoder. Then, given one facial photo as the target identity, we generate its own embedding by optimizing the weight of this basis and locking all other parameters. Empowered by the proposed celeb basis, the new identity in our customized model showcases a better concept combination ability than previous personalization methods. Besides, our model can also learn several new identities at once and interact with each other where the previous customization model fails to. The code will be released.
Submitted 1 June, 2023;
originally announced June 2023.
-
Gas Sources from the Coma and Nucleus of Comet 46P/Wirtanen Observed Using ALMA
Authors:
M. A. Cordiner,
N. X. Roth,
S. N. Milam,
G. Villanueva,
D. Bockelee-Morvan,
A. J. Remijan,
S. B. Charnley,
N. Biver,
D. C. Lis,
C. Qi,
B. Bonev,
J. Crovisier,
J. Boissier
Abstract:
Gas-phase molecules in cometary atmospheres (comae) originate primarily from (1) outgassing by the nucleus, (2) sublimation of icy grains in the near-nucleus coma, and (3) coma (photo-)chemical processes. However, the majority of cometary gases observed at radio wavelengths have yet to be mapped, so their production/release mechanisms remain uncertain. Here we present observations of six molecular species towards comet 46P/Wirtanen, obtained using the Atacama Large Millimeter/submillimeter Array (ALMA) during the comet's unusually close (~0.1 au) approach to Earth in December 2018. Interferometric maps of HCN, CH3OH, CH3CN, H2CO, CS and HNC were obtained at an unprecedented sky-projected spatial resolution of up to 25 km, enabling the nucleus and coma sources of these molecules to be accurately quantified. The HCN, CH3OH and CH3CN spatial distributions are consistent with production by direct outgassing from (or very near to) the nucleus, with a significant proportion of the observed CH3OH originating from sublimation of icy grains in the near-nucleus coma (at a scale-length $L_p=36\pm7$ km). On the other hand, H2CO, CS and HNC originate primarily from distributed coma sources (with $L_p$ values in the range 550-16,000 km), the identities of which remain to be established. The HCN, CH3OH and HNC abundances in 46P are consistent with the average values previously observed in comets, whereas the H2CO, CH3CN and CS abundances are relatively low.
Submitted 19 June, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Searching for $^{76}$Ge neutrinoless double beta decay with the CDEX-1B experiment
Authors:
B. T. Zhang,
L. T. Yang,
Q. Yue,
K. J. Kang,
Y. J. Li,
H. P. An,
Greeshma C.,
J. P. Chang,
Y. H. Chen,
J. P. Cheng,
W. H. Dai,
Z. Deng,
C. H. Fang,
X. P. Geng,
H. Gong,
Q. J. Guo,
X. Y. Guo,
L. He,
S. M. He,
J. W. Hu,
H. X. Huang,
T. C. Huang,
H. T. Jia,
X. Jiang,
S. Karmakar,
et al. (59 additional authors not shown)
Abstract:
We operated a p-type point contact high purity germanium (PPCGe) detector (CDEX-1B, 1.008 kg) in the China Jinping Underground Laboratory (CJPL) for 500.3 days to search for neutrinoless double beta ($0νββ$) decay of $^{76}$Ge. A total of 504.3 kg $\cdot$ day effective exposure data was accumulated. The anti-coincidence and the multi/single-site event (MSE/SSE) discrimination methods were used to suppress the background in the energy region of interest (ROI, $1989-2089$ keV for this work) with a factor of 23. A background level of 0.33 counts/(keV $\cdot$ kg $\cdot$ yr) was achieved. The lower limit on the half life of $^{76}$Ge $0νββ$ decay was constrained as $T_{1/2}^{0ν}\ > \ {2.2}\times 10^{23}\ \rm yr\ (90\% \ C.L.)$, corresponding to the upper limits on the effective Majorana neutrino mass: $\langle m_{ββ}\rangle < 2.3-5.2\ \mathrm{eV}$.
Submitted 8 May, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting
Authors:
Kan Chen,
Runzhou Ge,
Hang Qiu,
Rami Al-Rfou,
Charles R. Qi,
Xuanyu Zhou,
Zoey Yang,
Scott Ettinger,
Pei Sun,
Zhaoqi Leng,
Mustafa Baniodeh,
Ivan Bogun,
Weiyue Wang,
Mingxing Tan,
Dragomir Anguelov
Abstract:
Widely adopted motion forecasting datasets substitute the observed sensory inputs with higher-level abstractions such as 3D boxes and polylines. These sparse shapes are inferred through annotating the original scenes with perception systems' predictions. Such intermediate representations tie the quality of the motion forecasting models to the performance of computer vision models. Moreover, the human-designed explicit interfaces between perception and motion forecasting typically pass only a subset of the semantic information present in the original sensory input. To study the effect of these modular approaches, design new paradigms that mitigate these limitations, and accelerate the development of end-to-end motion forecasting models, we augment the Waymo Open Motion Dataset (WOMD) with large-scale, high-quality, diverse LiDAR data for the motion forecasting task.
The new augmented dataset WOMD-LiDAR consists of over 100,000 scenes that each spans 20 seconds, consisting of well-synchronized and calibrated high quality LiDAR point clouds captured across a range of urban and suburban geographies (https://waymo.com/open/data/motion/). Compared to Waymo Open Dataset (WOD), WOMD-LiDAR dataset contains 100x more scenes. Furthermore, we integrate the LiDAR data into the motion forecasting model training and provide a strong baseline. Experiments show that the LiDAR data brings improvement in the motion forecasting task. We hope that WOMD-LiDAR will provide new opportunities for boosting end-to-end motion forecasting models.
Submitted 18 February, 2024; v1 submitted 7 April, 2023;
originally announced April 2023.
-
GINA-3D: Learning to Generate Implicit Neural Assets in the Wild
Authors:
Bokui Shen,
Xinchen Yan,
Charles R. Qi,
Mahyar Najibi,
Boyang Deng,
Leonidas Guibas,
Yin Zhou,
Dragomir Anguelov
Abstract:
Modeling the 3D world from sensor data for simulation is a scalable way of developing testing and validation environments for robotic learning problems such as autonomous driving. However, manually creating or re-creating real-world-like environments is difficult, expensive, and not scalable. Recent generative model techniques have shown promising progress to address such challenges by learning 3D assets using only plentiful 2D images -- but still suffer limitations as they leverage either human-curated image datasets or renderings from manually-created synthetic 3D environments. In this paper, we introduce GINA-3D, a generative model that uses real-world driving data from camera and LiDAR sensors to create realistic 3D implicit neural assets of diverse vehicles and pedestrians. Compared to the existing image datasets, the real-world driving setting poses new challenges due to occlusions, lighting-variations and long-tail distributions. GINA-3D tackles these challenges by decoupling representation learning and generative modeling into two stages with a learned tri-plane latent structure, inspired by recent advances in generative modeling of images. To evaluate our approach, we construct a large-scale object-centric dataset containing over 1.2M images of vehicles and pedestrians from the Waymo Open Dataset, and a new set of 80K images of long-tail instances such as construction equipment, garbage trucks, and cable cars. We compare our model with existing approaches and demonstrate that it achieves state-of-the-art performance in quality and diversity for both generated images and geometries.
Submitted 28 August, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
An infinite-times renewal equation
Authors:
Xu'An Dou,
Benoît Perthame,
Chenjiayue Qi,
Delphine Salort,
Zhennan Zhou
Abstract:
In neuroscience, the time elapsed since the last discharge has been used to predict the probability of the next discharge. Such predictions can be improved taking into account the last two discharge times, and possibly more. Such multi-time processes arise in many other areas and there is no universal limitation on the number of times to be used. This observation leads us to study the infinite-times renewal equation as a simple model to understand the meaning and properties of such partial differential equations depending on an infinite number of variables. We define two notions of solutions, prove existence and uniqueness of solutions, possibly measures. We also prove the long time convergence, with exponential rate, to the steady state in different, strong or weak, topologies depending on assumptions on the coefficients.
Submitted 4 April, 2023;
originally announced April 2023.
-
Real-time 6K Image Rescaling with Rate-distortion Optimization
Authors:
Chenyang Qi,
Xin Yang,
Ka Leong Cheng,
Ying-Cong Chen,
Qifeng Chen
Abstract:
Contemporary image rescaling aims at embedding a high-resolution (HR) image into a low-resolution (LR) thumbnail image that contains embedded information for HR image reconstruction. Unlike traditional image super-resolution, this enables high-fidelity HR image restoration faithful to the original one, given the embedded information in the LR thumbnail. However, state-of-the-art image rescaling methods do not optimize the LR image file size for efficient sharing and fall short of real-time performance for ultra-high-resolution (e.g., 6K) image reconstruction. To address these two challenges, we propose a novel framework (HyperThumbnail) for real-time 6K rate-distortion-aware image rescaling. Our framework first embeds an HR image into a JPEG LR thumbnail by an encoder with our proposed quantization prediction module, which minimizes the file size of the embedding LR JPEG thumbnail while maximizing HR reconstruction quality. Then, an efficient frequency-aware decoder reconstructs a high-fidelity HR image from the LR one in real time. Extensive experiments demonstrate that our framework outperforms previous image rescaling baselines in rate-distortion performance and can perform 6K image reconstruction in real time.
Submitted 19 May, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Boundary-to-Solution Mapping for Groundwater Flows in a Toth Basin
Authors:
Jingwei Sun,
Jun Li,
Yonghong Hao,
Cuiting Qi,
Chunmei Ma,
Huazhi Sun,
Negash Begashaw,
Gurcan Comet,
Yi Sun,
Qi Wang
Abstract:
In this paper, the authors propose a new approach to solving the groundwater flow equation in the Toth basin of arbitrary top and bottom topographies using deep learning. Instead of using traditional numerical solvers, they use a DeepONet to produce the boundary-to-solution mapping. This mapping takes the geometry of the physical domain along with the boundary conditions as inputs to output the steady state solution of the groundwater flow equation. To implement the DeepONet, the authors approximate the top and bottom boundaries using truncated Fourier series or piecewise linear representations. They present two different implementations of the DeepONet: one where the Toth basin is embedded in a rectangular computational domain, and another where the Toth basin with arbitrary top and bottom boundaries is mapped into a rectangular computational domain via a nonlinear transformation. They implement the DeepONet with respect to the Dirichlet and Robin boundary condition at the top and the Neumann boundary condition at the impervious bottom boundary, respectively. Using this deep-learning enabled tool, the authors investigate the impact of surface topography on the flow pattern by both the top surface and the bottom impervious boundary with arbitrary geometries. They discover that the average slope of the top surface promotes long-distance transport, while the local curvature controls localized circulations. Additionally, they find that the slope of the bottom impervious boundary can seriously impact the long-distance transport of groundwater flows. Overall, this paper presents a new and innovative approach to solving the groundwater flow equation using deep learning, which allows for the investigation of the impact of surface topography on groundwater flow patterns.
Submitted 27 March, 2023;
originally announced March 2023.
-
VPU-EM: An Event-based Modeling Framework to Evaluate NPU Performance and Power Efficiency at Scale
Authors:
Charles Qi,
Yi Wang,
Hui Wang,
Yang Lu,
Shiva Shankar Subramanian,
Finola Cahill,
Conall Tuohy,
Victor Li,
Xu Qian,
Darren Crews,
Ling Wang,
Shivaji Roy,
Andrea Deidda,
Martin Power,
Niall Hanrahan,
Rick Richmond,
Umer Cheema,
Arnab Raha,
Alessandro Palla,
Gary Baugh,
Deepak Mathaikutty
Abstract:
State-of-the-art NPUs are typically architected as a self-contained sub-system with multiple heterogeneous hardware computing modules, and a dataflow-driven programming model. The industry lacks well-established methodologies and tools to evaluate and compare the performance of NPUs from different architectures. We present an event-based performance modeling framework, VPU-EM, targeting scalable performance evaluation of modern NPUs across diversified AI workloads. The framework adopts a high-level event-based system-simulation methodology to abstract away design details for speed, while maintaining hardware pipelining, concurrency and interaction with software task scheduling. It is natively developed in Python and built to interface directly with AI frameworks such as TensorFlow, PyTorch, ONNX and OpenVINO, linking various in-house NPU graph compilers to achieve optimized full model performance. Furthermore, VPU-EM also provides the capability to model power characteristics of the NPU in Power-EM mode to enable joint performance/power analysis. Using VPU-EM, we conduct performance/power analysis of models from representative neural network architectures. We demonstrate that even though this framework is developed for Intel VPU, an Intel in-house NPU IP technology, the methodology can be generalized for analysis of modern NPUs.
Submitted 17 March, 2023;
originally announced March 2023.
-
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Authors:
Chenyang Qi,
Xiaodong Cun,
Yong Zhang,
Chenyang Lei,
Xintao Wang,
Ying Shan,
Qifeng Chen
Abstract:
The diffusion-based generative models have achieved remarkable success in text-based image generation. However, since it contains enormous randomness in generation progress, it is still challenging to apply such models for real-world visual content editing, especially in videos. In this paper, we propose FateZero, a zero-shot text-based editing method on real-world videos without per-prompt training or use-specific mask. To edit videos consistently, we propose several techniques based on the pre-trained models. Firstly, in contrast to the straightforward DDIM inversion technique, our approach captures intermediate attention maps during inversion, which effectively retain both structural and motion information. These maps are directly fused in the editing process rather than generated during denoising. To further minimize semantic leakage of the source video, we then fuse self-attentions with a blending mask obtained by cross-attention features from the source prompt. Furthermore, we have implemented a reform of the self-attention mechanism in denoising UNet by introducing spatial-temporal attention to ensure frame consistency. Yet succinct, our method is the first one to show the ability of zero-shot text-driven video style and local attribute editing from the trained text-to-image model. We also have a better zero-shot shape-aware editing ability based on the text-to-video model. Extensive experiments demonstrate our superior temporal consistency and editing capability than previous works.
Submitted 11 October, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.