subscribe to arXiv mailings

arXiv:2407.11937 [pdf, other]

Generalized Difference-in-Differences

Authors: Yiqing Xu, Anqi Zhao, Peng Ding

Abstract: In many social science applications, researchers use the difference-in-differences (DID) estimator to establish causal relationships, exploiting cross-sectional variation in a baseline factor and temporal variation in exposure to an event that presumably may affect all units. This approach, often referred to as generalized DID (GDID), differs from canonical DID in that it lacks a "clean control gr… ▽ More In many social science applications, researchers use the difference-in-differences (DID) estimator to establish causal relationships, exploiting cross-sectional variation in a baseline factor and temporal variation in exposure to an event that presumably may affect all units. This approach, often referred to as generalized DID (GDID), differs from canonical DID in that it lacks a "clean control group" unexposed to the event after the event occurs. In this paper, we clarify GDID as a research design in terms of its data structure, feasible estimands, and identifying assumptions that allow the DID estimator to recover these estimands. We frame GDID as a factorial design with two factors: the baseline factor, denoted by $G$, and the exposure level to the event, denoted by $Z$, and define effect modification and causal interaction as the associative and causal effects of $G$ on the effect of $Z$, respectively. We show that under the canonical no anticipation and parallel trends assumptions, the DID estimator identifies only the effect modification of $G$ in GDID, and propose an additional generalized parallel trends assumption to identify causal interaction. Moreover, we show that the canonical DID research design can be framed as a special case of the GDID research design with an additional exclusion restriction assumption, thereby reconciling the two approaches. We illustrate these findings with empirical examples from economics and political science, and provide recommendations for improving practice and interpretation under GDID. △ Less

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.10339 [pdf, other]

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electr… ▽ More The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage. △ Less

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2407.09371 [pdf, other]

Computationally Efficient Estimation of Large Probit Models

Authors: Patrick Ding, Guido Imbens, Zhaonan Qu, Yinyu Ye

Abstract: Probit models are useful for modeling correlated discrete responses in many disciplines, including discrete choice data in economics. However, the Gaussian latent variable feature of probit models coupled with identification constraints pose significant computational challenges for its estimation and inference, especially when the dimension of the discrete response variable is large. In this paper… ▽ More Probit models are useful for modeling correlated discrete responses in many disciplines, including discrete choice data in economics. However, the Gaussian latent variable feature of probit models coupled with identification constraints pose significant computational challenges for its estimation and inference, especially when the dimension of the discrete response variable is large. In this paper, we propose a computationally efficient Expectation-Maximization (EM) algorithm for estimating large probit models. Our work is distinct from existing methods in two important aspects. First, instead of simulation or sampling methods, we apply and customize expectation propagation (EP), a deterministic method originally proposed for approximate Bayesian inference, to estimate moments of the truncated multivariate normal (TMVN) in the E (expectation) step. Second, we take advantage of a symmetric identification condition to transform the constrained optimization problem in the M (maximization) step into a one-dimensional problem, which is solved efficiently using Newton's method instead of off-the-shelf solvers. Our method enables the analysis of correlated choice data in the presence of more than 100 alternatives, which is a reasonable size in modern applications, such as online shopping and booking platforms, but has been difficult in practice with probit models. We apply our probit estimation method to study ordering effects in hotel search results on Expedia.com. △ Less

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2406.07025 [pdf, other]

Entropy-Reinforced Planning with Large Language Models for Drug Discovery

Authors: Xuefeng Liu, Chih-chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens

Abstract: The objective of drug discovery is to identify chemical compounds that possess specific pharmaceutical properties toward a binding target. Existing large language models (LLMS) can achieve high token matching scores in terms of likelihood for molecule generation. However, relying solely on LLM decoding often results in the generation of molecules that are either invalid due to a single misused tok… ▽ More The objective of drug discovery is to identify chemical compounds that possess specific pharmaceutical properties toward a binding target. Existing large language models (LLMS) can achieve high token matching scores in terms of likelihood for molecule generation. However, relying solely on LLM decoding often results in the generation of molecules that are either invalid due to a single misused token, or suboptimal due to unbalanced exploration and exploitation as a consequence of the LLMs prior experience. Here we propose ERP, Entropy-Reinforced Planning for Transformer Decoding, which employs an entropy-reinforced planning algorithm to enhance the Transformer decoding process and strike a balance between exploitation and exploration. ERP aims to achieve improvements in multiple properties compared to direct sampling from the Transformer. We evaluated ERP on the SARS-CoV-2 virus (3CLPro) and human cancer cell target protein (RTCB) benchmarks and demonstrated that, in both benchmarks, ERP consistently outperforms the current state-of-the-art algorithm by 1-5 percent, and baselines by 5-10 percent, respectively. Moreover, such improvement is robust across Transformer models trained with different objectives. Finally, to further illustrate the capabilities of ERP, we tested our algorithm on three code generation benchmarks and outperformed the current state-of-the-art approach as well. Our code is publicly available at: https://github.com/xuefeng-cs/ERP. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: Published in ICML2024

arXiv:2406.06980 [pdf, other]

Sensitivity Analysis for the Test-Negative Design

Authors: Soumyabrata Kundu, Peng Ding, Xinran Li, Jingshu Wang

Abstract: The test-negative design has become popular for evaluating the effectiveness of post-licensure vaccines using observational data. In addition to its logistical convenience on data collection, the design is also believed to control for the differential health-care-seeking behavior between vaccinated and unvaccinated individuals, which is an important while often unmeasured confounder between the va… ▽ More The test-negative design has become popular for evaluating the effectiveness of post-licensure vaccines using observational data. In addition to its logistical convenience on data collection, the design is also believed to control for the differential health-care-seeking behavior between vaccinated and unvaccinated individuals, which is an important while often unmeasured confounder between the vaccination and infection. Hence, the design has been employed routinely to monitor seasonal flu vaccines and more recently to measure the COVID-19 vaccine effectiveness. Despite its popularity, the design has been questioned, in particular about its ability to fully control for the unmeasured confounding. In this paper, we explore deviations from a perfect test-negative design, and propose various sensitivity analysis methods for estimating the effect of vaccination measured by the causal odds ratio on the subpopulation of individuals with good health-care-seeking behavior. We start with point identification of the causal odds ratio under a test-negative design, considering two forms of assumptions on the unmeasured confounder. These assumptions then lead to two approaches for conducting sensitivity analysis, addressing the influence of the unmeasured confounding in different ways. Specifically, one approach investigates partial control for unmeasured confounder in the test-negative design, while the other examines the impact of unmeasured confounder on both vaccination and infection. Furthermore, these approaches can be combined to provide narrower bounds on the true causal odds ratio, and can be further extended to sharpen the bounds by restricting the treatment effect heterogeneity. Finally, we apply the proposed methods to evaluate the effectiveness of COVID-19 vaccines using observational data from test-negative designs. △ Less

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2404.14025 [pdf, other]

DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation

Authors: Yonghao Dang, Jianqin Yin, Liyuan Liu, Pengxiang Ding, Yuan Sun, Yanzhu Hu

Abstract: Multi-person pose estimation (MPPE) presents a formidable yet crucial challenge in computer vision. Most existing methods predominantly concentrate on isolated interaction either between instances or joints, which is inadequate for scenarios demanding concurrent localization of both instances and joints. This paper introduces a novel CNN-based single-stage method, named Dual-path Hierarchical Rela… ▽ More Multi-person pose estimation (MPPE) presents a formidable yet crucial challenge in computer vision. Most existing methods predominantly concentrate on isolated interaction either between instances or joints, which is inadequate for scenarios demanding concurrent localization of both instances and joints. This paper introduces a novel CNN-based single-stage method, named Dual-path Hierarchical Relation Network (DHRNet), to extract instance-to-joint and joint-to-instance interactions concurrently. Specifically, we design a dual-path interaction modeling module (DIM) that strategically organizes cross-instance and cross-joint interaction modeling modules in two complementary orders, enriching interaction information by integrating merits from different correlation modeling branches. Notably, DHRNet excels in joint localization by leveraging information from other instances and joints. Extensive evaluations on challenging datasets, including COCO, CrowdPose, and OCHuman datasets, showcase DHRNet's state-of-the-art performance. The code will be released at https://github.com/YHDang/dhrnet-multi-pose-estimation. △ Less

Submitted 26 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.03584 [pdf, other]

doi 10.1109/TCSVT.2022.3163782

Towards more realistic human motion prediction with attention to motion coordination

Authors: Pengxiang Ding, Jianqin Yin

Abstract: Joint relation modeling is a curial component in human motion prediction. Most existing methods rely on skeletal-based graphs to build the joint relations, where local interactive relations between joint pairs are well learned. However, the motion coordination, a global joint relation reflecting the simultaneous cooperation of all joints, is usually weakened because it is learned from part to whol… ▽ More Joint relation modeling is a curial component in human motion prediction. Most existing methods rely on skeletal-based graphs to build the joint relations, where local interactive relations between joint pairs are well learned. However, the motion coordination, a global joint relation reflecting the simultaneous cooperation of all joints, is usually weakened because it is learned from part to whole progressively and asynchronously. Thus, the final predicted motions usually appear unrealistic. To tackle this issue, we learn a medium, called coordination attractor (CA), from the spatiotemporal features of motion to characterize the global motion features, which is subsequently used to build new relative joint relations. Through the CA, all joints are related simultaneously, and thus the motion coordination of all joints can be better learned. Based on this, we further propose a novel joint relation modeling module, Comprehensive Joint Relation Extractor (CJRE), to combine this motion coordination with the local interactions between joint pairs in a unified manner. Additionally, we also present a Multi-timescale Dynamics Extractor (MTDE) to extract enriched dynamics from the raw position information for effective prediction. Extensive experiments show that the proposed framework outperforms state-of-the-art methods in both short- and long-term predictions on H3.6M, CMU-Mocap, and 3DPW. △ Less

Submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted by TCSVT

arXiv:2403.19913 [pdf, other]

MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models

Authors: Peng Ding, Jiading Fang, Peng Li, Kangrui Wang, Xiaochen Zhou, Mo Yu, Jing Li, Matthew R. Walter, Hongyuan Mei

Abstract: Large language models such as ChatGPT and GPT-4 have recently achieved astonishing performance on a variety of natural language processing tasks. In this paper, we propose MANGO, a benchmark to evaluate their capabilities to perform text-based mapping and navigation. Our benchmark includes 53 mazes taken from a suite of textgames: each maze is paired with a walkthrough that visits every location b… ▽ More Large language models such as ChatGPT and GPT-4 have recently achieved astonishing performance on a variety of natural language processing tasks. In this paper, we propose MANGO, a benchmark to evaluate their capabilities to perform text-based mapping and navigation. Our benchmark includes 53 mazes taken from a suite of textgames: each maze is paired with a walkthrough that visits every location but does not cover all possible paths. The task is question-answering: for each maze, a large language model reads the walkthrough and answers hundreds of mapping and navigation questions such as "How should you go to Attic from West of House?" and "Where are we if we go north and east from Cellar?". Although these questions are easy to humans, it turns out that even GPT-4, the best-to-date language model, performs poorly at answering them. Further, our experiments suggest that a strong mapping and navigation ability would benefit large language models in performing relevant downstream tasks, such as playing textgames. Our MANGO benchmark will facilitate future research on methods that improve the mapping and navigation capabilities of language models. We host our leaderboard, data, code, and evaluation program at https://mango.ttic.edu and https://github.com/oaklight/mango/. △ Less

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19656 [pdf, other]

Topologically protected flatness in chiral moiré heterostructures

Authors: Valentin Crépel, Peize Ding, Nishchhal Verma, Nicolas Regnault, Raquel Queiroz

Abstract: The observation of delicate correlated phases in twisted heterostructures of graphene and transition metal dichalcogenides suggests an inherent resilience of moiré flat bands against certain types of disorder. We investigate the robustness of moiré flat bands in the chiral limit of the Bistrizer-MacDonald model, applicable to both platforms in certain limits. We show a drastic difference between t… ▽ More The observation of delicate correlated phases in twisted heterostructures of graphene and transition metal dichalcogenides suggests an inherent resilience of moiré flat bands against certain types of disorder. We investigate the robustness of moiré flat bands in the chiral limit of the Bistrizer-MacDonald model, applicable to both platforms in certain limits. We show a drastic difference between the protection of the first magic angle and higher magic angles to chiral symmetric disorder such as random higher moiré potential harmonics arising, for instance, from lattice relaxation. We find that the first magic angle is topologically protected by a topological index theorem, similar to the protection of the zeroth Landau level of Dirac fermions, whose flatness withstands any chiral symmetric perturbation such as non-uniform magnetic fields. Focusing on the first magic angle of twisted bilayer graphene, our analysis reveals a hidden non-local constant of motion that permits the decomposition of the non-abelian gauge field induced by inter-layer tunnelings into two decoupled abelian ones, underscoring a topological mechanism for band flatness. Through numerical simulations, we further show the strikingly different robustness of flat bands across protected (first) and unprotected (higher) magic angles in the presence of various types of disorder and identify the scattering processes that are enhanced or suppressed in the chiral limit. Interestingly, we find that the suppression of disorder broadening persists away from the chiral limit and is further accentuated by isolating a single sublattice polarized flat band in energy. Our analysis suggests the Berry curvature hotspot at the top of the K and K' valence band in the transition metal dichalcogenide monolayers is essential for the stability of its moiré flat bands and their correlated states. △ Less

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.14520 [pdf, other]

Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference

Authors: Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang

Abstract: In recent years, the application of multimodal large language models (MLLM) in various fields has achieved remarkable success. However, as the foundation model for many downstream tasks, current MLLMs are composed of the well-known Transformer network, which has a less efficient quadratic computation complexity. To improve the efficiency of such basic models, we propose Cobra, a linear computation… ▽ More In recent years, the application of multimodal large language models (MLLM) in various fields has achieved remarkable success. However, as the foundation model for many downstream tasks, current MLLMs are composed of the well-known Transformer network, which has a less efficient quadratic computation complexity. To improve the efficiency of such basic models, we propose Cobra, a linear computational complexity MLLM. Specifically, Cobra integrates the efficient Mamba language model into the visual modality. Moreover, we explore and study various modal fusion schemes to create an effective multi-modal Mamba. Extensive experiments demonstrate that (1) Cobra achieves extremely competitive performance with current computationally efficient state-of-the-art methods, e.g., LLaVA-Phi, TinyLLaVA, and MobileVLM v2, and has faster speed due to Cobra's linear sequential modeling. (2) Interestingly, the results of closed-set challenging prediction benchmarks show that Cobra performs well in overcoming visual illusions and spatial relationship judgments. (3) Notably, Cobra even achieves comparable performance to LLaVA with about 43% of the number of parameters. We will make all codes of Cobra open-source and hope that the proposed method can facilitate future research on complexity problems in MLLM. Our project page is available at: https://sites.google.com/view/cobravlm. △ Less

Submitted 5 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: Update ablation results

arXiv:2403.13834 [pdf, other]

Few-shot Learning on Heterogeneous Graphs: Challenges, Progress, and Prospects

Authors: Pengfei Ding, Yan Wang, Guanfeng Liu

Abstract: Few-shot learning on heterogeneous graphs (FLHG) is attracting more attention from both academia and industry because prevailing studies on heterogeneous graphs often suffer from label sparsity. FLHG aims to tackle the performance degradation in the face of limited annotated data and there have been numerous recent studies proposing various methods and applications. In this paper, we provide a com… ▽ More Few-shot learning on heterogeneous graphs (FLHG) is attracting more attention from both academia and industry because prevailing studies on heterogeneous graphs often suffer from label sparsity. FLHG aims to tackle the performance degradation in the face of limited annotated data and there have been numerous recent studies proposing various methods and applications. In this paper, we provide a comprehensive review of existing FLHG methods, covering challenges, research progress, and future prospects. Specifically, we first formalize FLHG and categorize its methods into three types: single-heterogeneity FLHG, dual-heterogeneity FLHG, and multi-heterogeneity FLHG. Then, we analyze the research progress within each category, highlighting the most recent and representative developments. Finally, we identify and discuss promising directions for future research in FLHG. To the best of our knowledge, this paper is the first systematic and comprehensive review of FLHG. △ Less

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.13358 [pdf, other]

GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot

Authors: Wenxuan Song, Han Zhao, Pengxiang Ding, Can Cui, Shangke Lyu, Yaning Fan, Donglin Wang

Abstract: Multi-task robot learning holds significant importance in tackling diverse and complex scenarios. However, current approaches are hindered by performance issues and difficulties in collecting training datasets. In this paper, we propose GeRM (Generalist Robotic Model). We utilize offline reinforcement learning to optimize data utilization strategies to learn from both demonstrations and sub-optima… ▽ More Multi-task robot learning holds significant importance in tackling diverse and complex scenarios. However, current approaches are hindered by performance issues and difficulties in collecting training datasets. In this paper, we propose GeRM (Generalist Robotic Model). We utilize offline reinforcement learning to optimize data utilization strategies to learn from both demonstrations and sub-optimal data, thus surpassing the limitations of human demonstrations. Thereafter, we employ a transformer-based VLA network to process multi-modal inputs and output actions. By introducing the Mixture-of-Experts structure, GeRM allows faster inference speed with higher whole model capacity, and thus resolves the issue of limited RL parameters, enhancing model performance in multi-task learning while controlling computational costs. Through a series of experiments, we demonstrate that GeRM outperforms other methods across all tasks, while also validating its efficiency in both training and inference processes. Additionally, we uncover its potential to acquire emergent skills. Additionally, we contribute the QUARD-Auto dataset, collected automatically to support our training approach and foster advancements in multi-task quadruped robot learning. This work presents a new paradigm for reducing the cost of collecting robot data and driving progress in the multi-task learning community. You can reach our project and video through the link: https://songwxuan.github.io/GeRM/ . △ Less

Submitted 9 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.03212 [pdf, other]

Performance of a modular ton-scale pixel-readout liquid argon time projection chamber

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade , et al. (1340 additional authors not shown)

Abstract: The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmi… ▽ More The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations. △ Less

Submitted 5 March, 2024; originally announced March 2024.

Comments: 47 pages, 41 figures

Report number: FERMILAB-PUB-24-0073-LBNF

arXiv:2403.01477 [pdf, other]

Two-phase rejective sampling

Authors: Shu Yang, Peng Ding

Abstract: Rejective sampling improves design and estimation efficiency of single-phase sampling when auxiliary information in a finite population is available. When such auxiliary information is unavailable, we propose to use two-phase rejective sampling (TPRS), which involves measuring auxiliary variables for the sample of units in the first phase, followed by the implementation of rejective sampling for t… ▽ More Rejective sampling improves design and estimation efficiency of single-phase sampling when auxiliary information in a finite population is available. When such auxiliary information is unavailable, we propose to use two-phase rejective sampling (TPRS), which involves measuring auxiliary variables for the sample of units in the first phase, followed by the implementation of rejective sampling for the outcome in the second phase. We explore the asymptotic design properties of double expansion and regression estimators under TPRS. We show that TPRS enhances the efficiency of the double expansion estimator, rendering it comparable to a regression estimator. We further refine the design to accommodate varying importance of covariates and extend it to multi-phase sampling. △ Less

Submitted 3 March, 2024; originally announced March 2024.

arXiv:2402.10537 [pdf, other]

Quantifying Individual Risk for Binary Outcome: Bounds and Inference

Authors: Peng Wu, Peng Ding, Zhi Geng, Yue Liu

Abstract: Understanding treatment heterogeneity is crucial for reliable decision-making in treatment evaluation and selection. While the conditional average treatment effect (CATE) is commonly used to capture treatment heterogeneity induced by covariates and design individualized treatment policies, it remains an averaging metric within subpopulations. This limitation prevents it from unveiling individual-l… ▽ More Understanding treatment heterogeneity is crucial for reliable decision-making in treatment evaluation and selection. While the conditional average treatment effect (CATE) is commonly used to capture treatment heterogeneity induced by covariates and design individualized treatment policies, it remains an averaging metric within subpopulations. This limitation prevents it from unveiling individual-level risks, potentially leading to misleading results. This article addresses this gap by examining individual risk for binary outcomes, specifically focusing on the fraction negatively affected (FNA) conditional on covariates -- a metric assessing the percentage of individuals experiencing worse outcomes with treatment compared to control. Under the strong ignorability assumption, FNA is unidentifiable, and we find that previous bounds are wide and practically unattainable except in certain degenerate cases. By introducing a plausible positive correlation assumption for the potential outcomes, we obtain significantly improved bounds compared to previous studies. We show that even with a positive and statistically significant CATE, the lower bound on FNA can be positive, i.e., in the best-case scenario many units will be harmed if receiving treatment. We establish a nonparametric sensitivity analysis framework for FNA using the Pearson correlation coefficient as the sensitivity parameter, thereby exploring the relationships among the correlation coefficient, FNA, and CATE. We also present a practical and tractable method for selecting the range of correlation coefficients. Furthermore, we propose flexible estimators for refined FNA bounds and prove their consistency and asymptotic normality. △ Less

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.01568 [pdf, other]

Doping Liquid Argon with Xenon in ProtoDUNE Single-Phase: Effects on Scintillation Light

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar Es-sghir, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos , et al. (1300 additional authors not shown)

Abstract: Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUN… ▽ More Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUNE-SP) at CERN, featuring 770 t of total liquid argon mass with 410 t of fiducial mass. The goal of the run was to measure the light and charge response of the detector to the addition of xenon, up to a concentration of 18.8 ppm. The main purpose was to test the possibility for reduction of non-uniformities in light collection, caused by deployment of photon detectors only within the anode planes. Light collection was analysed as a function of the xenon concentration, by using the pre-existing photon detection system (PDS) of ProtoDUNE-SP and an additional smaller set-up installed specifically for this run. In this paper we first summarize our current understanding of the argon-xenon energy transfer process and the impact of the presence of nitrogen in argon with and without xenon dopant. We then describe the key elements of ProtoDUNE-SP and the injection method deployed. Two dedicated photon detectors were able to collect the light produced by xenon and the total light. The ratio of these components was measured to be about 0.65 as 18.8 ppm of xenon were injected. We performed studies of the collection efficiency as a function of the distance between tracks and light detectors, demonstrating enhanced uniformity of response for the anode-mounted PDS. We also show that xenon doping can substantially recover light losses due to contamination of the liquid argon by nitrogen. △ Less

Submitted 9 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 35 pages, 20 figures

Report number: CERN-EP-2024-024; FERMILAB-PUB-23-0819-LBNF

arXiv:2402.01271 [pdf, other]

An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec

Authors: Linping Xu, Jiawei Jiang, Dejun Zhang, Xianjun Xia, Li Chen, Yijian Xiao, Piao Ding, Shenyi Song, Sixing Yin, Ferdous Sohel

Abstract: Recently, neural networks have proven to be effective in performing speech coding task at low bitrates. However, under-utilization of intra-frame correlations and the error of quantizer specifically degrade the reconstructed audio quality. To improve the coding quality, we present an end-to-end neural speech codec, namely CBRC (Convolutional and Bidirectional Recurrent neural Codec). An interleave… ▽ More Recently, neural networks have proven to be effective in performing speech coding task at low bitrates. However, under-utilization of intra-frame correlations and the error of quantizer specifically degrade the reconstructed audio quality. To improve the coding quality, we present an end-to-end neural speech codec, namely CBRC (Convolutional and Bidirectional Recurrent neural Codec). An interleaved structure using 1D-CNN and Intra-BRNN is designed to exploit the intra-frame correlations more efficiently. Furthermore, Group-wise and Beam-search Residual Vector Quantizer (GB-RVQ) is used to reduce the quantization noise. CBRC encodes audio every 20ms with no additional latency, which is suitable for real-time communication. Experimental results demonstrate the superiority of the proposed codec when comparing CBRC at 3kbps with Opus at 12kbps. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: INTERSPEECH 2023

arXiv:2401.04525 [pdf, other]

Implantable Photonic Neural Probes with Out-of-Plane Focusing Grating Emitters

Authors: Tianyuan Xue, Andrei Stalmashonak, Fu-Der Chen, Peisheng Ding, Xianshu Luo, Hongyao Chua, Guo-Qiang Lo, Wesley D. Sacher, Joyce K. S. Poon

Abstract: We have designed, fabricated, and characterized implantable silicon neural probes with nanophotonic grating emitters that focus the emitted light at a specified distance above the surface of the probe for spatially precise optogenetic targeting of neurons. Using the holographic principle, we designed gratings for wavelengths of 488 and 594 nm, targeting the excitation spectra of the optogenetic ac… ▽ More We have designed, fabricated, and characterized implantable silicon neural probes with nanophotonic grating emitters that focus the emitted light at a specified distance above the surface of the probe for spatially precise optogenetic targeting of neurons. Using the holographic principle, we designed gratings for wavelengths of 488 and 594 nm, targeting the excitation spectra of the optogenetic actuators Channelrhodopsin-2 and Chrimson, respectively. The measured optical emission pattern of these emitters in non-scattering medium and tissue matched well with simulations. To our knowledge, this is the first report of focused spots with the size scale of a neuron soma in brain tissue formed from implantable neural probes. △ Less

Submitted 10 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

Comments: 11 pages, 6 figures

arXiv:2401.03597 [pdf, other]

Few-Shot Causal Representation Learning for Out-of-Distribution Generalization on Heterogeneous Graphs

Authors: Pengfei Ding, Yan Wang, Guanfeng Liu, Nan Wang, Xiaofang Zhou

Abstract: Heterogeneous graph few-shot learning (HGFL) has been developed to address the label sparsity issue in heterogeneous graphs (HGs), which consist of various types of nodes and edges. The core concept of HGFL is to extract knowledge from rich-labeled classes in a source HG, transfer this knowledge to a target HG to facilitate learning new classes with few-labeled training data, and finally make pred… ▽ More Heterogeneous graph few-shot learning (HGFL) has been developed to address the label sparsity issue in heterogeneous graphs (HGs), which consist of various types of nodes and edges. The core concept of HGFL is to extract knowledge from rich-labeled classes in a source HG, transfer this knowledge to a target HG to facilitate learning new classes with few-labeled training data, and finally make predictions on unlabeled testing data. Existing methods typically assume that the source HG, training data, and testing data all share the same distribution. However, in practice, distribution shifts among these three types of data are inevitable due to two reasons: (1) the limited availability of the source HG that matches the target HG distribution, and (2) the unpredictable data generation mechanism of the target HG. Such distribution shifts result in ineffective knowledge transfer and poor learning performance in existing methods, thereby leading to a novel problem of out-of-distribution (OOD) generalization in HGFL. To address this challenging problem, we propose a novel Causal OOD Heterogeneous graph Few-shot learning model, namely COHF. In COHF, we first characterize distribution shifts in HGs with a structural causal model, establishing an invariance principle for OOD generalization in HGFL. Then, following this invariance principle, we propose a new variational autoencoder-based heterogeneous graph neural network to mitigate the impact of distribution shifts. Finally, by integrating this network with a novel meta-learning framework, COHF effectively transfers knowledge to the target HG to predict new classes with few-labeled data. Extensive experiments on seven real-world datasets have demonstrated the superior performance of COHF over the state-of-the-art methods. △ Less

Submitted 16 April, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.00649 [pdf, other]

Linear Model and Extensions

Authors: Peng Ding

Abstract: I developed the lecture notes based on my ``Linear Model'' course at the University of California Berkeley over the past seven years. This book provides an intermediate-level introduction to the linear model. It balances rigorous proofs and heuristic arguments. This book provides R code to replicate all simulation studies and case studies. I developed the lecture notes based on my ``Linear Model'' course at the University of California Berkeley over the past seven years. This book provides an intermediate-level introduction to the linear model. It balances rigorous proofs and heuristic arguments. This book provides R code to replicate all simulation studies and case studies. △ Less

Submitted 31 December, 2023; originally announced January 2024.

arXiv:2312.14457 [pdf, other]

QUAR-VLA: Vision-Language-Action Model for Quadruped Robots

Authors: Pengxiang Ding, Han Zhao, Wenxuan Song, Wenjie Zhang, Min Zhang, Siteng Huang, Ningxi Yang, Donglin Wang

Abstract: The important manifestation of robot intelligence is the ability to naturally interact and autonomously make decisions. Traditional approaches to robot control often compartmentalize perception, planning, and decision-making, simplifying system design but limiting the synergy between different information streams. This compartmentalization poses challenges in achieving seamless autonomous reasonin… ▽ More The important manifestation of robot intelligence is the ability to naturally interact and autonomously make decisions. Traditional approaches to robot control often compartmentalize perception, planning, and decision-making, simplifying system design but limiting the synergy between different information streams. This compartmentalization poses challenges in achieving seamless autonomous reasoning, decision-making, and action execution. To address these limitations, a novel paradigm, named Vision-Language-Action tasks for QUAdruped Robots (QUAR-VLA), has been introduced in this paper. This approach tightly integrates visual information and instructions to generate executable actions, effectively merging perception, planning, and decision-making. The central idea is to elevate the overall intelligence of the robot. Within this framework, a notable challenge lies in aligning fine-grained instructions with visual perception information. This emphasizes the complexity involved in ensuring that the robot accurately interprets and acts upon detailed instructions in harmony with its visual observations. Consequently, we propose QUAdruped Robotic Transformer (QUART), a family of VLA models to integrate visual information and instructions from diverse modalities as input and generates executable actions for real-world robots and present QUAdruped Robot Dataset (QUARD), a large-scale multi-task dataset including navigation, complex terrain locomotion, and whole-body manipulation tasks for training QUART models. Our extensive evaluation (4000 evaluation trials) shows that our approach leads to performant robotic policies and enables QUART to obtain a range of emergent capabilities. △ Less

Submitted 6 July, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.11972 [pdf, other]

Expressive Forecasting of 3D Whole-body Human Motions

Authors: Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang

Abstract: Human motion forecasting, with the goal of estimating future human behavior over a period of time, is a fundamental task in many real-world applications. However, existing works typically concentrate on predicting the major joints of the human body without considering the delicate movements of the human hands. In practical applications, hand gesture plays an important role in human communication w… ▽ More Human motion forecasting, with the goal of estimating future human behavior over a period of time, is a fundamental task in many real-world applications. However, existing works typically concentrate on predicting the major joints of the human body without considering the delicate movements of the human hands. In practical applications, hand gesture plays an important role in human communication with the real world, and expresses the primary intention of human beings. In this work, we are the first to formulate a whole-body human pose forecasting task, which jointly predicts the future body and hand activities. Correspondingly, we propose a novel Encoding-Alignment-Interaction (EAI) framework that aims to predict both coarse (body joints) and fine-grained (gestures) activities collaboratively, enabling expressive and cross-facilitated forecasting of 3D whole-body human motions. Specifically, our model involves two key constituents: cross-context alignment (XCA) and cross-context interaction (XCI). Considering the heterogeneous information within the whole-body, XCA aims to align the latent features of various human components, while XCI focuses on effectively capturing the context interaction among the human components. We conduct extensive experiments on a newly-introduced large-scale benchmark and achieve state-of-the-art performance. The code is public for research purposes at https://github.com/Dingpx/EAI. △ Less

Submitted 4 April, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI24

arXiv:2312.03130 [pdf, other]

The DUNE Far Detector Vertical Drift Technology, Technical Design Report

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, C. Andreopoulos , et al. (1304 additional authors not shown)

Abstract: DUNE is an international experiment dedicated to addressing some of the questions at the forefront of particle physics and astrophysics, including the mystifying preponderance of matter over antimatter in the early universe. The dual-site experiment will employ an intense neutrino beam focused on a near and a far detector as it aims to determine the neutrino mass hierarchy and to make high-precisi… ▽ More DUNE is an international experiment dedicated to addressing some of the questions at the forefront of particle physics and astrophysics, including the mystifying preponderance of matter over antimatter in the early universe. The dual-site experiment will employ an intense neutrino beam focused on a near and a far detector as it aims to determine the neutrino mass hierarchy and to make high-precision measurements of the PMNS matrix parameters, including the CP-violating phase. It will also stand ready to observe supernova neutrino bursts, and seeks to observe nucleon decay as a signature of a grand unified theory underlying the standard model. The DUNE far detector implements liquid argon time-projection chamber (LArTPC) technology, and combines the many tens-of-kiloton fiducial mass necessary for rare event searches with the sub-centimeter spatial resolution required to image those events with high precision. The addition of a photon detection system enhances physics capabilities for all DUNE physics drivers and opens prospects for further physics explorations. Given its size, the far detector will be implemented as a set of modules, with LArTPC designs that differ from one another as newer technologies arise. In the vertical drift LArTPC design, a horizontal cathode bisects the detector, creating two stacked drift volumes in which ionization charges drift towards anodes at either the top or bottom. The anodes are composed of perforated PCB layers with conductive strips, enabling reconstruction in 3D. Light-trap-style photon detection modules are placed both on the cryostat's side walls and on the central cathode where they are optically powered. This Technical Design Report describes in detail the technical implementations of each subsystem of this LArTPC that, together with the other far detector modules and the near detector, will enable DUNE to achieve its physics goals. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Comments: 425 pages; 281 figures Central editing team: A. Heavey, S. Kettell, A. Marchionni, S. Palestini, S. Rajogopalan, R. J. Wilson

Report number: Fermilab Report no: TM-2813-LBNF

arXiv:2311.11242 [pdf, other]

Coherent postionization dynamics of molecules based on adiabatic strong-field approximation

Authors: Shan Xue, Wenli Yang, Ping Li, Yuxuan Zhang, Pengji Ding, Song-Feng Zhao, Hongchuan Du, Anh-Thu Le

Abstract: Open-system density matrix methods typically employ incoherent population injection to investigate the postionization dynamics in strong laser fields. The presence of coherence injection has long been a subject of debate. In this context, we introduce a coherence injection model based on the adiabatic strong-field approximation (ASFA). This model effectively predicts ionic coherence resulting from… ▽ More Open-system density matrix methods typically employ incoherent population injection to investigate the postionization dynamics in strong laser fields. The presence of coherence injection has long been a subject of debate. In this context, we introduce a coherence injection model based on the adiabatic strong-field approximation (ASFA). This model effectively predicts ionic coherence resulting from directional tunnel ionization. With increasing field strength, the degree of coherence predicted by the ASFA model gradually deviates from that of the SFA model but remains much milder compared to the results of the simple and partial-wave expansion models. The impact of coherence injection on the postionization molecular dynamics is explored in O$_2$ and N$_2$. We find that the ionization-induced vibrational coherence strongly enhances the population inversion of $X^2 Σ_g^+ -B^2 Σ_u^+ $ in N$_2^+$ and the dissociation probability of O$_2^+$. Conversely, the ionization-induced vibronic coherences have inhibitory effects on the related transitions. These findings reveal the significance of including the vibronic-state-resolved coherence injection in simulating molecular dynamics following strong-field ionization. △ Less

Submitted 19 November, 2023; originally announced November 2023.

Comments: 12 pages, 7 figures

arXiv:2311.10877 [pdf, other]

Covariate adjustment in randomized experiments with missing outcomes and covariates

Authors: Anqi Zhao, Peng Ding, Fan Li

Abstract: Covariate adjustment can improve precision in analyzing randomized experiments. With fully observed data, regression adjustment and propensity score weighting are asymptotically equivalent in improving efficiency over unadjusted analysis. When some outcomes are missing, we consider combining these two adjustment methods with inverse probability of observation weighting for handling missing outcome… ▽ More Covariate adjustment can improve precision in analyzing randomized experiments. With fully observed data, regression adjustment and propensity score weighting are asymptotically equivalent in improving efficiency over unadjusted analysis. When some outcomes are missing, we consider combining these two adjustment methods with inverse probability of observation weighting for handling missing outcomes, and show that the equivalence between the two methods breaks down. Regression adjustment no longer ensures efficiency gain over unadjusted analysis unless the true outcome model is linear in covariates or the outcomes are missing completely at random. Propensity score weighting, in contrast, still guarantees efficiency over unadjusted analysis, and including more covariates in adjustment never harms asymptotic efficiency. Moreover, we establish the value of using partially observed covariates to secure additional efficiency by the missingness indicator method, which imputes all missing covariates by zero and uses the union of the completed covariates and corresponding missingness indicators as the new, fully observed covariates. Based on these findings, we recommend using regression adjustment in combination with the missingness indicator method if the linear outcome model or missing complete at random assumption is plausible and using propensity score weighting with the missingness indicator method otherwise. △ Less

Submitted 4 March, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.10076 [pdf, other]

A decorrelation method for general regression adjustment in randomized experiments

Authors: Fangzhou Su, Wenlong Mou, Peng Ding, Martin J. Wainwright

Abstract: We study regression adjustment with general function class approximations for estimating the average treatment effect in the design-based setting. Standard regression adjustment involves bias due to sample re-use, and this bias leads to behavior that is sub-optimal in the sample size, and/or imposes restrictive assumptions. Our main contribution is to introduce a novel decorrelation-based approach… ▽ More We study regression adjustment with general function class approximations for estimating the average treatment effect in the design-based setting. Standard regression adjustment involves bias due to sample re-use, and this bias leads to behavior that is sub-optimal in the sample size, and/or imposes restrictive assumptions. Our main contribution is to introduce a novel decorrelation-based approach that circumvents these issues. We prove guarantees, both asymptotic and non-asymptotic, relative to the oracle functions that are targeted by a given regression adjustment procedure. We illustrate our method by applying it to various high-dimensional and non-parametric problems, exhibiting improved sample complexity and weakened assumptions relative to known approaches. △ Less

Submitted 16 November, 2023; originally announced November 2023.

Comments: Fangzhou Su and Wenlong Mou contributed equally to this work

arXiv:2311.08268 [pdf, other]

A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily

Authors: Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang

Abstract: Large Language Models (LLMs), such as ChatGPT and GPT-4, are designed to provide useful and safe responses. However, adversarial prompts known as 'jailbreaks' can circumvent safeguards, leading LLMs to generate potentially harmful content. Exploring jailbreak prompts can help to better reveal the weaknesses of LLMs and further steer us to secure them. Unfortunately, existing jailbreak methods eith… ▽ More Large Language Models (LLMs), such as ChatGPT and GPT-4, are designed to provide useful and safe responses. However, adversarial prompts known as 'jailbreaks' can circumvent safeguards, leading LLMs to generate potentially harmful content. Exploring jailbreak prompts can help to better reveal the weaknesses of LLMs and further steer us to secure them. Unfortunately, existing jailbreak methods either suffer from intricate manual design or require optimization on other white-box models, which compromises either generalization or efficiency. In this paper, we generalize jailbreak prompt attacks into two aspects: (1) Prompt Rewriting and (2) Scenario Nesting. Based on this, we propose ReNeLLM, an automatic framework that leverages LLMs themselves to generate effective jailbreak prompts. Extensive experiments demonstrate that ReNeLLM significantly improves the attack success rate while greatly reducing the time cost compared to existing baselines. Our study also reveals the inadequacy of current defense methods in safeguarding LLMs. Finally, we analyze the failure of LLMs defense from the perspective of prompt execution priority, and propose corresponding defense strategies. We hope that our research can catalyze both the academic community and LLMs developers towards the provision of safer and more regulated LLMs. The code is available at https://github.com/NJUNLP/ReNeLLM. △ Less

Submitted 6 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: Acccepted by NAACL 2024, 18 pages, 7 figures, 13 tables

arXiv:2311.07835 [pdf, other]

doi 10.1103/PhysRevD.110.012005

Expanding neutrino oscillation parameter measurements in NOvA using a Bayesian approach

Authors: NOvA Collaboration, M. A. Acero, B. Acharya, P. Adamson, N. Anfimov, A. Antoshkin, E. Arrieta-Diaz, L. Asquith, A. Aurisano, A. Back, N. Balashov, P. Baldi, B. A. Bambah, A. Bat, K. Bays, R. Bernstein, T. J. C. Bezerra, V. Bhatnagar, D. Bhattarai, B. Bhuyan, J. Bian, A. C. Booth, R. Bowles, B. Brahma, C. Bromberg , et al. (174 additional authors not shown)

Abstract: NOvA is a long-baseline neutrino oscillation experiment that measures oscillations in charged-current $ν_μ \rightarrow ν_μ$ (disappearance) and $ν_μ \rightarrow ν_{e}$ (appearance) channels, and their antineutrino counterparts, using neutrinos of energies around 2 GeV over a distance of 810 km. In this work we reanalyze the dataset first examined in our previous paper [Phys. Rev. D 106, 032004 (20… ▽ More NOvA is a long-baseline neutrino oscillation experiment that measures oscillations in charged-current $ν_μ \rightarrow ν_μ$ (disappearance) and $ν_μ \rightarrow ν_{e}$ (appearance) channels, and their antineutrino counterparts, using neutrinos of energies around 2 GeV over a distance of 810 km. In this work we reanalyze the dataset first examined in our previous paper [Phys. Rev. D 106, 032004 (2022)] using an alternative statistical approach based on Bayesian Markov Chain Monte Carlo. We measure oscillation parameters consistent with the previous results. We also extend our inferences to include the first NOvA measurements of the reactor mixing angle $θ_{13}$ and the Jarlskog invariant. We use these results to quantify the strength of our inferences about CP violation, as well as to examine the effects of constraints from short-baseline measurements of $θ_{13}$ using antineutrinos from nuclear reactors when making NOvA measurements of $θ_{23}$. Our long-baseline measurement of $θ_{13}$ is also shown to be consistent with the reactor measurements, supporting the general applicability and robustness of the PMNS framework for neutrino oscillations. △ Less

Submitted 27 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 20 pages, 17 figures; version accepted by Phys. Rev. D. Data associated with this paper is available at https://doi.org/10.15484/2349444

Report number: FERMILAB-PUB-23-667-AD-CSAID-ND

Journal ref: Phys.Rev.D 110 (2024) 1, 012005

arXiv:2310.04660 [pdf, other]

Balancing Weights for Causal Inference in Observational Factorial Studies

Authors: Ruoqi Yu, Peng Ding

Abstract: Many scientific questions in biomedical, environmental, and psychological research involve understanding the impact of multiple factors on outcomes. While randomized factorial experiments are ideal for this purpose, randomization is infeasible in many empirical studies. Therefore, investigators often rely on observational data, where drawing reliable causal inferences for multiple factors remains… ▽ More Many scientific questions in biomedical, environmental, and psychological research involve understanding the impact of multiple factors on outcomes. While randomized factorial experiments are ideal for this purpose, randomization is infeasible in many empirical studies. Therefore, investigators often rely on observational data, where drawing reliable causal inferences for multiple factors remains challenging. As the number of treatment combinations grows exponentially with the number of factors, some treatment combinations can be rare or even missing by chance in observed data, further complicating factorial effects estimation. To address these challenges, we propose a novel weighting method tailored to observational studies with multiple factors. Our approach uses weighted observational data to emulate a randomized factorial experiment, enabling simultaneous estimation of the effects of multiple factors and their interactions. Our investigations reveal a crucial nuance: achieving balance among covariates, as in single-factor scenarios, is necessary but insufficient for unbiasedly estimating factorial effects. Our findings suggest that balancing the factors is also essential in multi-factor settings. Moreover, we extend our weighting method to handle missing treatment combinations in observed data. Finally, we study the asymptotic behavior of the new weighting estimators and propose a consistent variance estimator, providing reliable inferences on factorial effects in observational studies. △ Less

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.15769 [pdf, other]

Algebraic and Statistical Properties of the Ordinary Least Squares Interpolator

Authors: Dennis Shen, Dogyoon Song, Peng Ding, Jasjeet S. Sekhon

Abstract: Deep learning research has uncovered the phenomenon of benign overfitting for overparameterized statistical models, which has drawn significant theoretical interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While properties of OLS are well established in classical, u… ▽ More Deep learning research has uncovered the phenomenon of benign overfitting for overparameterized statistical models, which has drawn significant theoretical interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While properties of OLS are well established in classical, underparameterized settings, its behavior in high-dimensional, overparameterized regimes is less explored (unlike for ridge or lasso regression) though significant progress has been made of late. We contribute to this growing literature by providing fundamental algebraic and statistical results for the minimum $\ell_2$-norm OLS interpolator. In particular, we provide algebraic equivalents of (i) the leave-$k$-out residual formula, (ii) Cochran's formula, and (iii) the Frisch-Waugh-Lovell theorem in the overparameterized regime. These results aid in understanding the OLS interpolator's ability to generalize and have substantive implications for causal inference. Under the Gauss-Markov model, we present statistical results such as an extension of the Gauss-Markov theorem and an analysis of variance estimation under homoskedastic errors for the overparameterized regime. To substantiate our theoretical contributions, we conduct simulations that further explore the stochastic properties of the OLS interpolator. △ Less

Submitted 30 May, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.12425 [pdf, other]

Principal Stratification with Continuous Post-Treatment Variables: Nonparametric Identification and Semiparametric Estimation

Authors: Sizhu Lu, Zhichao Jiang, Peng Ding

Abstract: Post-treatment variables often complicate causal inference. They appear in many scientific problems, including noncompliance, truncation by death, mediation, and surrogate endpoint evaluation. Principal stratification is a strategy to address these challenges by adjusting for the potential values of the post-treatment variables, defined as the principal strata. It allows for characterizing treatme… ▽ More Post-treatment variables often complicate causal inference. They appear in many scientific problems, including noncompliance, truncation by death, mediation, and surrogate endpoint evaluation. Principal stratification is a strategy to address these challenges by adjusting for the potential values of the post-treatment variables, defined as the principal strata. It allows for characterizing treatment effect heterogeneity across principal strata and unveiling the mechanism of the treatment's impact on the outcome related to post-treatment variables. However, the existing literature has primarily focused on binary post-treatment variables, leaving the case with continuous post-treatment variables largely unexplored. This gap persists due to the complexity of infinitely many principal strata, which present challenges to both the identification and estimation of causal effects. We fill this gap by providing nonparametric identification and semiparametric estimation theory for principal stratification with continuous post-treatment variables. We propose to use working models to approximate the underlying causal effect surfaces and derive the efficient influence functions of the corresponding model parameters. Based on the theory, we construct doubly robust estimators and implement them in an R package. △ Less

Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.07476 [pdf, other]

Causal inference in network experiments: regression-based analysis and design-based properties

Authors: Mengsi Gao, Peng Ding

Abstract: Investigating interference or spillover effects among units is a central task in many social science problems. Network experiments are powerful tools for this task, which avoids endogeneity by randomly assigning treatments to units over networks. However, it is non-trivial to analyze network experiments properly without imposing strong modeling assumptions. Previously, many researchers have propos… ▽ More Investigating interference or spillover effects among units is a central task in many social science problems. Network experiments are powerful tools for this task, which avoids endogeneity by randomly assigning treatments to units over networks. However, it is non-trivial to analyze network experiments properly without imposing strong modeling assumptions. Previously, many researchers have proposed sophisticated point estimators and standard errors for causal effects under network experiments. We further show that regression-based point estimators and standard errors can have strong theoretical guarantees if the regression functions and robust standard errors are carefully specified to accommodate the interference patterns under network experiments. We first recall a well-known result that the Hajek estimator is numerically identical to the coefficient from the weighted-least-squares fit based on the inverse probability of the exposure mapping. Moreover, we demonstrate that the regression-based approach offers three notable advantages: its ease of implementation, the ability to derive standard errors through the same weighted-least-squares fit, and the capacity to integrate covariates into the analysis, thereby enhancing estimation efficiency. Furthermore, we analyze the asymptotic bias of the regression-based network-robust standard errors. Recognizing that the covariance estimator can be anti-conservative, we propose an adjusted covariance estimator to improve the empirical coverage rates. Although we focus on regression-based point estimators and standard errors, our theory holds under the design-based framework, which assumes that the randomness comes solely from the design of network experiments and allows for arbitrary misspecification of the regression models. △ Less

Submitted 20 November, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.07273 [pdf]

Real Effect or Bias? Best Practices for Evaluating the Robustness of Real-World Evidence through Quantitative Sensitivity Analysis for Unmeasured Confounding

Authors: Douglas Faries, Chenyin Gao, Xiang Zhang, Chad Hazlett, James Stamey, Shu Yang, Peng Ding, Mingyang Shan, Kristin Sheffield, Nancy Dreyer

Abstract: The assumption of no unmeasured confounders is a critical but unverifiable assumption required for causal inference yet quantitative sensitivity analyses to assess robustness of real-world evidence remains underutilized. The lack of use is likely in part due to complexity of implementation and often specific and restrictive data requirements required for application of each method. With the advent… ▽ More The assumption of no unmeasured confounders is a critical but unverifiable assumption required for causal inference yet quantitative sensitivity analyses to assess robustness of real-world evidence remains underutilized. The lack of use is likely in part due to complexity of implementation and often specific and restrictive data requirements required for application of each method. With the advent of sensitivity analyses methods that are broadly applicable in that they do not require identification of a specific unmeasured confounder, along with publicly available code for implementation, roadblocks toward broader use are decreasing. To spur greater application, here we present a best practice guidance to address the potential for unmeasured confounding at both the design and analysis stages, including a set of framing questions and an analytic toolbox for researchers. The questions at the design stage guide the research through steps evaluating the potential robustness of the design while encouraging gathering of additional data to reduce uncertainty due to potential confounding. At the analysis stage, the questions guide researchers to quantifying the robustness of the observed result and providing researchers with a clearer indication of the robustness of their conclusions. We demonstrate the application of the guidance using simulated data based on a real-world fibromyalgia study, applying multiple methods from our analytic toolbox for illustration purposes. △ Less

Submitted 13 September, 2023; originally announced September 2023.

Comments: 16 pages which includes 5 figures

MSC Class: Primary 62

arXiv:2309.02835 [pdf]

A flexible and accurate total variation and cascaded denoisers-based image reconstruction algorithm for hyperspectrally compressed ultrafast photography

Authors: Zihan Guo, Jiali Yao, Dalong Qi, Pengpeng Ding, Chengzhi Jin, Ning Xu, Zhiling Zhang, Yunhua Yao, Lianzhong Deng, Zhiyong Wang, Zhenrong Sun, Shian Zhang

Abstract: Hyperspectrally compressed ultrafast photography (HCUP) based on compressed sensing and the time- and spectrum-to-space mappings can simultaneously realize the temporal and spectral imaging of non-repeatable or difficult-to-repeat transient events passively in a single exposure. It possesses an incredibly high frame rate of tens of trillions of frames per second and a sequence depth of several hun… ▽ More Hyperspectrally compressed ultrafast photography (HCUP) based on compressed sensing and the time- and spectrum-to-space mappings can simultaneously realize the temporal and spectral imaging of non-repeatable or difficult-to-repeat transient events passively in a single exposure. It possesses an incredibly high frame rate of tens of trillions of frames per second and a sequence depth of several hundred, and plays a revolutionary role in single-shot ultrafast optical imaging. However, due to the ultra-high data compression ratio induced by the extremely large sequence depth as well as the limited fidelities of traditional reconstruction algorithms over the reconstruction process, HCUP suffers from a poor image reconstruction quality and fails to capture fine structures in complex transient scenes. To overcome these restrictions, we propose a flexible image reconstruction algorithm based on the total variation (TV) and cascaded denoisers (CD) for HCUP, named the TV-CD algorithm. It applies the TV denoising model cascaded with several advanced deep learning-based denoising models in the iterative plug-and-play alternating direction method of multipliers framework, which can preserve the image smoothness while utilizing the deep denoising networks to obtain more priori, and thus solving the common sparsity representation problem in local similarity and motion compensation. Both simulation and experimental results show that the proposed TV-CD algorithm can effectively improve the image reconstruction accuracy and quality of HCUP, and further promote the practical applications of HCUP in capturing high-dimensional complex physical, chemical and biological ultrafast optical scenes. △ Less

Submitted 6 September, 2023; originally announced September 2023.

Comments: 25 pages, 5 figures and 1 table

arXiv:2308.05275 [pdf, other]

Cross-heterogeneity Graph Few-shot Learning

Authors: Pengfei Ding, Yan Wang, Guanfeng Liu

Abstract: In recent years, heterogeneous graph few-shot learning has been proposed to address the label sparsity issue in heterogeneous graphs (HGs), which contain various types of nodes and edges. The existing methods have achieved good performance by transferring generalized knowledge extracted from rich-labeled classes in source HG(s) to few-labeled classes in a target HG. However, these methods only con… ▽ More In recent years, heterogeneous graph few-shot learning has been proposed to address the label sparsity issue in heterogeneous graphs (HGs), which contain various types of nodes and edges. The existing methods have achieved good performance by transferring generalized knowledge extracted from rich-labeled classes in source HG(s) to few-labeled classes in a target HG. However, these methods only consider the single-heterogeneity scenario where the source and target HGs share a fixed set of node/edge types, ignoring the more general scenario of cross-heterogeneity, where each HG can have a different and non-fixed set of node/edge types. To this end, we focus on the unexplored cross-heterogeneity scenario and propose a novel model for Cross-heterogeneity Graph Few-shot Learning, namely CGFL. In CGFL, we first extract meta-patterns to capture heterogeneous information and propose a multi-view heterogeneous graph neural network (MHGN) to learn meta-patterns across HGs. Then, we propose a score module to measure the informativeness of labeled samples and determine the transferability of each source HG. Finally, by integrating MHGN and the score module into a meta-learning mechanism, CGFL can effectively transfer generalized knowledge to predict new classes with few-labeled data. Extensive experiments on four real-world datasets have demonstrated the superior performance of CGFL over the state-of-the-art methods. △ Less

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2308.03271 [pdf, other]

Local Structure-aware Graph Contrastive Representation Learning

Authors: Kai Yang, Yuan Liu, Zijuan Zhao, Peijin Ding, Wenqian Zhao

Abstract: Traditional Graph Neural Network (GNN), as a graph representation learning method, is constrained by label information. However, Graph Contrastive Learning (GCL) methods, which tackle the label problem effectively, mainly focus on the feature information of the global graph or small subgraph structure (e.g., the first-order neighborhood). In the paper, we propose a Local Structure-aware Graph Cont… ▽ More Traditional Graph Neural Network (GNN), as a graph representation learning method, is constrained by label information. However, Graph Contrastive Learning (GCL) methods, which tackle the label problem effectively, mainly focus on the feature information of the global graph or small subgraph structure (e.g., the first-order neighborhood). In the paper, we propose a Local Structure-aware Graph Contrastive representation Learning method (LS-GCL) to model the structural information of nodes from multiple views. Specifically, we construct the semantic subgraphs that are not limited to the first-order neighbors. For the local view, the semantic subgraph of each target node is input into a shared GNN encoder to obtain the target node embeddings at the subgraph-level. Then, we use a pooling function to generate the subgraph-level graph embeddings. For the global view, considering the original graph preserves indispensable semantic information of nodes, we leverage the shared GNN encoder to learn the target node embeddings at the global graph-level. The proposed LS-GCL model is optimized to maximize the common information among similar instances at three various perspectives through a multi-level contrastive loss function. Experimental results on five datasets illustrate that our method outperforms state-of-the-art graph representation learning approaches for both node classification and link prediction tasks. △ Less

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2307.08203 [pdf, other]

A randomization-based theory for preliminary testing of covariate balance in controlled trials

Authors: Anqi Zhao, Peng Ding

Abstract: Randomized trials balance all covariates on average and provide the gold standard for estimating treatment effects. Chance imbalances nevertheless exist more or less in realized treatment allocations and intrigue an important question: what should we do in case the treatment groups differ with respect to some important baseline characteristics? A common strategy is to conduct a {\it preliminary te… ▽ More Randomized trials balance all covariates on average and provide the gold standard for estimating treatment effects. Chance imbalances nevertheless exist more or less in realized treatment allocations and intrigue an important question: what should we do in case the treatment groups differ with respect to some important baseline characteristics? A common strategy is to conduct a {\it preliminary test} of the balance of baseline covariates after randomization, and invoke covariate adjustment for subsequent inference if and only if the realized allocation fails some prespecified criterion. Although such practice is intuitive and popular among practitioners, the existing literature has so far only evaluated its properties under strong parametric model assumptions in theory and simulation, yielding results of limited generality. To fill this gap, we examine two strategies for conducting preliminary test-based covariate adjustment by regression, and evaluate the validity and efficiency of the resulting inferences from the randomization-based perspective. As it turns out, the preliminary-test estimator based on the analysis of covariance can be even less efficient than the unadjusted difference in means, and risks anticonservative confidence intervals based on normal approximation even with the robust standard error. The preliminary-test estimator based on the fully interacted specification is on the other hand less efficient than its counterpart under the {\it always-adjust} strategy, and yields overconservative confidence intervals based on normal approximation. Based on theory and simulation, we echo the existing literature and do not recommend the preliminary-test procedure for covariate adjustment in randomized trials. △ Less

Submitted 16 July, 2023; originally announced July 2023.

arXiv:2307.07442 [pdf]

Sensitivity Analysis for Unmeasured Confounding in Medical Product Development and Evaluation Using Real World Evidence

Authors: Peng Ding, Yixin Fang, Doug Faries, Susan Gruber, Hana Lee, Joo-Yeon Lee, Pallavi Mishra-Kalyani, Mingyang Shan, Mark van der Laan, Shu Yang, Xiang Zhang

Abstract: The American Statistical Association Biopharmaceutical Section (ASA BIOP) working group on real-world evidence (RWE) has been making continuous, extended effort towards a goal of supporting and advancing regulatory science with respect to non-interventional, clinical studies intended to use real-world data for evidence generation for the purpose of medical product development and evaluation (i.e.,… ▽ More The American Statistical Association Biopharmaceutical Section (ASA BIOP) working group on real-world evidence (RWE) has been making continuous, extended effort towards a goal of supporting and advancing regulatory science with respect to non-interventional, clinical studies intended to use real-world data for evidence generation for the purpose of medical product development and evaluation (i.e., RWE studies). In 2023, the working group published a manuscript delineating challenges and opportunities in constructing estimands for RWE studies following a framework in ICH E9(R1) guidance on estimand and sensitivity analysis. As a follow-up task, we describe the other issue in RWE studies, sensitivity analysis. Focusing on the issue of unmeasured confounding, we review availability and applicability of sensitivity analysis methods for different types unmeasured confounding. We discuss consideration on the choice and use of sensitivity analysis for RWE studies. Updated version of this article will present how findings from sensitivity analysis could support regulatory decision-making using a real example. △ Less

Submitted 14 July, 2023; originally announced July 2023.

Comments: 17 pages, 2 figures

arXiv:2307.06566 [pdf, other]

Regression-Oriented Knowledge Distillation for Lightweight Ship Orientation Angle Prediction with Optical Remote Sensing Images

Authors: Zhan Shi, Xin Ding, Peng Ding, Chun Yang, Ru Huang, Xiaoxuan Song

Abstract: Ship orientation angle prediction (SOAP) with optical remote sensing images is an important image processing task, which often relies on deep convolutional neural networks (CNNs) to make accurate predictions. This paper proposes a novel framework to reduce the model sizes and computational costs of SOAP models without harming prediction accuracy. First, a new SOAP model called Mobile-SOAP is desig… ▽ More Ship orientation angle prediction (SOAP) with optical remote sensing images is an important image processing task, which often relies on deep convolutional neural networks (CNNs) to make accurate predictions. This paper proposes a novel framework to reduce the model sizes and computational costs of SOAP models without harming prediction accuracy. First, a new SOAP model called Mobile-SOAP is designed based on MobileNetV2, achieving state-of-the-art prediction accuracy. Four tiny SOAP models are also created by replacing the convolutional blocks in Mobile-SOAP with four small-scale networks, respectively. Then, to transfer knowledge from Mobile-SOAP to four lightweight models, we propose a novel knowledge distillation (KD) framework termed SOAP-KD consisting of a novel feature-based guidance loss and an optimized synthetic samples-based knowledge transfer mechanism. Lastly, extensive experiments on the FGSC-23 dataset confirm the superiority of Mobile-SOAP over existing models and also demonstrate the effectiveness of SOAP-KD in improving the prediction performance of four specially designed tiny models. Notably, by using SOAP-KD, the test mean absolute error of the ShuffleNetV2x1.0-based model is only 8% higher than that of Mobile-SOAP, but its number of parameters and multiply-accumulate operations (MACs) are respectively 61.6% and 60.8% less. △ Less

Submitted 13 July, 2023; originally announced July 2023.

arXiv:2306.13699 [pdf, other]

Curvature-enhanced Graph Convolutional Network for Biomolecular Interaction Prediction

Authors: Cong Shen, Pingjian Ding, Junjie Wee, Jialin Bi, Jiawei Luo, Kelin Xia

Abstract: Geometric deep learning has demonstrated a great potential in non-Euclidean data analysis. The incorporation of geometric insights into learning architecture is vital to its success. Here we propose a curvature-enhanced graph convolutional network (CGCN) for biomolecular interaction prediction, for the first time. Our CGCN employs Ollivier-Ricci curvature (ORC) to characterize network local struct… ▽ More Geometric deep learning has demonstrated a great potential in non-Euclidean data analysis. The incorporation of geometric insights into learning architecture is vital to its success. Here we propose a curvature-enhanced graph convolutional network (CGCN) for biomolecular interaction prediction, for the first time. Our CGCN employs Ollivier-Ricci curvature (ORC) to characterize network local structures and to enhance the learning capability of GCNs. More specifically, ORCs are evaluated based on the local topology from node neighborhoods, and further used as weights for the feature aggregation in message-passing procedure. Our CGCN model is extensively validated on fourteen real-world bimolecular interaction networks and a series of simulated data. It has been found that our CGCN can achieve the state-of-the-art results. It outperforms all existing models, as far as we know, in thirteen out of the fourteen real-world datasets and ranks as the second in the rest one. The results from the simulated data show that our CGCN model is superior to the traditional GCN models regardless of the positive-to-negativecurvature ratios, network densities, and network sizes (when larger than 500). △ Less

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2306.07470 [pdf, other]

Reviving Shift Equivariance in Vision Transformers

Authors: Peijian Ding, Davit Soselia, Thomas Armstrong, Jiahao Su, Furong Huang

Abstract: Shift equivariance is a fundamental principle that governs how we perceive the world - our recognition of an object remains invariant with respect to shifts. Transformers have gained immense popularity due to their effectiveness in both language and vision tasks. While the self-attention operator in vision transformers (ViT) is permutation-equivariant and thus shift-equivariant, patch embedding, p… ▽ More Shift equivariance is a fundamental principle that governs how we perceive the world - our recognition of an object remains invariant with respect to shifts. Transformers have gained immense popularity due to their effectiveness in both language and vision tasks. While the self-attention operator in vision transformers (ViT) is permutation-equivariant and thus shift-equivariant, patch embedding, positional encoding, and subsampled attention in ViT variants can disrupt this property, resulting in inconsistent predictions even under small shift perturbations. Although there is a growing trend in incorporating the inductive bias of convolutional neural networks (CNNs) into vision transformers, it does not fully address the issue. We propose an adaptive polyphase anchoring algorithm that can be seamlessly integrated into vision transformer models to ensure shift-equivariance in patch embedding and subsampled attention modules, such as window attention and global subsampled attention. Furthermore, we utilize depth-wise convolution to encode positional information. Our algorithms enable ViT, and its variants such as Twins to achieve 100% consistency with respect to input shift, demonstrate robustness to cropping, flipping, and affine transformations, and maintain consistent predictions even when the original models lose 20 percentage points on average when shifted by just a few pixels with Twins' accuracy dropping from 80.57% to 62.40%. △ Less

Submitted 12 June, 2023; originally announced June 2023.

Comments: 9 pages, 3 figures

arXiv:2306.00812 [pdf, other]

Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

Abstract: With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply a post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccura… ▽ More With fewer feature dimensions, filter banks are often used in light-weight full-band speech enhancement models. In order to further enhance the coarse speech in the sub-band domain, it is necessary to apply a post-filtering for harmonic retrieval. The signal processing-based comb filters used in RNNoise and PercepNet have limited performance and may cause speech quality degradation due to inaccurate fundamental frequency estimation. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator to estimate the discrete fundamental frequencies and a comb filter for harmonic enhancement, which are trained via an end-to-end pattern. The experiments show the advantages of our proposed method over PecepNet and DeepFilterNet. △ Less

Submitted 1 June, 2023; originally announced June 2023.

Comments: accepted by Interspeech 2023

arXiv:2305.19819 [pdf, other]

Study of Boson Stars with Wormhole

Authors: Peng-Bo Ding, Tian-Xiang Ma, Yong-Qiang Wang

Abstract: In this paper, we reconsider the mixed system of BSs with wormholes at their center which performed by complex scalar field and phantom field and study a whole new condition about the potential. Both the symmetric and asymmetric solutions in the two asymptotically flat regions are obtained by using numerical method and we mainly explore the change of the results by varying the parameters of throat… ▽ More In this paper, we reconsider the mixed system of BSs with wormholes at their center which performed by complex scalar field and phantom field and study a whole new condition about the potential. Both the symmetric and asymmetric solutions in the two asymptotically flat regions are obtained by using numerical method and we mainly explore the change of the results by varying the parameters of throats and potential. In ground state, we find there are multiple solutions at certain setting of parameters and with the increase of $η_0$ or decrease of $c$, the results gradually become single-valued functions and these two variables have similar influence to the curve shape of mass $M$ and charge $Q$, furthermore, the asymmetric solutions can turn into the solutions of symmetry at some frequency $ω$ in certain $η_0$ and $c$. However, when it comes to excited state, the properties of solutions of symmetry is similar to the ground state while asymmetrical results exhibit altered characteristics. We also present the geometries of wormhole to investigate the property of this model. △ Less

Submitted 31 May, 2023; originally announced May 2023.

Comments: 23 pages, 17 figures

arXiv:2305.18793 [pdf, other]

A First Course in Causal Inference

Authors: Peng Ding

Abstract: I developed the lecture notes based on my ``Causal Inference'' course at the University of California Berkeley over the past seven years. Since half of the students were undergraduates, my lecture notes only required basic knowledge of probability theory, statistical inference, and linear and logistic regressions. I developed the lecture notes based on my ``Causal Inference'' course at the University of California Berkeley over the past seven years. Since half of the students were undergraduates, my lecture notes only required basic knowledge of probability theory, statistical inference, and linear and logistic regressions. △ Less

Submitted 3 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: 1. Posted the companion R code and datasets on Harvard Dataverse https://doi.org/10.7910/DVN/ZX3VEV; 2. Corrected for many typos; 3. Added the Preface and many new homework problems

arXiv:2305.17643 [pdf, other]

Flexible sensitivity analysis for causal inference in observational studies subject to unmeasured confounding

Authors: Sizhu Lu, Peng Ding

Abstract: Causal inference with observational studies often suffers from unmeasured confounding, yielding biased estimators based on the unconfoundedness assumption. Sensitivity analysis assesses how the causal conclusions change with respect to different degrees of unmeasured confounding. Most existing sensitivity analysis methods work well for specific types of statistical estimation or testing strategies… ▽ More Causal inference with observational studies often suffers from unmeasured confounding, yielding biased estimators based on the unconfoundedness assumption. Sensitivity analysis assesses how the causal conclusions change with respect to different degrees of unmeasured confounding. Most existing sensitivity analysis methods work well for specific types of statistical estimation or testing strategies. We propose a flexible sensitivity analysis framework that can deal with commonly used inverse probability weighting, outcome regression, and doubly robust estimators simultaneously. It is based on the well-known parametrization of the selection bias as comparisons of the observed and counterfactual outcomes conditional on observed covariates. It is attractive for practical use because it only requires simple modifications of the standard estimators. Moreover, it naturally extends to many other causal inference settings, including the causal risk ratio or odds ratio, the average causal effect on the treated units, and studies with survival outcomes. We also develop an R package saci to implement our sensitivity analysis estimators. △ Less

Submitted 29 March, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

arXiv:2305.04496 [pdf, other]

Boson star with parity-odd symmetry in wormhole spacetime

Authors: Yuan Yue, Peng-Bo Ding, Yong-Qiang Wang

Abstract: In this paper, we revisit the model of bosonic matter in the form of a free complex scalar field with a nontrivial wormhole spacetime topology supported by a free phantom field. We obtain a new type of boson star with wormhole solutions, in which the complex scalar field possess full parity-odd symmetry with respect to the two asymptotically flat spacetime regions. When the size of the throat is s… ▽ More In this paper, we revisit the model of bosonic matter in the form of a free complex scalar field with a nontrivial wormhole spacetime topology supported by a free phantom field. We obtain a new type of boson star with wormhole solutions, in which the complex scalar field possess full parity-odd symmetry with respect to the two asymptotically flat spacetime regions. When the size of the throat is small, The behavior of boson stars with wormhole approaches that of boson stars. When the size of the throat is intermediate, the typical spiraling dependence of the mass and the particle number on the frequency of the boson stars is replaced by a loop structure. However, as the size becomes relatively large, the loop structure will also disappear. In particular, The complex scalar field could form two boson stars with opposite phase differences with respect to the two spacetime regions in the limit of vanishing throat size. We analyze the properties of this new type of boson stars with wormhole and further show that the wormhole spacetime geometry. △ Less

Submitted 8 May, 2023; originally announced May 2023.

Comments: 9 pages, 5 figures

arXiv:2304.10890 [pdf, other]

doi 10.1103/PhysRevB.108.054442

Magnetic properties of a spin-orbit entangled Jeff = 1/2 honeycomb lattice

Authors: J. Khatua, Q. P. Ding, M. S. Ramachandra Rao, K. Y. Choi, A. Zorko, Y. Furukawa, P. Khuntia

Abstract: The interplay between spin-orbit coupling, anisotropic magnetic interaction, frustration-induced quantum fluctuations and spin correlations can lead to novel quantum states with exotic excitations in rare-earth-based quantum magnets. Herein, we present the crystal structure, magnetization, electron spin resonance (ESR), specific heat, and nuclear magnetic resonance (NMR) experiments on the polycry… ▽ More The interplay between spin-orbit coupling, anisotropic magnetic interaction, frustration-induced quantum fluctuations and spin correlations can lead to novel quantum states with exotic excitations in rare-earth-based quantum magnets. Herein, we present the crystal structure, magnetization, electron spin resonance (ESR), specific heat, and nuclear magnetic resonance (NMR) experiments on the polycrystalline samples of Ba9Yb2Si6O24, in which Yb3+ ions form a perfect honeycomb lattice without detectable anti-site disorder. The magnetization data reveal antiferromagnetically coupled spin-orbit entangled Jeff = 1/2 degrees of freedom of Yb3+ ions in the Kramers doublet state. The ESR measurements reveal that the first excited Kramers doublet is 32.3(7) meV above the ground state. The specific heat results suggest the absence of any long-range magnetic order in the measured temperature range. Furthermore, the 29Si NMR results do not indicate any signature of magnetic ordering down to 1.6 K, and the spin-lattice relaxation rate reveals the presence of a field-induced gap that is attributed to the Zeeman splitting of Kramers doublet state in this quantum material. Our experiments detect neither spin freezing nor long-range magnetic ordering down to 1.6 K. The current results suggest the presence of short-range spin correlations in this spin-orbit entangled Jeff =1/2 rare-earth magnet on a honeycomb lattice. △ Less

Submitted 20 August, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

Journal ref: Phys. Rev. B 108, 054442 (2023)

arXiv:2304.10387 [pdf, other]

doi 10.3389/fnins.2023.1213265

Implantable Photonic Neural Probes with 3D-Printed Microfluidics and Applications to Uncaging

Authors: Xin Mu, Fu-Der Chen, Ka My Dang, Michael G. K. Brunk, Jianfeng Li, Hannes Wahn, Andrei Stalmashonak, Peisheng Ding, Xianshu Luo, Hongyao Chua, Guo-Qiang Lo, Joyce K. S. Poon, Wesley D. Sacher

Abstract: Advances in chip-scale photonic-electronic integration are enabling a new generation of foundry-manufacturable implantable silicon neural probes incorporating nanophotonic waveguides and microelectrodes for optogenetic stimulation and electrophysiological recording in neuroscience research. Further extending neural probe functionalities with integrated microfluidics is a direct approach to achieve… ▽ More Advances in chip-scale photonic-electronic integration are enabling a new generation of foundry-manufacturable implantable silicon neural probes incorporating nanophotonic waveguides and microelectrodes for optogenetic stimulation and electrophysiological recording in neuroscience research. Further extending neural probe functionalities with integrated microfluidics is a direct approach to achieve neurochemical injection and sampling capabilities. In this work, we use two-photon polymerization 3D printing to integrate microfluidic channels onto photonic neural probes, which include silicon nitride nanophotonic waveguides and grating emitters. The customizability of 3D printing enables a unique geometry of microfluidics that conforms to the shape of each neural probe, enabling integration of microfluidics with a variety of existing neural probes while avoiding the complexities of monolithic microfluidics integration. We demonstrate the photonic and fluidic functionalities of the neural probes via fluorescein injection in agarose gel and photoloysis of caged fluorescein in solution and in flxed brain tissue. △ Less

Submitted 25 April, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

arXiv:2304.04540 [pdf, other]

FreConv: Frequency Branch-and-Integration Convolutional Networks

Authors: Zhaowen Li, Xu Zhao, Peigeng Ding, Zongxin Gao, Yuting Yang, Ming Tang, Jinqiao Wang

Abstract: Recent researches indicate that utilizing the frequency information of input data can enhance the performance of networks. However, the existing popular convolutional structure is not designed specifically for utilizing the frequency information contained in datasets. In this paper, we propose a novel and effective module, named FreConv (frequency branch-and-integration convolution), to replace th… ▽ More Recent researches indicate that utilizing the frequency information of input data can enhance the performance of networks. However, the existing popular convolutional structure is not designed specifically for utilizing the frequency information contained in datasets. In this paper, we propose a novel and effective module, named FreConv (frequency branch-and-integration convolution), to replace the vanilla convolution. FreConv adopts a dual-branch architecture to extract and integrate high- and low-frequency information. In the high-frequency branch, a derivative-filter-like architecture is designed to extract the high-frequency information while a light extractor is employed in the low-frequency branch because the low-frequency information is usually redundant. FreConv is able to exploit the frequency information of input data in a more reasonable way to enhance feature representation ability and reduce the memory and computational cost significantly. Without any bells and whistles, experimental results on various tasks demonstrate that FreConv-equipped networks consistently outperform state-of-the-art baselines. △ Less

Submitted 10 April, 2023; originally announced April 2023.

Comments: Accepted by ICME2023

arXiv:2303.17102 [pdf, other]

When is the estimated propensity score better? High-dimensional analysis and bias correction

Authors: Fangzhou Su, Wenlong Mou, Peng Ding, Martin J. Wainwright

Abstract: Anecdotally, using an estimated propensity score is superior to the true propensity score in estimating the average treatment effect based on observational data. However, this claim comes with several qualifications: it holds only if propensity score model is correctly specified and the number of covariates $d$ is small relative to the sample size $n$. We revisit this phenomenon by studying the in… ▽ More Anecdotally, using an estimated propensity score is superior to the true propensity score in estimating the average treatment effect based on observational data. However, this claim comes with several qualifications: it holds only if propensity score model is correctly specified and the number of covariates $d$ is small relative to the sample size $n$. We revisit this phenomenon by studying the inverse propensity score weighting (IPW) estimator based on a logistic model with a diverging number of covariates. We first show that the IPW estimator based on the estimated propensity score is consistent and asymptotically normal with smaller variance than the oracle IPW estimator (using the true propensity score) if and only if $n \gtrsim d^2$. We then propose a debiased IPW estimator that achieves the same guarantees in the regime $n \gtrsim d^{3/2}$. Our proofs rely on a novel non-asymptotic decomposition of the IPW error along with careful control of the higher order terms. △ Less

Submitted 29 March, 2023; originally announced March 2023.

Comments: Fangzhou Su and Wenlong Mou contributed equally to this work

Showing 1–50 of 232 results for author: Ding, P