Showing 1–30 of 30 results for author: Dwivedi, R

  1. arXiv:2404.12290  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Debiased Distribution Compression

    Authors: Lingxiao Li, Raaz Dwivedi, Lester Mackey

    Abstract: Modern compression methods can summarize a target distribution $\mathbb{P}$ more succinctly than i.i.d. sampling but require access to a low-bias input sequence like a Markov chain converging quickly to $\mathbb{P}$. We introduce a new suite of compression methods suitable for compression with biased input sequences. Given $n$ points targeting the wrong distribution and quadratic time, Stein kerne…

    Submitted 26 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Accepted to ICML 2024

  2. arXiv:2404.08277  [pdf, other]

    cs.CV

    FaceFilterSense: A Filter-Resistant Face Recognition and Facial Attribute Analysis Framework

    Authors: Shubham Tiwari, Yash Sethia, Ritesh Kumar, Ashwani Tanwar, Rudresh Dwivedi

    Abstract: With the advent of social media, fun selfie filters have come into tremendous mainstream use affecting the functioning of facial biometric systems as well as image recognition systems. These filters vary from beautification filters and Augmented Reality (AR)-based filters to filters that modify facial landmarks. Hence, there is a need to assess the impact of such filters on the performance of exis…

    Submitted 18 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  3. arXiv:2404.06619  [pdf, other]

    cs.CL cs.CY cs.LG

    FairPair: A Robust Evaluation of Biases in Language Models through Paired Perturbations

    Authors: Jane Dwivedi-Yu, Raaz Dwivedi, Timo Schick

    Abstract: The accurate evaluation of differential treatment in language models to specific groups is critical to ensuring a positive and safe user experience. An ideal evaluation should have the properties of being robust, extendable to new groups or attributes, and being able to capture biases that appear in typical usage (rather than just extreme, rare cases). Relatedly, bias evaluation should surface not…

    Submitted 9 April, 2024; originally announced April 2024.

  4. arXiv:2402.11652  [pdf, other]

    econ.EM cs.LG stat.ME stat.ML

    Doubly Robust Inference in Causal Latent Factor Models

    Authors: Alberto Abadie, Anish Agarwal, Raaz Dwivedi, Abhin Shah

    Abstract: This article introduces a new estimator of average treatment effects under unobserved confounding in modern data-rich environments featuring large numbers of units and outcomes. The proposed estimator is doubly robust, combining outcome imputation, inverse probability weighting, and a novel cross-fitting procedure for matrix completion. We derive finite-sample and asymptotic guarantees, and show t…

    Submitted 15 April, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  5. arXiv:2310.19452  [pdf, other]

    cs.CR

    Incorporating Zero-Knowledge Succinct Non-interactive Argument of Knowledge for Blockchain-based Identity Management with off-chain computations

    Authors: Pranay Kothari, Deepak Chopra, Manjot Singh, Shivam Bhardwaj, Rudresh Dwivedi

    Abstract: In today's world, secure and efficient biometric authentication is of keen importance. Traditional authentication methods are no longer considered reliable due to their susceptibility to cyber-attacks. Biometric authentication, particularly fingerprint authentication, has emerged as a promising alternative, but it raises concerns about the storage and use of biometric data, as well as centralized…

    Submitted 30 October, 2023; originally announced October 2023.

  6. arXiv:2308.05950  [pdf, other]

    cs.DC cs.CR

    Blockchain-Based Transferable Digital Rights of Land

    Authors: Ras Dwivedi, Sumit Patel, Prof. Sandeep Shukla

    Abstract: Land, being a scarce and valuable resource, is in high demand, especially in densely populated areas of older cities. Development authorities require land for infrastructure projects and other amenities, while landowners hold onto their land for both its usage and its financial value. Transferable Development Rights (TDRs) serve as a mechanism to separate the development rights associated with the…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 5 pages, Paper presented in https://easychair.org/cfp/ICSF2023

  7. arXiv:2304.14509  [pdf, other]

    cs.CV cs.AI

    An Efficient Ensemble Explainable AI (XAI) Approach for Morphed Face Detection

    Authors: Rudresh Dwivedi, Ritesh Kumar, Deepak Chopra, Pranay Kothari, Manjot Singh

    Abstract: The extensive utilization of biometric authentication systems has prompted attackers and imposters to forge user identity based on morphed images. In this attack, a synthetic image is produced and merged with a genuine one. Next, the resultant image is used for authentication. Numerous deep neural convolutional architectures have been proposed in the literature for face Morphing Attack Detection (MADs) to pr…

    Submitted 23 April, 2023; originally announced April 2023.

  8. arXiv:2304.05365  [pdf, other]

    cs.LG stat.AP stat.ME stat.ML

    Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling

    Authors: Susobhan Ghosh, Raphael Kim, Prasidh Chhabria, Raaz Dwivedi, Predrag Klasnja, Peng Liao, Kelly Zhang, Susan Murphy

    Abstract: There is a growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user's context (e.g., prior activity level, location, etc.). Online RL is a promising data-driven approach for this pro…

    Submitted 7 August, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: The first two authors contributed equally

  9. arXiv:2301.05974  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Compress Then Test: Powerful Kernel Testing in Near-linear Time

    Authors: Carles Domingo-Enrich, Raaz Dwivedi, Lester Mackey

    Abstract: Kernel two-sample testing provides a powerful framework for distinguishing any pair of distributions based on $n$ sample points. However, existing kernel tests either run in $n^2$ time or sacrifice undue power to improve runtime. To address these shortcomings, we introduce Compress Then Test (CTT), a new framework for high-powered kernel testing based on sample compression. CTT cheaply approximate…

    Submitted 23 February, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: Accepted as a paper at AISTATS 2023

  10. arXiv:2211.14297  [pdf, ps, other]

    stat.ML cs.LG

    Doubly robust nearest neighbors in factor models

    Authors: Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah

    Abstract: We introduce and analyze an improved variant of nearest neighbors (NN) for estimation with missing data in latent factor models. We consider a matrix completion problem with missing data, where the $(i, t)$-th entry, when observed, is given by its mean $f(u_i, v_t)$ plus mean-zero noise for an unknown function $f$ and latent factors $u_i$ and $v_t$. Prior NN strategies, like unit-unit NN, for esti…

    Submitted 29 January, 2024; v1 submitted 25 November, 2022; originally announced November 2022.

  11. arXiv:2211.08209  [pdf, other]

    cs.LG stat.ME

    On counterfactual inference with unobserved confounding

    Authors: Abhin Shah, Raaz Dwivedi, Devavrat Shah, Gregory W. Wornell

    Abstract: Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit using only one $p$-dimensional sample per unit containing covariates, interventions, and outcomes. Specifically, we allow for unobserved confounding that introduces statistical biases between interventions and outcomes as well as exacerbates the heterogeneit…

    Submitted 14 September, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

  12. arXiv:2202.06891  [pdf, other]

    stat.ML cs.LG

    Counterfactual inference for sequential experiments

    Authors: Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah

    Abstract: We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale -- mean outcome under different treatments for each unit and each time -- with minimal assumpt…

    Submitted 16 April, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

  13. arXiv:2111.07941  [pdf, other]

    stat.ML cs.DS cs.LG math.ST stat.ME

    Distribution Compression in Near-linear Time

    Authors: Abhishek Shetty, Raaz Dwivedi, Lester Mackey

    Abstract: In distribution compression, one aims to accurately summarize a probability distribution $\mathbb{P}$ using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling $n$ points from a Markov chain and identifying $\sqrt{n}$ points with $\widetilde{\mathcal{O}}(1/\sqrt{n})$ discrepancy to $\mathbb{P}$. Unfortunately, these algorithms suffer from quadrat…

    Submitted 17 October, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

    Comments: Accepted to ICLR 2022; An outdated proof of Theorem 2 was previously included in the appendix; this oversight is corrected in this version

  14. arXiv:2110.01593  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Generalized Kernel Thinning

    Authors: Raaz Dwivedi, Lester Mackey

    Abstract: The kernel thinning (KT) algorithm of Dwivedi and Mackey (2021) compresses a probability distribution more effectively than independent sampling by targeting a reproducing kernel Hilbert space (RKHS) and leveraging a less smooth square-root kernel. Here we provide four improvements. First, we show that KT applied directly to the target RKHS yields tighter, dimension-free guarantees for any kernel,…

    Submitted 19 July, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: Published in ICLR 2022

  15. arXiv:2105.05842  [pdf, other]

    stat.ML cs.LG math.ST stat.CO stat.ME

    Kernel Thinning

    Authors: Raaz Dwivedi, Lester Mackey

    Abstract: We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}_{\star}$ and $O(n^2)$ time, kernel thinning compresses an $n$-point approximation to $\mathbb{P}$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated…

    Submitted 11 May, 2024; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: Accepted for presentation as an extended abstract at the Conference on Learning Theory (COLT) 2021, and published in the Journal of Machine Learning Research (JMLR) 2024

  16. arXiv:2103.03096  [pdf, other]

    cs.CV cs.AI

    Towards Designing Computer Vision-based Explainable-AI Solution: A Use Case of Livestock Mart Industry

    Authors: Devam Dave, Het Naik, Smiti Singhal, Rudresh Dwivedi, Pankesh Patel

    Abstract: The objective of an online Mart is to match buyers and sellers, to weigh animals and to oversee their sale. A reliable pricing method can be developed by ML models that can read through historical sales data. However, when AI models suggest or recommend a price, that in itself does not reveal too much (i.e., it acts like a black box) about the qualities and the abilities of an animal. An intereste…

    Submitted 8 February, 2021; originally announced March 2021.

    Comments: 8 pages, 5 figures

  17. arXiv:2010.01499  [pdf]

    cs.CV cs.LG

    A New Mask R-CNN Based Method for Improved Landslide Detection

    Authors: Silvia Liberata Ullo, Amrita Mohan, Alessandro Sebastianelli, Shaik Ejaz Ahamed, Basant Kumar, Ramji Dwivedi, G. R. Sinha

    Abstract: This paper presents a novel method of landslide detection by exploiting the Mask R-CNN capability of identifying an object layout by using a pixel-based segmentation, along with transfer learning used to train the proposed model. A data set of 160 elements is created containing landslide and non-landslide images. The proposed method consists of three steps: (i) augmenting training image samples to…

    Submitted 4 October, 2020; originally announced October 2020.

    Comments: 9 pages, 8 figures, 6 tables, submitted to JSTARS special issue on Cultural Heritage

  18. arXiv:2008.10109  [pdf, other]

    stat.ME cs.LG stat.AP

    Stable discovery of interpretable subgroups via calibration in causal studies

    Authors: Raaz Dwivedi, Yan Shuo Tan, Briton Park, Mian Wei, Kevin Horgan, David Madigan, Bin Yu

    Abstract: Building on Yu and Kumbier's PCS framework and for randomized experiments, we introduce a novel methodology for Stable Discovery of Interpretable Subgroups via Calibration (StaDISC), with large heterogeneous treatment effects. StaDISC was developed during our re-analysis of the 1999-2000 VIGOR study, an 8076 patient randomized controlled trial (RCT), that compared the risk of adverse events from a…

    Submitted 28 September, 2020; v1 submitted 23 August, 2020; originally announced August 2020.

    Comments: Raaz Dwivedi and Yan Shuo Tan are joint first authors and contributed equally to this work. 52 pages, 8 Figures, 9 Tables. To appear in International Statistical Review, 2020

  19. arXiv:2006.10189  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    Revisiting minimum description length complexity in overparameterized models

    Authors: Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright

    Abstract: Complexity is a fundamental concept underlying statistical learning theory that aims to inform generalization performance. Parameter count, while successful in low-dimensional settings, is not well-justified for overparameterized settings when the number of parameters is more than the number of training samples. We revisit complexity measures based on Rissanen's principle of minimum description le…

    Submitted 12 October, 2023; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: First two authors contributed equally

  20. arXiv:2005.11411  [pdf, other]

    cs.LG math.ST stat.ML

    Instability, Computational Efficiency and Statistical Accuracy

    Authors: Nhat Ho, Koulik Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu

    Abstract: Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an important special case. The limiting performance of such estimators depends on the properties of the population-level operator in the idealized limit of infinitely many samples. We develop a general framework that yields bounds on statistical accurac…

    Submitted 20 March, 2022; v1 submitted 22 May, 2020; originally announced May 2020.

    Comments: 68 pages, 6 Figures, 2 Tables. First three authors contributed equally

  21. Curating a COVID-19 data repository and forecasting county-level death counts in the United States

    Authors: Nick Altieri, Rebecca L. Barter, James Duncan, Raaz Dwivedi, Karl Kumbier, Xiao Li, Robert Netzorg, Briton Park, Chandan Singh, Yan Shuo Tan, Tiffany Tang, Yu Wang, Chao Zhang, Bin Yu

    Abstract: As the COVID-19 outbreak evolves, accurate forecasting continues to play an extremely important role in informing policy decisions. In this paper, we present our continuous curation of a large data repository containing COVID-19 information from a range of sources. We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative de…

    Submitted 9 August, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: Authors ordered alphabetically. All authors contributed significantly to this work. All collected data, modeling code, forecasts, and visualizations are updated daily and available at https://github.com/Yu-Group/covid19-severity-prediction

    Journal ref: Published in Harvard Data Science Review, 2020

  22. arXiv:1907.09764  [pdf, other]

    nucl-th cs.LG nucl-ex stat.ML

    Trees and Islands -- Machine learning approach to nuclear physics

    Authors: Nishchal R. Dwivedi

    Abstract: We apply machine learning algorithms to nuclear data. These algorithms are purely data driven and generate models that are capable of capturing intricate trends. A gradient boosted trees algorithm is employed to generate a trained model from existing nuclear data, which is used to predict damping parameter, shell correction energies, quadrupole deformation, pairing gaps, level dens…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: 8 Figures, 2 Tables

  23. arXiv:1905.12247  [pdf, other]

    stat.ML cs.LG stat.CO

    Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

    Authors: Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu

    Abstract: Hamiltonian Monte Carlo (HMC) is a state-of-the-art Markov chain Monte Carlo sampling algorithm for drawing samples from smooth probability densities over continuous spaces. We study the variant most widely used in practice, Metropolized HMC with the Störmer-Verlet or leapfrog integrator, and make two primary contributions. First, we provide a non-asymptotic upper bound on the mixing time of the M…

    Submitted 11 January, 2021; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 73 pages, 2 figures, fixed a mistake in the proof of Lemma 11, accepted in JMLR

  24. arXiv:1902.00194  [pdf, other]

    math.ST cs.LG stat.ML

    Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

    Authors: Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan, Bin Yu

    Abstract: We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{- \frac{1}{2}}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identif…

    Submitted 15 November, 2021; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: 30 pages, 4 figures. The first three authors contributed equally to this work. To appear in AISTATS 2020

  25. A non-invertible cancelable fingerprint template generation based on ridge feature transformation

    Authors: Rudresh Dwivedi, Somnath Dey

    Abstract: In a biometric verification system, leakage of biometric data leads to permanent identity loss since original biometric data is inherently linked to a user. Further, various types of attacks on a biometric system may reveal the original template and utility in other applications. To address these security and privacy concerns cancelable biometric has been introduced. Cancelable biometric construct…

    Submitted 28 May, 2018; originally announced May 2018.

  26. A novel hybrid score level and decision level fusion scheme for cancelable multi-biometric verification

    Authors: Rudresh Dwivedi, Somnath Dey

    Abstract: In spite of the benefits of biometric-based authentication systems, there are few concerns raised because of the sensitivity of biometric data to outliers, low performance caused due to intra-class variations and privacy invasion caused by information leakage. To address these issues, we propose a hybrid fusion framework where only the protected modalities are combined to fulfill the requirement o…

    Submitted 26 May, 2018; originally announced May 2018.

  27. arXiv:1805.10108  [pdf, other]

    cs.CV

    Generating protected fingerprint template utilizing coprime mapping transformation

    Authors: Rudresh Dwivedi, Somnath Dey

    Abstract: The identity of a user is permanently lost if biometric data gets compromised since the biometric information is irreplaceable and irrevocable. To revoke and reissue a new template in place of the compromised biometric template, the idea of cancelable biometrics has been introduced. The concept behind cancelable biometric is to irreversibly transform the original biometric template and perform the…

    Submitted 25 May, 2018; originally announced May 2018.

  28. arXiv:1805.08399  [pdf, other]

    cs.CR

    A fingerprint based crypto-biometric system for secure communication

    Authors: Rudresh Dwivedi, Somnath Dey, Mukul Anand Sharma, Apurv Goel

    Abstract: To ensure the secure transmission of data, cryptography is treated as the most effective solution. Cryptographic key is an important entity in this procedure. In general, randomly generated cryptographic key (of 256 bits) is difficult to remember. However, such a key needs to be stored in a protected place or transported through a shared communication line which, in fact, poses another threat to s…

    Submitted 22 May, 2018; originally announced May 2018.

    Comments: 29 single column pages, 8 figures

  29. arXiv:1608.02895  [pdf, other]

    math.PR cs.DS

    The power of online thinning in reducing discrepancy

    Authors: Raaz Dwivedi, Ohad N. Feldheim, Ori Gurel-Gurevich, Aaditya Ramdas

    Abstract: Consider an infinite sequence of independent, uniformly chosen points from $[0,1]^d$. After looking at each point in the sequence, an overseer is allowed to either keep it or reject it, and this choice may depend on the locations of all previously kept points. However, the overseer must keep at least one of every two consecutive points. We call a sequence generated in this fashion a \emph{two-thin…

    Submitted 4 September, 2017; v1 submitted 9 August, 2016; originally announced August 2016.

    Comments: 22 pages, 3 figures. Expanded version including multidimensional results. Some results regarding 1+β thinning were deferred to a separate paper

    MSC Class: 68W27; 60D05; 60G55 ACM Class: F.2.2; G.3

  30. arXiv:1008.0336  [pdf]

    cs.LG

    Close Clustering Based Automated Color Image Annotation

    Authors: Ankit Garg, Rahul Dwivedi, Krishna Asawa

    Abstract: Most image-search approaches today are based on the text based tags associated with the images which are mostly human generated and are subject to various kinds of errors. The results of a query to the image database thus can often be misleading and may not satisfy the requirements of the user. In this work we propose our approach to automate this tagging process of images, where image results gen…

    Submitted 2 August, 2010; originally announced August 2010.