Showing 1–42 of 42 results for author: Greenewald, K

  1. arXiv:2407.00066  [pdf, other]

    cs.DC cs.AI cs.CL cs.LG

    Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

    Authors: Rickard Brüel-Gabrielsson, Jiacheng Zhu, Onkar Bhardwaj, Leshem Choshen, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

    Abstract: Fine-tuning large language models (LLMs) with low-rank adapters (LoRAs) has become common practice, often yielding numerous copies of the same LLM differing only in their LoRA updates. This paradigm presents challenges for systems that serve real-time responses to queries that each involve a different LoRA. Prior works optimize the design of such systems but still require continuous loading and of…

    Submitted 17 June, 2024; originally announced July 2024.

  2. arXiv:2406.07475  [pdf, other]

    cs.LG stat.ML

    Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior

    Authors: Anming Gu, Edward Chien, Kristjan Greenewald

    Abstract: Trajectory inference seeks to recover the temporal dynamics of a population from snapshots of its (uncoupled) temporal marginals, i.e. where observed particles are not tracked over time. Lavenant et al. arXiv:2102.09204 addressed this challenging problem under a stochastic differential equation (SDE) model with a gradient-driven drift in the observed space, introducing a minimum entropy estimator…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 32 pages, 9 figures

  3. arXiv:2406.06425  [pdf, other]

    stat.ML cs.LG math.ST

    Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking

    Authors: Gabriel Rioux, Apoorva Nitsure, Mattia Rigotti, Kristjan Greenewald, Youssef Mroueh

    Abstract: Stochastic dominance is an important concept in probability theory, econometrics and social choice theory for robustly modeling agents' preferences between random outcomes. While many works have been dedicated to the univariate case, little has been done in the multivariate scenario, wherein an agent has to decide between different multivariate outcomes. By exploiting a characterization of multiva…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 27 pages, 2 figures

  4. arXiv:2406.05882  [pdf, other]

    cs.LG stat.ML

    Distributional Preference Alignment of LLMs via Optimal Transport

    Authors: Igor Melnyk, Youssef Mroueh, Brian Belgodere, Mattia Rigotti, Apoorva Nitsure, Mikhail Yurochkin, Kristjan Greenewald, Jiri Navratil, Jerret Ross

    Abstract: Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically…

    Submitted 9 June, 2024; originally announced June 2024.

  5. arXiv:2406.04047  [pdf, other]

    stat.ML cs.LG

    Slicing Mutual Information Generalization Bounds for Neural Networks

    Authors: Kimia Nadjahi, Kristjan Greenewald, Rickard Brüel Gabrielsson, Justin Solomon

    Abstract: The ability of machine learning (ML) algorithms to generalize well to unseen data has been studied through the lens of information theory, by bounding the generalization error with the input-output mutual information (MI), i.e., the MI between the training data and the learned hypothesis. Yet, these bounds have limited practicality for modern ML applications (e.g., deep learning), due to the diffi…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  6. arXiv:2405.15891  [pdf, other]

    cs.CV cs.GR cs.LG

    Score Distillation via Reparametrized DDIM

    Authors: Artem Lukoianov, Haitz Sáez de Ocáriz Borde, Kristjan Greenewald, Vitor Campagnolo Guizilini, Timur Bagautdinov, Vincent Sitzmann, Justin Solomon

    Abstract: While 2D diffusion models generate realistic, high-detail images, 3D shape generation methods like Score Distillation Sampling (SDS) built on these 2D diffusion models produce cartoon-like, over-smoothed shapes. To help explain this discrepancy, we show that the image guidance used in Score Distillation can be understood as the velocity field of a 2D denoising generative process, up to the choice…

    Submitted 13 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Preprint. 25 pages, 26 figures. Revision: added missed comparisons, fixed typos, fixed PDF compatibility issues

  7. arXiv:2404.10095  [pdf, other]

    cs.CY cs.CR cs.DS

    Synthetic Census Data Generation via Multidimensional Multiset Sum

    Authors: Cynthia Dwork, Kristjan Greenewald, Manish Raghavan

    Abstract: The US Decennial Census provides valuable data for both research and policy purposes. Census data are subject to a variety of disclosure avoidance techniques prior to release in order to preserve respondent confidentiality. While many are interested in studying the impacts of disclosure avoidance methods on downstream analyses, particularly with the introduction of differential privacy in the 2020…

    Submitted 15 April, 2024; originally announced April 2024.

  8. arXiv:2403.08819  [pdf, other]

    cs.LG cs.CL stat.ML

    Thermometer: Towards Universal Calibration for Large Language Models

    Authors: Maohao Shen, Subhro Das, Kristjan Greenewald, Prasanna Sattigeri, Gregory Wornell, Soumya Ghosh

    Abstract: We consider the issue of calibration in large language models (LLM). Recent studies have found that common interventions such as instruction tuning often result in poorly calibrated LLMs. Although calibration is well-explored in traditional applications, calibrating LLMs is uniquely challenging. These challenges stem as much from the severe computational requirements of LLMs as from their versatil…

    Submitted 27 June, 2024; v1 submitted 19 February, 2024; originally announced March 2024.

    Comments: Camera ready version for ICML 2024

  9. arXiv:2402.16842  [pdf, other]

    cs.LG

    Asymmetry in Low-Rank Adapters of Foundation Models

    Authors: Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

    Abstract: Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. Specifically,…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages, 2 figures, 9 tables

  10. arXiv:2402.02006  [pdf, other]

    cs.LG

    PresAIse, A Prescriptive AI Solution for Enterprises

    Authors: Wei Sun, Scott McFaddin, Linh Ha Tran, Shivaram Subramanian, Kristjan Greenewald, Yeshi Tenzin, Zack Xue, Youssef Drissi, Markus Ettl

    Abstract: Prescriptive AI represents a transformative shift in decision-making, offering causal insights and actionable recommendations. Despite its huge potential, enterprise adoption often faces several challenges. The first challenge is caused by the limitations of observational data for accurate causal inference which is typically a prerequisite for good decision-making. The second pertains to the inter…

    Submitted 12 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 14 pages

  11. arXiv:2310.07132  [pdf, other]

    cs.LG math.ST q-fin.RM stat.ML

    Risk Aware Benchmarking of Large Language Models

    Authors: Apoorva Nitsure, Youssef Mroueh, Mattia Rigotti, Kristjan Greenewald, Brian Belgodere, Mikhail Yurochkin, Jiri Navratil, Igor Melnyk, Jerret Ross

    Abstract: We propose a distributional framework for benchmarking socio-technical risks of foundation models with quantified statistical significance. Our approach hinges on a new statistical relative testing based on first and second order stochastic dominance of real random variables. We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and math…

    Submitted 9 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: ICML 2024

  12. arXiv:2309.16200  [pdf, other]

    cs.LG cs.IT

    Max-Sliced Mutual Information

    Authors: Dor Tsur, Ziv Goldfeld, Kristjan Greenewald

    Abstract: Quantifying the dependence between high-dimensional random variables is central to statistical learning and inference. Two classical methods are canonical correlation analysis (CCA), which identifies maximally correlated projected versions of the original variables, and Shannon's mutual information, which is a universal dependence measure that also captures high-order dependencies. However, CCA on…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted at NeurIPS 2023

  13. arXiv:2307.06250  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Identifiability Guarantees for Causal Disentanglement from Soft Interventions

    Authors: Jiaqi Zhang, Chandler Squires, Kristjan Greenewald, Akash Srivastava, Karthikeyan Shanmugam, Caroline Uhler

    Abstract: Causal disentanglement aims to uncover a representation of data using latent variables that are interrelated through a causal model. Such a representation is identifiable if the latent model that explains the data is unique. In this paper, we focus on the scenario where unpaired observational and interventional data are available, with each intervention changing the mechanism of a latent variable.…

    Submitted 8 November, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

  14. arXiv:2305.15538  [pdf, other]

    cs.LG cs.CR cs.DB cs.IT

    Post-processing Private Synthetic Data for Improving Utility on Selected Measures

    Authors: Hao Wang, Shivchander Sudalairaj, John Henning, Kristjan Greenewald, Akash Srivastava

    Abstract: Existing private synthetic data generation algorithms are agnostic to downstream tasks. However, end users may have specific requirements that the synthetic data must satisfy. Failure to meet these requirements could significantly reduce the utility of the data for downstream use. We introduce a post-processing technique that improves the utility of the synthetic data with respect to measures sele…

    Submitted 18 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  15. arXiv:2305.04712  [pdf, other]

    cs.IT cs.LG

    High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction

    Authors: Kristjan Greenewald, Brian Kingsbury, Yuancheng Yu

    Abstract: We study the problem of overcoming exponential sample complexity in differential entropy estimation under Gaussian convolutions. Specifically, we consider the estimation of the differential entropy $h(X+Z)$ via $n$ independently and identically distributed samples of $X$, where $X$ and $Z$ are independent $D$-dimensional random variables with $X$ sub-Gaussian with bounded second moment and…

    Submitted 11 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: To appear in ISIT 2023

  16. arXiv:2302.11838  [pdf, other]

    cs.IT cs.DS

    Minimum-Entropy Coupling Approximation Guarantees Beyond the Majorization Barrier

    Authors: Spencer Compton, Dmitriy Katz, Benjamin Qi, Kristjan Greenewald, Murat Kocaoglu

    Abstract: Given a set of discrete probability distributions, the minimum entropy coupling is the minimum entropy joint distribution that has the input distributions as its marginals. This has immediate relevance to tasks such as entropic causal inference for causal graph discovery and bounding mutual information between variables that we observe separately. Since finding the minimum entropy coupling is NP-H…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: AISTATS 2023

  17. arXiv:2210.06759  [pdf, other]

    cs.LG

    Outlier-Robust Group Inference via Gradient Space Clustering

    Authors: Yuchen Zeng, Kristjan Greenewald, Kangwook Lee, Justin Solomon, Mikhail Yurochkin

    Abstract: Traditional machine learning models focus on achieving good performance on the overall training distribution, but they often underperform on minority groups. Existing methods can improve the worst-group performance, but they can have several limitations: (i) they require group annotations, which are often expensive and sometimes infeasible to obtain, and/or (ii) they are sensitive to outliers. Mos…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 17 pages, 6 tables, 8 figures

  18. arXiv:2206.08526  [pdf, other]

    cs.IT stat.ML

    k-Sliced Mutual Information: A Quantitative Study of Scalability with Dimension

    Authors: Ziv Goldfeld, Kristjan Greenewald, Theshani Nuradha, Galen Reeves

    Abstract: Sliced mutual information (SMI) is defined as an average of mutual information (MI) terms between one-dimensional random projections of the random variables. It serves as a surrogate measure of dependence to classic MI that preserves many of its properties but is more scalable to high dimensions. However, a quantitative characterization of how SMI itself and estimation rates thereof depend on the…

    Submitted 14 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted at NeurIPS 2022

  19. arXiv:2202.01671  [pdf, other]

    stat.ML cs.LG

    Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets

    Authors: Tal Shnitzer, Mikhail Yurochkin, Kristjan Greenewald, Justin Solomon

    Abstract: The need for efficiently comparing and representing datasets with unknown alignment spans various fields, from model analysis and comparison in machine learning to trend discovery in collections of medical datasets. We use manifold learning to compare the intrinsic geometric structures of different datasets by comparing their diffusion operators, symmetric positive-definite (SPD) matrices that rel…

    Submitted 11 July, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

    Comments: 23 pages, 9 figures

  20. arXiv:2201.11945  [pdf, other]

    cs.LG

    Learning Proximal Operators to Discover Multiple Optima

    Authors: Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

    Abstract: Finding multiple solutions of non-convex optimization problems is a ubiquitous yet challenging task. Most past algorithms either apply single-solution optimization methods from multiple random initial guesses or search in the vicinity of found solutions using ad hoc heuristics. We present an end-to-end method to learn the proximal operator of a family of training problems so that multiple local mi…

    Submitted 1 March, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

  21. arXiv:2110.05279  [pdf, ps, other]

    cs.IT

    Sliced Mutual Information: A Scalable Measure of Statistical Dependence

    Authors: Ziv Goldfeld, Kristjan Greenewald

    Abstract: Mutual information (MI) is a fundamental measure of statistical dependence, with a myriad of applications to information theory, statistics, and machine learning. While it possesses many desirable structural properties, the estimation of high-dimensional MI from samples suffers from the curse of dimensionality. Motivated by statistical scalability to high dimensions, this paper proposes sliced MI…

    Submitted 18 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

  22. arXiv:2106.03314  [pdf, other]

    cs.LG stat.ML

    Measuring Generalization with Optimal Transport

    Authors: Ching-Yao Chuang, Youssef Mroueh, Kristjan Greenewald, Antonio Torralba, Stefanie Jegelka

    Abstract: Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often behave disparately from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled f…

    Submitted 7 November, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  23. arXiv:2106.02933  [pdf, other]

    cs.LG

    k-Mixup Regularization for Deep Learning via Optimal Transport

    Authors: Kristjan Greenewald, Anming Gu, Mikhail Yurochkin, Justin Solomon, Edward Chien

    Abstract: Mixup is a popular regularization technique for training deep neural networks that improves generalization and increases robustness to certain distribution shifts. It perturbs input training data in the direction of other randomly-chosen instances in the training set. To better leverage the structure of the data, we extend mixup in a simple, broadly applicable way to \emph{$k$-mixup}, which pertur…

    Submitted 7 October, 2023; v1 submitted 5 June, 2021; originally announced June 2021.

  24. arXiv:2102.12731  [pdf, other]

    cs.LG stat.ML

    Improving Approximate Optimal Transport Distances using Quantization

    Authors: Gaspard Beugnot, Aude Genevay, Kristjan Greenewald, Justin Solomon

    Abstract: Optimal transport (OT) is a popular tool in machine learning to compare probability measures geometrically, but it comes with substantial computational burden. Linear programming algorithms for computing OT distances scale cubically in the size of the input, making OT impractical in the large-sample regime. We introduce a practical algorithm, which relies on a quantization step, to estimate OT dis…

    Submitted 23 March, 2022; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: Published in the proceedings of the Conference on Uncertainty in Artificial Intelligence 2021 (UAI)

    Journal ref: PMLR 161:290-300, 2021

  25. arXiv:2101.03501  [pdf, other]

    stat.ML cs.AI cs.IT cs.LG

    Entropic Causal Inference: Identifiability and Finite Sample Results

    Authors: Spencer Compton, Murat Kocaoglu, Kristjan Greenewald, Dmitriy Katz

    Abstract: Entropic causal inference is a framework for inferring the causal direction between two categorical variables from observational data. The central assumption is that the amount of unobserved randomness in the system is not too large. This unobserved randomness is measured by the entropy of the exogenous variable in the underlying structural causal model, which governs the causal relation between t…

    Submitted 10 January, 2021; originally announced January 2021.

    Comments: In Proceedings of NeurIPS 2020

  26. arXiv:2012.06958  [pdf, other]

    math.ST cs.LG math.NA stat.ML

    $k$-Variance: A Clustered Notion of Variance

    Authors: Justin Solomon, Kristjan Greenewald, Haikady N. Nagaraja

    Abstract: We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $K$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to def…

    Submitted 12 December, 2020; originally announced December 2020.

  27. arXiv:2011.01979  [pdf, other]

    stat.ML cs.LG stat.ME

    High-Dimensional Feature Selection for Sample Efficient Treatment Effect Estimation

    Authors: Kristjan Greenewald, Dmitriy Katz-Rogozhnikov, Karthik Shanmugam

    Abstract: The estimation of causal treatment effects from observational data is a fundamental problem in causal inference. To avoid bias, the effect estimator must control for all confounders. Hence practitioners often collect data for as many covariates as possible to raise the chances of including the relevant confounders. While this addresses the bias, this has the side effect of significantly increasing…

    Submitted 3 November, 2020; originally announced November 2020.

  28. arXiv:2011.00641  [pdf, other]

    stat.ME cs.LG stat.ML

    Active Structure Learning of Causal DAGs via Directed Clique Tree

    Authors: Chandler Squires, Sara Magliacane, Kristjan Greenewald, Dmitriy Katz, Murat Kocaoglu, Karthikeyan Shanmugam

    Abstract: A growing body of work has begun to study intervention design for efficient structure learning of causal directed acyclic graphs (DAGs). A typical setting is a causally sufficient setting, i.e. a system with no latent confounders, selection bias, or feedback, when the essential graph of the observational equivalence class (EC) is given as an input and interventions are assumed to be noiseless. Mos…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2020

  29. arXiv:2010.13187  [pdf, other]

    stat.ML cs.CV cs.LG

    Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling

    Authors: Akash Srivastava, Yamini Bansal, Yukun Ding, Cole Lincoln Hurwitz, Kai Xu, Bernhard Egger, Prasanna Sattigeri, Joshua B. Tenenbaum, Phuong Le, Arun Prakash R, Nengfeng Zhou, Joel Vaughan, Yaquan Wang, Anwesha Bhattacharyya, Kristjan Greenewald, David D. Cox, Dan Gutfreund

    Abstract: Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. This approach introduces a trade-off between disentangled representation learning and reconstruction quality since the model does not have enough capacity to learn correlated latent variables that capture…

    Submitted 3 April, 2024; v1 submitted 25 October, 2020; originally announced October 2020.

  30. arXiv:2007.05558  [pdf, other]

    cs.LG stat.ML

    The Computational Limits of Deep Learning

    Authors: Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, Gabriel F. Manso

    Abstract: Deep learning's recent history has been one of achievement: from triumphing over humans in the game of Go to world-leading performance in image classification, voice recognition, translation, and other tasks. But this progress has come with a voracious appetite for computing power. This article catalogs the extent of this dependency, showing that progress across a wide variety of applications is s…

    Submitted 27 July, 2022; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: 33 pages, 8 figures

  31. arXiv:1911.00218  [pdf, other]

    stat.ML cs.LG

    Statistical Model Aggregation via Parameter Matching

    Authors: Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang

    Abstract: We consider the problem of aggregating models learned from sequestered, possibly heterogeneous datasets. Exploiting tools from Bayesian nonparametrics, we develop a general meta-modeling framework that learns shared global latent structures by identifying correspondences among local model parameterizations. Our proposed framework is model-independent and is applicable to a wide range of model type…

    Submitted 1 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019

  32. arXiv:1909.03539  [pdf, other]

    cs.LG cs.AI

    Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity

    Authors: Peng Liao, Kristjan Greenewald, Predrag Klasnja, Susan Murphy

    Abstract: With the recent evolution of mobile health technologies, health scientists are increasingly interested in developing just-in-time adaptive interventions (JITAIs), typically delivered via notification on mobile device and designed to help the user prevent negative health outcomes and promote the adoption and maintenance of healthy behaviors. A JITAI involves a sequence of decision rules (i.e., trea…

    Submitted 8 September, 2019; originally announced September 2019.

  33. arXiv:1906.00313  [pdf, other]

    stat.ML cs.LG

    BreGMN: scaled-Bregman Generative Modeling Networks

    Authors: Akash Srivastava, Kristjan Greenewald, Farzaneh Mirzazadeh

    Abstract: The family of f-divergences is ubiquitously applied to generative modeling in order to adapt the distribution of the model to that of the data. Well-definedness of f-divergences, however, requires the distributions of the data and model to overlap completely in every time step of training. As a result, as soon as the support of distributions of data and model contain non-overlapping portions, grad…

    Submitted 1 June, 2019; originally announced June 2019.

  34. arXiv:1905.13576  [pdf, other]

    math.ST cs.IT

    Convergence of Smoothed Empirical Measures with Applications to Entropy Estimation

    Authors: Ziv Goldfeld, Kristjan Greenewald, Yury Polyanskiy, Jonathan Weed

    Abstract: This paper studies convergence of empirical measures smoothed by a Gaussian kernel. Specifically, consider approximating $P\ast\mathcal{N}_σ$, for $\mathcal{N}_σ\triangleq\mathcal{N}(0,σ^2 \mathrm{I}_d)$, by $\hat{P}_n\ast\mathcal{N}_σ$, where $\hat{P}_n$ is the empirical measure, under different statistical distances. The convergence is examined in terms of the Wasserstein distance, total variati…

    Submitted 1 May, 2020; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1810.11589

  35. arXiv:1905.12022  [pdf, other]

    stat.ML cs.LG

    Bayesian Nonparametric Federated Learning of Neural Networks

    Authors: Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang, Yasaman Khazaeni

    Abstract: In federated learning problems, data is scattered across different servers and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to syn…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: ICML 2019

  36. arXiv:1810.05728  [pdf, other]

    cs.LG stat.ML

    Estimating Information Flow in Deep Neural Networks

    Authors: Ziv Goldfeld, Ewout van den Berg, Kristjan Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy

    Abstract: We study the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the information bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information $I(X;T)$ between the input $X$ and internal representat…

    Submitted 30 May, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Main text accepted to ICML 2019. This preprint contains the full version of that paper (including omitted appendices)

  37. Similarity Function Tracking using Pairwise Comparisons

    Authors: Kristjan Greenewald, Stephen Kelley, Brandon Oselio, Alfred O. Hero III

    Abstract: Recent work in distance metric learning has focused on learning transformations of data that best align with specified pairwise similarity and dissimilarity constraints, often supplied by a human observer. The learned transformations lead to improved retrieval, classification, and clustering algorithms due to the better adapted distance or similarity measures. Here, we address the problem of learn…

    Submitted 6 January, 2017; originally announced January 2017.

    Comments: submitted to IEEE transactions on signal processing. arXiv admin note: substantial text overlap with arXiv:1610.03090, arXiv:1603.03678

  38. arXiv:1610.03090  [pdf, other]

    cs.LG

    Dynamic Metric Learning from Pairwise Comparisons

    Authors: Kristjan Greenewald, Stephen Kelley, Alfred Hero III

    Abstract: Recent work in distance metric learning has focused on learning transformations of data that best align with specified pairwise similarity and dissimilarity constraints, often supplied by a human observer. The learned transformations lead to improved retrieval, classification, and clustering algorithms due to the better adapted distance or similarity measures. Here, we address the problem of learn…

    Submitted 10 October, 2016; originally announced October 2016.

    Comments: to appear Allerton 2016. arXiv admin note: substantial text overlap with arXiv:1603.03678

  39. arXiv:1605.01790  [pdf, other]

    cs.CV

    Robust SAR STAP via Kronecker Decomposition

    Authors: Kristjan Greenewald, Edmund Zelnio, Alfred Hero

    Abstract: This paper proposes a spatio-temporal decomposition for the detection of moving targets in multiantenna SAR. As a high resolution radar imaging modality, SAR detects and localizes non-moving targets accurately, giving it an advantage over lower resolution GMTI radars. Moving target detection is more challenging due to target smearing and masking by clutter. Space-time adaptive processing (STAP) is…

    Submitted 5 May, 2016; originally announced May 2016.

    Comments: to appear at IEEE AES. arXiv admin note: text overlap with arXiv:1604.03622, arXiv:1501.07481

  40. arXiv:1603.03678  [pdf, other]

    stat.ML cs.LG

    Nonstationary Distance Metric Learning

    Authors: Kristjan Greenewald, Stephen Kelley, Alfred Hero

    Abstract: Recent work in distance metric learning has focused on learning transformations of data that best align with provided sets of pairwise similarity and dissimilarity constraints. The learned transformations lead to improved retrieval, classification, and clustering algorithms due to the better adapted distance or similarity measures. Here, we introduce the problem of learning these transformations w…

    Submitted 22 May, 2016; v1 submitted 11 March, 2016; originally announced March 2016.

  41. Ensemble Estimation of Information Divergence

    Authors: Kevin R. Moon, Kumar Sricharan, Kristjan Greenewald, Alfred O. Hero III

    Abstract: Recent work has focused on the problem of nonparametric estimation of information divergence functionals. Many existing approaches are restrictive in their assumptions on the density support set or require difficult calculations at the support boundary which must be known a priori. The MSE convergence rate of a leave-one-out kernel density plug-in divergence functional estimator for general bounde…

    Submitted 4 June, 2018; v1 submitted 25 January, 2016; originally announced January 2016.

    Comments: 27 pages, 4 figures; A previous version of this paper was posted under the title of "Improving Convergence of Divergence Functional Ensemble Estimators"

    Journal ref: Entropy, vol. 20, no. 8, pp. 560, July 2018

  42. arXiv:1405.4574  [pdf, other]

    cs.CV stat.ME

    Kronecker PCA Based Spatio-Temporal Modeling of Video for Dismount Classification

    Authors: Kristjan H. Greenewald, Alfred O. Hero III

    Abstract: We consider the application of KronPCA spatio-temporal modeling techniques [Greenewald et al 2013, Tsiligkaridis et al 2013] to the extraction of spatiotemporal features for video dismount classification. KronPCA performs a low-rank type of dimensionality reduction that is adapted to spatio-temporal data and is characterized by the T frame multiframe mean and covariance of p spatial features. For…

    Submitted 18 May, 2014; originally announced May 2014.

    Comments: 8 pages. To appear in Proceedings of SPIE DSS. arXiv admin note: text overlap with arXiv:1402.5568