Skip to main content

Showing 1–17 of 17 results for author: Gray, A G

  1. arXiv:2405.02350  [pdf, ps, other

    cs.LG cs.AI

    What makes Models Compositional? A Theoretical View: With Supplement

    Authors: Parikshit Ram, Tim Klinger, Alexander G. Gray

    Abstract: Compositionality is thought to be a key component of language, and various compositional benchmarks have been developed to empirically probe the compositional generalization of existing sequence processing models. These benchmarks often highlight failures of existing models, but it is not clear why these models fail in this way. In this paper, we seek to theoretically understand the role the compo… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Extended version of the original IJCAI 2024 paper with detailed supplementary materials (27 pages, 7 figures)

  2. arXiv:2301.05131  [pdf, other

    cs.LG

    Toward Theoretical Guidance for Two Common Questions in Practical Cross-Validation based Hyperparameter Selection

    Authors: Parikshit Ram, Alexander G. Gray, Horst C. Samulowitz, Gregory Bramble

    Abstract: We show, to our knowledge, the first theoretical treatments of two common questions in cross-validation based hyperparameter selection: (1) After selecting the best hyperparameter using a held-out set, we train the final model using {\em all} of the training data -- since this may or may not improve future generalization error, should one do this? (2) During optimization such as via SGD (stochasti… ▽ More

    Submitted 12 January, 2023; originally announced January 2023.

    Comments: Extended version of the paper appearing at the SIAM International Conference on Data Mining 2023 (SDM23)

  3. arXiv:2006.09635  [pdf, other

    cs.LG math.OC stat.ML

    Solving Constrained CASH Problems with ADMM

    Authors: Parikshit Ram, Sijia Liu, Deepak Vijaykeerthi, Dakuo Wang, Djallel Bouneffouf, Greg Bramble, Horst Samulowitz, Alexander G. Gray

    Abstract: The CASH problem has been widely studied in the context of automated configurations of machine learning (ML) pipelines and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness or other domain-specific custom constraints. We present our recent approach [Liu, et al., 2020] that leverages the ADMM optimization fram… ▽ More

    Submitted 10 July, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 7th ICML Workshop on Automated Machine Learning (2020)

  4. arXiv:1309.6830  [pdf

    cs.LG stat.ML

    Building Bridges: Viewing Active Learning from the Multi-Armed Bandit Lens

    Authors: Ravi Ganti, Alexander G. Gray

    Abstract: In this paper we propose a multi-armed bandit inspired, pool based active learning algorithm for the problem of binary classification. By carefully constructing an analogy between active learning and multi-armed bandits, we utilize ideas such as lower confidence bounds, and self-concordant regularization from the multi-armed bandit literature to design our proposed algorithm. Our algorithm is a se… ▽ More

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-232-241

  5. arXiv:1304.4327  [pdf, ps, other

    cs.DS

    Tree-Independent Dual-Tree Algorithms

    Authors: Ryan R. Curtin, William B. March, Parikshit Ram, David V. Anderson, Alexander G. Gray, Charles L. Isbell Jr

    Abstract: Dual-tree algorithms are a widely used class of branch-and-bound algorithms. Unfortunately, developing dual-tree algorithms for use with different trees and problems is often complex and burdensome. We introduce a four-part logical split: the tree, the traversal, the point-to-point base case, and the pruning rule. We provide a meta-algorithm which allows development of dual-tree algorithms in a tr… ▽ More

    Submitted 16 April, 2013; originally announced April 2013.

    Comments: accepted in ICML 2013

  6. arXiv:1210.6293  [pdf, ps, other

    cs.MS cs.CV cs.LG

    MLPACK: A Scalable C++ Machine Learning Library

    Authors: Ryan R. Curtin, James R. Cline, N. P. Slagle, William B. March, Parikshit Ram, Nishant A. Mehta, Alexander G. Gray

    Abstract: MLPACK is a state-of-the-art, scalable, multi-platform C++ machine learning library released in late 2011 offering both a simple, consistent API accessible to novice users and high performance and flexibility to expert users by leveraging modern features of C++. MLPACK provides cutting-edge algorithms whose benchmarks exhibit far better performance than other leading machine learning libraries. ML… ▽ More

    Submitted 23 October, 2012; originally announced October 2012.

    Comments: Submitted to JMLR MLOSS (http://jmlr.csail.mit.edu/mloss/)

    Journal ref: Journal of Machine Learning Research 14 (2013) 801-805

  7. arXiv:1210.6287  [pdf, ps, other

    cs.DS cs.IR cs.LG

    Fast Exact Max-Kernel Search

    Authors: Ryan R. Curtin, Parikshit Ram, Alexander G. Gray

    Abstract: The wide applicability of kernels makes the problem of max-kernel search ubiquitous and more general than the usual similarity search in metric spaces. We focus on solving this problem efficiently. We begin by characterizing the inherent hardness of the max-kernel search problem with a novel notion of directional concentration. Following that, we present a method to use an $O(n \log n)$ algorithm… ▽ More

    Submitted 26 October, 2012; v1 submitted 23 October, 2012; originally announced October 2012.

    Comments: Under submission in SIAM Data Mining conference

  8. arXiv:1209.2784  [pdf, other

    cs.LG stat.ML

    Minimax Multi-Task Learning and a Generalized Loss-Compositional Paradigm for MTL

    Authors: Nishant A. Mehta, Dongryeol Lee, Alexander G. Gray

    Abstract: Since its inception, the modus operandi of multi-task learning (MTL) has been to minimize the task-wise mean of the empirical risks. We introduce a generalized loss-compositional paradigm for MTL that includes a spectrum of formulations as a subfamily. One endpoint of this spectrum is minimax MTL: a new MTL formulation that minimizes the maximum of the tasks' empirical risks. Via a certain relaxat… ▽ More

    Submitted 13 September, 2012; originally announced September 2012.

    Comments: appearing at NIPS 2012

  9. arXiv:1206.6857  [pdf

    cs.LG math.NA stat.ML

    Faster Gaussian Summation: Theory and Experiment

    Authors: Dongryeol Lee, Alexander G. Gray

    Abstract: We provide faster algorithms for the problem of Gaussian summation, which occurs in many machine learning methods. We develop two new extensions - an O(Dp) Taylor expansion for the Gaussian kernel with rigorous error bounds and a new error control scheme integrating any arbitrary approximation method - within the best discretealgorithmic framework using adaptive hierarchical data structures. We ri… ▽ More

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-281-288

  10. arXiv:1206.5278  [pdf

    stat.ME cs.LG stat.ML

    Fast Nonparametric Conditional Density Estimation

    Authors: Michael P. Holmes, Alexander G. Gray, Charles Lee Isbell

    Abstract: Conditional density estimation generalizes regression by modeling a full density f(yjx) rather than only the expected value E(yjx). This is important for many tasks, including handling multi-modality and generating prediction intervals. Though fundamental and widely applicable, nonparametric conditional density estimators have received relatively little attention from statisticians and little or n… ▽ More

    Submitted 20 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

    Report number: UAI-P-2007-PG-175-182

  11. arXiv:1202.6101  [pdf, ps, other

    cs.CG cs.DS cs.IR

    Maximum Inner-Product Search using Tree Data-structures

    Authors: Parikshit Ram, Alexander G. Gray

    Abstract: The problem of {\em efficiently} finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied in literature. However, a closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting to the best of our knowledge. In this paper we consider… ▽ More

    Submitted 27 February, 2012; originally announced February 2012.

    Comments: Under submission in KDD 2012

  12. arXiv:1202.4050  [pdf, other

    cs.LG stat.ML

    On the Sample Complexity of Predictive Sparse Coding

    Authors: Nishant A. Mehta, Alexander G. Gray

    Abstract: The goal of predictive sparse coding is to learn a representation of examples as sparse linear combinations of elements from a dictionary, such that a learned hypothesis linear in the new representation performs well on a predictive task. Predictive sparse coding algorithms recently have demonstrated impressive performance on a variety of supervised tasks, but their generalization properties have… ▽ More

    Submitted 7 October, 2012; v1 submitted 17 February, 2012; originally announced February 2012.

    Comments: Sparse Coding Stability Theorem from version 1 has been relaxed considerably using a new notion of coding margin. Old Sparse Coding Stability Theorem still in new version, now as Theorem 2. Presentation of all proofs simplified/improved considerably. Paper reorganized. Empirical analysis showing new coding margin is non-trivial on real datasets

  13. arXiv:1105.2769  [pdf, ps, other

    physics.comp-ph cs.DS

    Multibody Multipole Methods

    Authors: Dongryeol Lee, Arkadas Ozakin, Alexander G. Gray

    Abstract: A three-body potential function can account for interactions among triples of particles which are uncaptured by pairwise interaction functions such as Coulombic or Lennard-Jones potentials. Likewise, a multibody potential of order $n$ can account for interactions among $n$-tuples of particles uncaptured by interaction functions of lower orders. To date, the computation of multibody potential funct… ▽ More

    Submitted 30 June, 2012; v1 submitted 13 May, 2011; originally announced May 2011.

    Comments: To appear in Journal of Computational Physics

    MSC Class: 68U01 ACM Class: J.2

  14. arXiv:1102.2878  [pdf, ps, other

    stat.CO cs.DS stat.ML

    Dual-Tree Fast Gauss Transforms

    Authors: Dongryeol Lee, Alexander G. Gray, Andrew W. Moore

    Abstract: Kernel density estimation (KDE) is a popular statistical technique for estimating the underlying density distribution with minimal assumptions. Although they can be shown to achieve asymptotic estimation optimality for any input distribution, cross-validating for an optimal parameter requires significant computation dominated by kernel summations. In this paper we present an improvement to the dua… ▽ More

    Submitted 14 February, 2011; originally announced February 2011.

    Comments: Extended version of a conference paper. Submitted to a journal

  15. arXiv:1005.0188  [pdf, other

    cs.LG stat.ML

    Generative and Latent Mean Map Kernels

    Authors: Nishant A. Mehta, Alexander G. Gray

    Abstract: We introduce two kernels that extend the mean map, which embeds probability measures in Hilbert spaces. The generative mean map kernel (GMMK) is a smooth similarity measure between probabilistic models. The latent mean map kernel (LMMK) generalizes the non-iid formulation of Hilbert space embeddings of empirical distributions in order to incorporate latent variable models. When comparing certain c… ▽ More

    Submitted 3 May, 2010; originally announced May 2010.

    Comments: 16 pages, 1 figure, 1 table

  16. arXiv:0810.4611  [pdf, ps, other

    cs.LG

    Learning Isometric Separation Maps

    Authors: Nikolaos Vasiloglou, Alexander G. Gray, David V. Anderson

    Abstract: Maximum Variance Unfolding (MVU) and its variants have been very successful in embedding data-manifolds in lower dimensional spaces, often revealing the true intrinsic dimension. In this paper we show how to also incorporate supervised class information into an MVU-like method without breaking its convexity. We call this method the Isometric Separation Map and we show that the resulting kernel m… ▽ More

    Submitted 15 April, 2009; v1 submitted 25 October, 2008; originally announced October 2008.

    Comments: Submitted to the NIPS workshop on Kernel Learning:Automatic Selection Of Kernels and now presented in MLSP 2009

  17. arXiv:0810.2311  [pdf, ps, other

    cs.AI cs.CV

    Non-Negative Matrix Factorization, Convexity and Isometry

    Authors: Nikolaos Vasiloglou, Alexander G. Gray, David V. Anderson

    Abstract: In this paper we explore avenues for improving the reliability of dimensionality reduction methods such as Non-Negative Matrix Factorization (NMF) as interpretive exploratory data analysis tools. We first explore the difficulties of the optimization problem underlying NMF, showing for the first time that non-trivial NMF solutions always exist and that the optimization problem is actually convex,… ▽ More

    Submitted 22 April, 2009; v1 submitted 13 October, 2008; originally announced October 2008.

    Comments: accpepted in SIAM Data Mining 2009, 12 pages