Skip to main content

Showing 1–37 of 37 results for author: Koehler, F

  1. arXiv:2402.18697  [pdf, other

    stat.ML cs.LG cs.SI math.OC math.ST

    Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting

    Authors: Serina Chang, Frederic Koehler, Zhaonan Qu, Jure Leskovec, Johan Ugander

    Abstract: A common network inference problem, arising from real-world data constraints, is how to infer a dynamic network from its time-aggregated adjacency matrix and time-varying marginals (i.e., row and column sums). Prior approaches to this problem have repurposed the classic iterative proportional fitting (IPF) procedure, also known as Sinkhorn's algorithm, with promising empirical results. However, th… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

  2. arXiv:2402.15409  [pdf, other

    stat.ML cs.CC cs.DS cs.LG math.ST

    Lasso with Latents: Efficient Estimation, Covariate Rescaling, and Computational-Statistical Gaps

    Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

    Abstract: It is well-known that the statistical performance of Lasso can suffer significantly when the covariates of interest have strong correlations. In particular, the prediction error of Lasso becomes much worse than computationally inefficient alternatives like Best Subset Selection. Due to a large conjectured computational-statistical tradeoff in the problem of sparse linear regression, it may be impo… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

  3. arXiv:2310.01762  [pdf, other

    cs.LG cs.DS math.ST

    Sampling Multimodal Distributions with the Vanilla Score: Benefits of Data-Based Initialization

    Authors: Frederic Koehler, Thuy-Duong Vuong

    Abstract: There is a long history, as well as a recent explosion of interest, in statistical and generative modeling approaches based on score functions -- derivatives of the log-likelihood of a distribution. In seminal works, Hyvärinen proposed vanilla score matching as a way to learn distributions from data by computing an estimate of the score function of the underlying ground truth, and established conn… ▽ More

    Submitted 2 October, 2023; originally announced October 2023.

  4. arXiv:2307.10466  [pdf, ps, other

    math.PR cs.DS math-ph

    Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses

    Authors: Nima Anari, Vishesh Jain, Frederic Koehler, Huy Tuan Pham, Thuy-Duong Vuong

    Abstract: We study Glauber dynamics for sampling from discrete distributions $μ$ on the hypercube $\{\pm 1\}^n$. Recently, techniques based on spectral independence have successfully yielded optimal $O(n)$ relaxation times for a host of different distributions $μ$. We show that spectral independence is universal: a relaxation time of $O(n)$ implies spectral independence. We then study a notion of tractabi… ▽ More

    Submitted 19 July, 2023; originally announced July 2023.

  5. arXiv:2307.07625  [pdf, ps, other

    math.PR cs.DM math.CO

    Influences in Mixing Measures

    Authors: Frederic Koehler, Noam Lifshitz, Dor Minzer, Elchanan Mossel

    Abstract: The theory of influences in product measures has profound applications in theoretical computer science, combinatorics, and discrete probability. This deep theory is intimately connected to functional inequalities and to the Fourier analysis of discrete groups. Originally, influences of functions were motivated by the study of social choice theory, wherein a Boolean function represents a voting sch… ▽ More

    Submitted 14 July, 2023; originally announced July 2023.

  6. arXiv:2306.13188  [pdf, ps, other

    stat.ML cs.LG

    Uniform Convergence with Square-Root Lipschitz Loss

    Authors: Lijia Zhou, Zhen Dai, Frederic Koehler, Nathan Srebro

    Abstract: We establish generic uniform convergence guarantees for Gaussian data in terms of the Rademacher complexity of the hypothesis class and the Lipschitz constant of the square root of the scalar loss function. We show how these guarantees substantially generalize previous results based on smoothness (Lipschitz constant of the derivative), and allow us to handle the broader class of square-root-Lipsch… ▽ More

    Submitted 22 June, 2023; originally announced June 2023.

  7. arXiv:2305.16892  [pdf, other

    cs.DS cs.LG math.ST stat.ML

    Feature Adaptation for Sparse Linear Regression

    Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

    Abstract: Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,Σ)$, and we seek an estimator with small excess risk. If the true signal is $t$-sparse, information-theoretically, it is possible to achieve strong recovery guarantees with only $O(t\log n)$ samples. However,… ▽ More

    Submitted 26 May, 2023; originally announced May 2023.

  8. arXiv:2210.12082  [pdf, other

    stat.ML cs.LG math.ST

    A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models

    Authors: Lijia Zhou, Frederic Koehler, Pragya Sur, Danica J. Sutherland, Nathan Srebro

    Abstract: We prove a new generalization bound that shows for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss $\ell$ can control the test error under all Moreau envelopes of the loss $\ell$. We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021) for linear regression with the squa… ▽ More

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: As published at NeurIPS 2022

  9. arXiv:2210.00726  [pdf, other

    cs.LG math.ST stat.ML

    Statistical Efficiency of Score Matching: The View from Isoperimetry

    Authors: Frederic Koehler, Alexander Heckett, Andrej Risteski

    Abstract: Deep generative models parametrized up to a normalizing constant (e.g. energy-based models) are difficult to train by maximizing the likelihood of the data because the likelihood and/or gradients thereof cannot be explicitly or efficiently written down. Score matching is a training method, whereby instead of fitting the likelihood $\log p(x)$ for the training data, we instead fit the score functio… ▽ More

    Submitted 22 December, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

  10. arXiv:2203.02824  [pdf, ps, other

    cs.DS cs.IT cs.LG math.ST stat.ML

    Distributional Hardness Against Preconditioned Lasso via Erasure-Robust Designs

    Authors: Jonathan A. Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

    Abstract: Sparse linear regression with ill-conditioned Gaussian random designs is widely believed to exhibit a statistical/computational gap, but there is surprisingly little formal evidence for this belief, even in the form of examples that are hard for restricted classes of algorithms. Recent work has shown that, for certain covariance matrices, the broad class of Preconditioned Lasso programs provably c… ▽ More

    Submitted 5 March, 2022; originally announced March 2022.

    Comments: 39 pages

  11. arXiv:2202.08907  [pdf, ps, other

    cs.DS cs.LG math.PR stat.ML

    Sampling Approximately Low-Rank Ising Models: MCMC meets Variational Methods

    Authors: Frederic Koehler, Holden Lee, Andrej Risteski

    Abstract: We consider Ising models on the hypercube with a general interaction matrix $J$, and give a polynomial time sampling algorithm when all but $O(1)$ eigenvalues of $J$ lie in an interval of length one, a situation which occurs in many models of interest. This was previously known for the Glauber dynamics when *all* eigenvalues fit in an interval of length one; however, a single outlier can force the… ▽ More

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 43 pages

  12. arXiv:2112.06868  [pdf, other

    cs.LG stat.ML

    Variational autoencoders in the presence of low-dimensional data: landscape and implicit bias

    Authors: Frederic Koehler, Viraj Mehta, Chenghui Zhou, Andrej Risteski

    Abstract: Variational Autoencoders are one of the most commonly used generative models, particularly for image data. A prominent difficulty in training VAEs is data that is supported on a lower-dimensional manifold. Recent work by Dai and Wipf (2020) proposes a two-stage training algorithm for VAEs, based on a conjecture that in standard VAE training the generator will converge to a solution with 0 variance… ▽ More

    Submitted 17 May, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted as a conference paper at ICLR 2022

  13. arXiv:2112.04470  [pdf, other

    stat.ML cs.LG math.ST

    Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression

    Authors: Lijia Zhou, Frederic Koehler, Danica J. Sutherland, Nathan Srebro

    Abstract: We study a localized notion of uniform convergence known as an "optimistic rate" (Panchenko 2002; Srebro et al. 2010) for linear regression with Gaussian data. Our refined analysis avoids the hidden constant and logarithmic factor in existing results, which are known to be crucial in high-dimensional settings, especially for understanding interpolation learning. As a special case, our analysis rec… ▽ More

    Submitted 8 December, 2021; originally announced December 2021.

  14. arXiv:2111.06395  [pdf, ps, other

    stat.ML cs.DS cs.LG eess.SY

    Kalman Filtering with Adversarial Corruptions

    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

    Abstract: Here we revisit the classic problem of linear quadratic estimation, i.e. estimating the trajectory of a linear dynamical system from noisy measurements. The celebrated Kalman filter gives an optimal estimator when the measurement noise is Gaussian, but is widely known to break down when one deviates from this assumption, e.g. when the noise is heavy-tailed. Many ad hoc heuristics have been employe… ▽ More

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 57 pages, comments welcome

  15. arXiv:2111.03247  [pdf, ps, other

    cs.DS math-ph math.PR

    Entropic Independence II: Optimal Sampling and Concentration via Restricted Modified Log-Sobolev Inequalities

    Authors: Nima Anari, Vishesh Jain, Frederic Koehler, Huy Tuan Pham, Thuy-Duong Vuong

    Abstract: We introduce a framework for obtaining tight mixing times for Markov chains based on what we call restricted modified log-Sobolev inequalities. Modified log-Sobolev inequalities (MLSI) quantify the rate of relative entropy contraction for the Markov operator, and are notoriously difficult to establish. However, infinitesimally close to stationarity, entropy contraction becomes equivalent to varian… ▽ More

    Submitted 5 November, 2021; originally announced November 2021.

  16. arXiv:2109.11505  [pdf, other

    cs.LG cs.CG cs.DS stat.ML

    Multidimensional Scaling: Approximation and Complexity

    Authors: Erik Demaine, Adam Hesterberg, Frederic Koehler, Jayson Lynch, John Urschel

    Abstract: Metric Multidimensional scaling (MDS) is a classical method for generating meaningful (non-linear) low-dimensional embeddings of high-dimensional data. MDS has a long history in the statistics, machine learning, and graph drawing communities. In particular, the Kamada-Kawai force-directed graph drawing method is equivalent to MDS and is one of the most popular ways in practice to embed graphs into… ▽ More

    Submitted 23 September, 2021; originally announced September 2021.

  17. arXiv:2109.06915  [pdf, other

    math.PR cs.LG math.ST

    Reconstruction on Trees and Low-Degree Polynomials

    Authors: Frederic Koehler, Elchanan Mossel

    Abstract: The study of Markov processes and broadcasting on trees has deep connections to a variety of areas including statistical physics, graphical models, phylogenetic reconstruction, Markov Chain Monte Carlo, and community detection in random graphs. Notably, the celebrated Belief Propagation (BP) algorithm achieves Bayes-optimal performance for the reconstruction problem of predicting the value of the… ▽ More

    Submitted 25 October, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: 28 pages, comments welcome. V2: added new result and more exposition. abstract shortened for arxiv

  18. arXiv:2106.09276  [pdf, other

    stat.ML cs.LG math.ST

    Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds, and Benign Overfitting

    Authors: Frederic Koehler, Lijia Zhou, Danica J. Sutherland, Nathan Srebro

    Abstract: We consider interpolation learning in high-dimensional linear regression with Gaussian data, and prove a generic uniform convergence guarantee on the generalization error of interpolators in an arbitrary hypothesis class in terms of the class's Gaussian width. Applying the generic bound to Euclidean norm balls recovers the consistency result of Bartlett et al. (2020) for minimum-norm interpolators… ▽ More

    Submitted 3 January, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: v2: Minor changes only. As published at NeurIPS 2021: https://proceedings.neurips.cc/paper/2021/hash/ac9815bef801f58de83804bce86984ad-Abstract.html

  19. arXiv:2106.09207  [pdf, other

    cs.LG cs.DS math.ST stat.ML

    On the Power of Preconditioning in Sparse Linear Regression

    Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

    Abstract: Sparse linear regression is a fundamental problem in high-dimensional statistics, but strikingly little is known about how to efficiently solve it without restrictive conditions on the design matrix. We consider the (correlated) random design setting, where the covariates are independently drawn from a multivariate Gaussian $N(0,Σ)$ with $Σ: n \times n$, and seek estimators $\hat{w}$ minimizing… ▽ More

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: 73 pages, 5 figures

  20. arXiv:2106.04105  [pdf, ps, other

    cs.DS cs.DM math-ph math.PR

    Entropic Independence I: Modified Log-Sobolev Inequalities for Fractionally Log-Concave Distributions and High-Temperature Ising Models

    Authors: Nima Anari, Vishesh Jain, Frederic Koehler, Huy Tuan Pham, Thuy-Duong Vuong

    Abstract: We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $μ$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fr… ▽ More

    Submitted 4 November, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

  21. arXiv:2106.03969  [pdf, other

    cs.LG cs.DS cs.IT math.ST

    Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models

    Authors: Enric Boix-Adsera, Guy Bresler, Frederic Koehler

    Abstract: We consider the problem of learning a tree-structured Ising model from data, such that subsequent predictions computed using the model are accurate. Concretely, we aim to learn a model such that posteriors $P(X_i|X_S)$ for small sets of variables $S$ are accurate. Since its introduction more than 50 years ago, the Chow-Liu algorithm, which efficiently computes the maximum likelihood tree, has been… ▽ More

    Submitted 23 November, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 49 pages, 3 figures, to appear in FOCS'21

  22. arXiv:2010.04157  [pdf, other

    cs.LG cs.DS stat.ML

    Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination

    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

    Abstract: In this work we revisit two classic high-dimensional online learning problems, namely linear regression and contextual bandits, from the perspective of adversarial robustness. Existing works in algorithmic robust statistics make strong distributional assumptions that ensure that the input data is evenly spread out or comes from a nice generative model. Is it possible to achieve strong robustness g… ▽ More

    Submitted 10 June, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: 66 pages, 1 figure, v3: refined exposition and improved rates

  23. arXiv:2010.01155  [pdf, other

    cs.LG stat.ML

    Representational aspects of depth and conditioning in normalizing flows

    Authors: Frederic Koehler, Viraj Mehta, Andrej Risteski

    Abstract: Normalizing flows are among the most popular paradigms in generative modeling, especially for images, primarily because we can efficiently evaluate the likelihood of a data point. This is desirable both for evaluating the fit of a model, and for ease of training, as maximizing the likelihood can be done by gradient descent. However, training normalizing flows comes with difficulties as well: model… ▽ More

    Submitted 25 June, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: Appeared in ICML 2021

  24. arXiv:2007.12815  [pdf, other

    cs.LG cs.DS stat.ML

    From Boltzmann Machines to Neural Networks and Back Again

    Authors: Surbhi Goel, Adam Klivans, Frederic Koehler

    Abstract: Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well-known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models. Our results are based on new connections to learning two-layer neural networks under… ▽ More

    Submitted 24 July, 2020; originally announced July 2020.

  25. arXiv:2006.04787  [pdf, other

    cs.LG cs.DS math.ST stat.ML

    Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability

    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

    Abstract: In this paper we revisit some classic problems on classification under misspecification. In particular, we study the problem of learning halfspaces under Massart noise with rate $η$. In a recent work, Diakonikolas, Goulekakis, and Tzamos resolved a long-standing problem by giving the first efficient algorithm for learning to accuracy $η+ ε$ for any $ε> 0$. However, their algorithm outputs a compli… ▽ More

    Submitted 20 September, 2023; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: 52 pages, v2: updated references

  26. arXiv:2004.12580  [pdf, other

    math.PR cs.DM cs.GT math.CO

    A Phase Transition in Arrow's Theorem

    Authors: Frederic Koehler, Elchanan Mossel

    Abstract: Arrow's Theorem concerns a fundamental problem in social choice theory: given the individual preferences of members of a group, how can they be aggregated to form rational group preferences? Arrow showed that in an election between three or more candidates, there are situations where any voting rule satisfying a small list of natural "fairness" axioms must produce an apparently irrational intransi… ▽ More

    Submitted 24 September, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: 52 pages; comments welcome!

  27. arXiv:1905.10031  [pdf, ps, other

    cs.IT math.ST stat.CO stat.ML

    Accuracy-Memory Tradeoffs and Phase Transitions in Belief Propagation

    Authors: Vishesh Jain, Frederic Koehler, Jingbo Liu, Elchanan Mossel

    Abstract: The analysis of Belief Propagation and other algorithms for the {\em reconstruction problem} plays a key role in the analysis of community detection in inference on graphs, phylogenetic reconstruction in bioinformatics, and the cavity method in statistical physics. We prove a conjecture of Evans, Kenyon, Peres, and Schulman (2000) which states that any bounded memory message passing algorithm is… ▽ More

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: To be presented on COLT 2019

  28. arXiv:1905.09992  [pdf, other

    cs.LG cs.DS math.PR stat.ML

    Fast Convergence of Belief Propagation to Global Optima: Beyond Correlation Decay

    Authors: Frederic Koehler

    Abstract: Belief propagation is a fundamental message-passing algorithm for probabilistic reasoning and inference in graphical models. While it is known to be exact on trees, in most applications belief propagation is run on graphs with cycles. Understanding the behavior of "loopy" belief propagation has been a major challenge for researchers in machine learning, and positive convergence results for BP are… ▽ More

    Submitted 23 May, 2019; originally announced May 2019.

    Comments: 24 pages; comments welcome!

  29. arXiv:1905.01282  [pdf, other

    cs.LG cs.DS math.ST stat.ML

    Learning Some Popular Gaussian Graphical Models without Condition Number Bounds

    Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Ankur Moitra

    Abstract: Gaussian Graphical Models (GGMs) have wide-ranging applications in machine learning and the natural and social sciences. In most of the settings in which they are applied, the number of observed samples is much smaller than the dimension and they are assumed to be sparse. While there are a variety of algorithms (e.g. Graphical Lasso, CLIME) that provably recover the graph structure with a logarith… ▽ More

    Submitted 8 March, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: V2: Updated version with some new results

  30. arXiv:1808.07226  [pdf, ps, other

    cs.LG cs.DS math.PR stat.ML

    Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective

    Authors: Vishesh Jain, Frederic Koehler, Andrej Risteski

    Abstract: The free energy is a key quantity of interest in Ising models, but unfortunately, computing it in general is computationally intractable. Two popular (variational) approximation schemes for estimating the free energy of general Ising models (in particular, even in regimes where correlation decay does not hold) are: (i) the mean-field approximation with roots in statistical physics, which estimates… ▽ More

    Submitted 28 August, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

    Comments: This version: minor formatting changes, added grant acknowledgements

  31. arXiv:1805.11405  [pdf, ps, other

    cs.LG cs.DS stat.ML

    Representational Power of ReLU Networks and Polynomial Kernels: Beyond Worst-Case Analysis

    Authors: Frederic Koehler, Andrej Risteski

    Abstract: There has been a large amount of interest, both in the past and particularly recently, into the power of different families of universal approximators, e.g. ReLU networks, polynomials, rational functions. However, current research has focused almost exclusively on understanding this problem in a worst-case setting, e.g. bounding the error of the best infinity-norm approximation in a box. In this s… ▽ More

    Submitted 29 May, 2018; originally announced May 2018.

  32. arXiv:1805.10262  [pdf, other

    cs.LG cs.DS math.PR stat.ML

    Learning Restricted Boltzmann Machines via Influence Maximization

    Authors: Guy Bresler, Frederic Koehler, Ankur Moitra, Elchanan Mossel

    Abstract: Graphical models are a rich language for describing high-dimensional distributions in terms of their dependence structure. While there are algorithms with provable guarantees for learning undirected graphical models in a variety of settings, there has been much less progress in the important scenario when there are latent variables. Here we study Restricted Boltzmann Machines (or RBMs), which are… ▽ More

    Submitted 5 November, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: 29 pages

  33. arXiv:1802.06129  [pdf, ps, other

    cs.LG math.CO

    The Vertex Sample Complexity of Free Energy is Polynomial

    Authors: Vishesh Jain, Frederic Koehler, Elchanan Mossel

    Abstract: We study the following question: given a massive Markov random field on $n$ nodes, can a small sample from it provide a rough approximation to the free energy $\mathcal{F}_n = \log{Z_n}$? Results in graph limit literature by Borgs, Chayes, Lovász, Sós, and Vesztergombi show that for Ising models on $n$ nodes and interactions of strength $Θ(1/n)$, an $ε$ approximation to $\log Z_n / n$ can be ach… ▽ More

    Submitted 23 February, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

    Comments: arXiv admin note: text overlap with arXiv:1802.06126 Updated bibliography

  34. arXiv:1802.06126  [pdf, ps, other

    cs.LG cond-mat.stat-mech math-ph math.CO

    The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity

    Authors: Vishesh Jain, Frederic Koehler, Elchanan Mossel

    Abstract: The mean field approximation to the Ising model is a canonical variational tool that is used for analysis and inference in Ising models. We provide a simple and optimal bound for the KL error of the mean field approximation for Ising models on general graphs, and extend it to higher order Markov random fields. Our bound improves on previous bounds obtained in work in the graph limit literature by… ▽ More

    Submitted 20 February, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

    Comments: Updated bibliography

  35. arXiv:1711.01655  [pdf, ps, other

    cs.LG cs.DS stat.ML

    Approximating Partition Functions in Constant Time

    Authors: Vishesh Jain, Frederic Koehler, Elchanan Mossel

    Abstract: We study approximations of the partition function of dense graphical models. Partition functions of graphical models play a fundamental role is statistical physics, in statistics and in machine learning. Two of the main methods for approximating the partition function are Markov Chain Monte Carlo and Variational Methods. An impressive body of work in mathematics, physics and theoretical computer s… ▽ More

    Submitted 20 February, 2018; v1 submitted 5 November, 2017; originally announced November 2017.

    Comments: This preprint is completely subsumed by preprints arXiv:1802.06126 and arXiv:1802.06129 by the same authors which also include important references that are missing in the current preprint

  36. arXiv:1705.11107  [pdf, other

    cs.LG cs.DS cs.IT math.ST

    Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications

    Authors: Linus Hamilton, Frederic Koehler, Ankur Moitra

    Abstract: Markov random fields area popular model for high-dimensional probability distributions. Over the years, many mathematical, statistical and algorithmic problems on them have been studied. Until recently, the only known algorithms for provably learning them relied on exhaustive search, correlation decay or various incoherence assumptions. Bresler gave an algorithm for learning general Ising models o… ▽ More

    Submitted 31 May, 2017; originally announced May 2017.

    Comments: 25 pages

  37. arXiv:1605.08491  [pdf, other

    cs.LG stat.ML

    Provable Algorithms for Inference in Topic Models

    Authors: Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra

    Abstract: Recently, there has been considerable progress on designing algorithms with provable guarantees -- typically using linear algebraic methods -- for parameter learning in latent variable models. But designing provable algorithms for inference has proven to be more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us… ▽ More

    Submitted 26 May, 2016; originally announced May 2016.

    Comments: to appear at ICML'2016