Skip to main content

Showing 1–50 of 96 results for author: Lee, Y T

  1. arXiv:2404.14219  [pdf, other

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra , et al. (90 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset… ▽ More

    Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

  2. arXiv:2403.19146  [pdf, ps, other

    cs.DS cs.DC math.OC

    Improving the Bit Complexity of Communication for Distributed Convex Optimization

    Authors: Mehrdad Ghadiri, Yin Tat Lee, Swati Padmanabhan, William Swartworth, David Woodruff, Guanghao Ye

    Abstract: We consider the communication complexity of some fundamental convex optimization problems in the point-to-point (coordinator) and blackboard communication models. We strengthen known bounds for approximately solving linear regression, $p$-norm regression (for $1\leq p\leq 2$), linear programming, minimizing the sum of finitely many convex nonsmooth functions with varying supports, and low rank app… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: To appear in STOC '24. Abstract shortened to meet the arXiv limits. Comments welcome!

  3. arXiv:2403.01749  [pdf, other

    cs.CL

    Differentially Private Synthetic Data via Foundation Model APIs 2: Text

    Authors: Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin

    Abstract: Text data has become extremely valuable due to the emergence of machine learning algorithms that learn from it. A lot of high-quality text data generated in the real world is private and therefore cannot be shared or used freely due to privacy concerns. Generating synthetic replicas of private text data with a formal privacy guarantee, i.e., differential privacy (DP), offers a promising and scalab… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  4. arXiv:2311.16452  [pdf, other

    cs.CL

    Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

    Authors: Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz

    Abstract: Generalist foundation models such as GPT-4 have displayed surprising capabilities in a wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot match specialist capabilities of fine-tuned models. For example, most explorations to date on medical competency benchmarks have leveraged domain-specific training, as exemplified by efforts on BioGPT and Med-PaLM. We build… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 21 pages, 7 figures

    ACM Class: I.2.7

  5. arXiv:2311.14737  [pdf, other

    cs.CL cs.AI cs.LG

    Positional Description Matters for Transformers Arithmetic

    Authors: Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

    Abstract: Transformers, central to the successes in modern Natural Language Processing, often falter on arithmetic tasks despite their vast capabilities --which paradoxically include remarkable coding abilities. We observe that a crucial challenge is their naive reliance on positional information to solve arithmetic problems with a small number of digits, leading to poor performance on larger numbers. Herei… ▽ More

    Submitted 21 November, 2023; originally announced November 2023.

    Comments: 18 pages

  6. arXiv:2309.05463  [pdf, other

    cs.CL cs.AI

    Textbooks Are All You Need II: phi-1.5 technical report

    Authors: Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee

    Abstract: We continue the investigation into the power of smaller Transformer-based language models as initiated by \textbf{TinyStories} -- a 10 million parameter model that can produce coherent English -- and the follow-up work on \textbf{phi-1}, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art. The latter work proposed to use existing Large Language Models (LLMs)… ▽ More

    Submitted 11 September, 2023; originally announced September 2023.

  7. arXiv:2306.11644  [pdf, other

    cs.CL cs.AI cs.LG

    Textbooks Are All You Need

    Authors: Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

    Abstract: We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of ``textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accu… ▽ More

    Submitted 2 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 26 pages; changed color scheme of plot. fixed minor typos and added couple clarifications

  8. arXiv:2306.01337  [pdf, other

    cs.CL stat.ML

    MathChat: Converse to Tackle Challenging Math Problems with LLM Agents

    Authors: Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

    Abstract: Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. LLMs, with their generalized ability, are used as a foundation model to build AI agents for different tasks. In this paper, we study the effectiveness of utilizing LLM age… ▽ More

    Submitted 28 June, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Update version

  9. arXiv:2305.03495  [pdf, other

    cs.CL cs.AI cs.LG

    Automatic Prompt Optimization with "Gradient Descent" and Beam Search

    Authors: Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng

    Abstract: Large Language Models (LLMs) have shown impressive performance as general purpose agents, but their abilities remain highly dependent on prompts which are hand written with onerous trial-and-error effort. We propose a simple and nonparametric solution to this problem, Automatic Prompt Optimization (APO), which is inspired by numerical gradient descent to automatically improve prompts, assuming acc… ▽ More

    Submitted 19 October, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  10. arXiv:2304.03426  [pdf, ps, other

    cs.DS cs.DM math.OC

    Convex Minimization with Integer Minima in $\widetilde O(n^4)$ Time

    Authors: Haotian Jiang, Yin Tat Lee, Zhao Song, Lichen Zhang

    Abstract: Given a convex function $f$ on $\mathbb{R}^n$ with an integer minimizer, we show how to find an exact minimizer of $f$ using $O(n^2 \log n)$ calls to a separation oracle and $O(n^4 \log n)$ time. The previous best polynomial time algorithm for this problem given in [Jiang, SODA 2021, JACM 2022] achieves $O(n^2\log\log n/\log n)$ oracle complexity. However, the overall runtime of Jiang's algorithm… ▽ More

    Submitted 14 November, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

    Comments: SODA 2024

  11. arXiv:2303.12712  [pdf, other

    cs.CL cs.AI

    Sparks of Artificial General Intelligence: Early experiments with GPT-4

    Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

    Abstract: Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an earl… ▽ More

    Submitted 13 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  12. arXiv:2302.10879  [pdf, other

    cs.CL

    $k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models

    Authors: Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee

    Abstract: Fine-tuning a language model on a new domain is standard practice for domain adaptation. However, it can be infeasible when it comes to modern large-scale language models such as GPT-3, which can only be accessed through APIs, making it difficult to access the internal parameters of the model. In this paper, we propose $k$NN-Adapter, a method to effectively adapt these black-box large language mod… ▽ More

    Submitted 21 February, 2023; originally announced February 2023.

  13. arXiv:2302.06085  [pdf, ps, other

    cs.DS cs.CR cs.LG math.PR stat.CO

    Algorithmic Aspects of the Log-Laplace Transform and a Non-Euclidean Proximal Sampler

    Authors: Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

    Abstract: The development of efficient sampling algorithms catering to non-Euclidean geometries has been a challenging endeavor, as discretization techniques which succeed in the Euclidean setting do not readily carry over to more general settings. We develop a non-Euclidean analog of the recent proximal sampler of [LST21], which naturally induces regularization by an object known as the log-Laplace transfo… ▽ More

    Submitted 22 February, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: Comments welcome! v2 improves constant in duality result, adds citations

  14. arXiv:2301.00457  [pdf, other

    math.OC cs.CR cs.DS cs.LG stat.ML

    ReSQueing Parallel and Private Stochastic Convex Optimization

    Authors: Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian

    Abstract: We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO obj… ▽ More

    Submitted 27 October, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

  15. arXiv:2212.07469  [pdf, other

    cs.LG cs.AI math.OC

    Learning threshold neurons via the "edge of stability"

    Authors: Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang

    Abstract: Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learni… ▽ More

    Submitted 19 October, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: 31 pages, 13 figures, Published at NeurIPS 2023

  16. arXiv:2212.01539  [pdf, other

    cs.LG stat.ML

    Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

    Authors: Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

    Abstract: Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the… ▽ More

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: 25 pages

  17. arXiv:2211.11860  [pdf, other

    cs.DS

    Upper and Lower Bounds on the Smoothed Complexity of the Simplex Method

    Authors: Sophie Huiberts, Yin Tat Lee, Xinzhi Zhang

    Abstract: The simplex method for linear programming is known to be highly efficient in practice, and understanding its performance from a theoretical perspective is an active research topic. The framework of smoothed analysis, first introduced by Spielman and Teng (JACM '04) for this purpose, defines the smoothed complexity of solving a linear program with $d$ variables and $n$ constraints as the expected r… ▽ More

    Submitted 15 May, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: 43 pages, 5 figures. STOC 2023

  18. arXiv:2210.07219  [pdf, ps, other

    cs.DS cs.LG math.NA stat.ML

    Condition-number-independent convergence rate of Riemannian Hamiltonian Monte Carlo with numerical integrators

    Authors: Yunbum Kook, Yin Tat Lee, Ruoqi Shen, Santosh S. Vempala

    Abstract: We study the convergence rate of discretized Riemannian Hamiltonian Monte Carlo on sampling from distributions in the form of $e^{-f(x)}$ on a convex body $\mathcal{M}\subset\mathbb{R}^{n}$. We show that for distributions in the form of $e^{-α^{\top}x}$ on a polytope with $m$ constraints, the convergence rate of a family of commonly-used integrators is independent of… ▽ More

    Submitted 10 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Improved writing & Theory for arXiv:2202.01908

  19. arXiv:2208.11644  [pdf, ps, other

    math.FA cs.DS math.PR

    A Slightly Improved Bound for the KLS Constant

    Authors: Arun Jambulapati, Yin Tat Lee, Santosh S. Vempala

    Abstract: We refine the recent breakthrough technique of Klartag and Lehec to obtain an improved polylogarithmic bound for the KLS constant.

    Submitted 6 October, 2022; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: minor revision fixing typos

  20. arXiv:2208.03811  [pdf, ps, other

    math.OC cs.LG

    Decomposable Non-Smooth Convex Optimization with Nearly-Linear Gradient Oracle Complexity

    Authors: Sally Dong, Haotian Jiang, Yin Tat Lee, Swati Padmanabhan, Guanghao Ye

    Abstract: Many fundamental problems in machine learning can be formulated by the convex program \[ \min_{θ\in R^d}\ \sum_{i=1}^{n}f_{i}(θ), \] where each $f_i$ is a convex, Lipschitz function supported on a subset of $d_i$ coordinates of $θ$. One common approach to this problem, exemplified by stochastic gradient descent, involves sampling one $f_i$ term at every iteration to make progress. This approach cr… ▽ More

    Submitted 7 August, 2022; originally announced August 2022.

  21. arXiv:2207.08347  [pdf, ps, other

    cs.LG cs.CR math.OC stat.ML

    Private Convex Optimization in General Norms

    Authors: Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

    Abstract: We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$. Our algorithms are based on a regularized exponential mechanism which samples from the density $\propto \exp(-k(F+μr))$ where $F$ is the empirical loss and $r$ is a regularizer which is strongly convex with respect to $\|\cdot\|$, generalizing a recent work o… ▽ More

    Submitted 10 November, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: SODA 2023

  22. arXiv:2207.00160  [pdf, other

    cs.LG cs.CR stat.ML

    When Does Differentially Private Learning Not Suffer in High Dimensions?

    Authors: Xuechen Li, Daogao Liu, Tatsunori Hashimoto, Huseyin A. Inan, Janardhan Kulkarni, Yin Tat Lee, Abhradeep Guha Thakurta

    Abstract: Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models. A common theme in these results is the surprising observation that high-dimensional models can achieve favorable privacy-utility trade-offs. This seemingly contradicts known results on the model-size dependence of differentially private convex learning and raises the following researc… ▽ More

    Submitted 26 October, 2022; v1 submitted 30 June, 2022; originally announced July 2022.

    Comments: 26 pages; v3 includes additional experiments and clarification

  23. arXiv:2205.01562  [pdf, ps, other

    cs.DS

    Nested Dissection Meets IPMs: Planar Min-Cost Flow in Nearly-Linear Time

    Authors: Sally Dong, Yu Gao, Gramoz Goranci, Yin Tat Lee, Richard Peng, Sushant Sachdeva, Guanghao Ye

    Abstract: We present a nearly-linear time algorithm for finding a minimum-cost flow in planar graphs with polynomially bounded integer costs and capacities. The previous fastest algorithm for this problem is based on interior point methods (IPMs) and works for general sparse graphs in $O(n^{1.5}\text{poly}(\log n))$ time [Daitch-Spielman, STOC'08]. Intuitively, $Ω(n^{1.5})$ is a natural runtime barrier for… ▽ More

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: 93 pages

  24. arXiv:2203.00263  [pdf, ps, other

    cs.DS cs.CR cs.LG math.OC math.PR

    Private Convex Optimization via Exponential Mechanism

    Authors: Sivakanth Gopi, Yin Tat Lee, Daogao Liu

    Abstract: In this paper, we study private optimization problems for non-smooth convex functions $F(x)=\mathbb{E}_i f_i(x)$ on $\mathbb{R}^d$. We show that modifying the exponential mechanism by adding an $\ell_2^2$ regularizer to $F(x)$ and sampling from $π(x)\propto \exp(-k(F(x)+μ\|x\|_2^2/2))$ recovers both the known optimal empirical risk and population loss under $(ε,δ)$-DP. Furthermore, we show how to… ▽ More

    Submitted 28 July, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

  25. arXiv:2202.01908  [pdf, other

    cs.LG cs.DS

    Sampling with Riemannian Hamiltonian Monte Carlo in a Constrained Space

    Authors: Yunbum Kook, Yin Tat Lee, Ruoqi Shen, Santosh S. Vempala

    Abstract: We demonstrate for the first time that ill-conditioned, non-smooth, constrained distributions in very high dimension, upwards of 100,000, can be sampled efficiently $\textit{in practice}$. Our algorithm incorporates constraints into the Riemannian version of Hamiltonian Monte Carlo and maintains sparsity. This allows us to achieve a mixing rate independent of smoothness and condition numbers. On… ▽ More

    Submitted 15 October, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

    Comments: Mixing-rate proof added. To appear in NeurIPS 2022

  26. arXiv:2112.00722  [pdf, ps, other

    cs.DS math.OC

    Faster Maxflow via Improved Dynamic Spectral Vertex Sparsifiers

    Authors: Jan van den Brand, Yu Gao, Arun Jambulapati, Yin Tat Lee, Yang P. Liu, Richard Peng, Aaron Sidford

    Abstract: We make several advances broadly related to the maintenance of electrical flows in weighted graphs undergoing dynamic resistance updates, including: 1. More efficient dynamic spectral vertex sparsification, achieved by faster length estimation of random walks in weighted graphs using Morris counters [Morris 1978, Nelson-Yu 2020]. 2. A direct reduction from detecting edges with large energy in… ▽ More

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 63 pages

  27. arXiv:2110.15563  [pdf, other

    cs.DS math.OC

    Computing Lewis Weights to High Precision

    Authors: Maryam Fazel, Yin Tat Lee, Swati Padmanabhan, Aaron Sidford

    Abstract: We present an algorithm for computing approximate $\ell_p$ Lewis weights to high precision. Given a full-rank $\mathbf{A} \in \mathbb{R}^{m \times n}$ with $m \geq n$ and a scalar $p>2$, our algorithm computes $ε$-approximate $\ell_p$ Lewis weights of $\mathbf{A}$ in $\widetilde{O}_p(\log(1/ε))$ iterations; the cost of each iteration is linear in the input size plus the cost of computing the lever… ▽ More

    Submitted 29 October, 2021; originally announced October 2021.

    Comments: 24 pages

  28. arXiv:2110.06500  [pdf, other

    cs.LG cs.CL cs.CR stat.ML

    Differentially Private Fine-tuning of Language Models

    Authors: Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

    Abstract: We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially… ▽ More

    Submitted 14 July, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: ICLR 2022. Code available at https://github.com/huseyinatahaninan/Differentially-Private-Fine-tuning-of-Language-Models

  29. arXiv:2108.04734  [pdf, other

    cs.DS math.OC

    Tutorial on the Robust Interior Point Method

    Authors: Yin Tat Lee, Santosh S. Vempala

    Abstract: We give a short, self-contained proof of the interior point method and its robust version.

    Submitted 10 August, 2021; originally announced August 2021.

  30. arXiv:2106.05480  [pdf, other

    cs.DS cs.CC cs.LG math.ST stat.ML

    Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions

    Authors: Yin Tat Lee, Ruoqi Shen, Kevin Tian

    Abstract: We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC) with a leapfrog integrator, when applied to well-conditioned distributions. Our main result is a nearly-tight lower bound of $\widetildeΩ(κd)$ on the mixing time of MALA from an exponentially warm start, matc… ▽ More

    Submitted 26 October, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 46 pages, 1 figure. This version removes Gaussian upper bound claim

  31. arXiv:2106.02848  [pdf, ps, other

    cs.DS cs.CR cs.LG

    Numerical Composition of Differential Privacy

    Authors: Sivakanth Gopi, Yin Tat Lee, Lukas Wutschitz

    Abstract: We give a fast algorithm to optimally compose privacy guarantees of differentially private (DP) algorithms to arbitrary accuracy. Our method is based on the notion of privacy loss random variables to quantify the privacy loss of DP algorithms. The running time and memory needed for our algorithm to approximate the privacy curve of a DP algorithm composed with itself $k$ times is… ▽ More

    Submitted 26 October, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 Spotlight

  32. arXiv:2105.13637  [pdf, ps, other

    cs.LG cs.CR math.OC

    The Power of Sampling: Dimension-free Risk Bounds in Private ERM

    Authors: Yin Tat Lee, Daogao Liu, Zhou Lu

    Abstract: Differentially private empirical risk minimization (DP-ERM) is a fundamental problem in private optimization. While the theory of DP-ERM is well-studied, as large-scale models become prevalent, traditional DP-ERM methods face new challenges, including (1) the prohibitive dependence on the ambient dimension, (2) the highly non-smooth objective functions, (3) costly first-order gradient oracles. Suc… ▽ More

    Submitted 3 June, 2024; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: We add the dimension-independent upper bounds results

  33. arXiv:2103.15352  [pdf, other

    cs.LG cs.CR cs.DS stat.ML

    Private Non-smooth Empirical Risk Minimization and Stochastic Convex Optimization in Subquadratic Steps

    Authors: Janardhan Kulkarni, Yin Tat Lee, Daogao Liu

    Abstract: We study the differentially private Empirical Risk Minimization (ERM) and Stochastic Convex Optimization (SCO) problems for non-smooth convex functions. We get a (nearly) optimal bound on the excess empirical risk and excess population loss with subquadratic gradient complexity. More precisely, our differentially private algorithm requires $O(\frac{N^{3/2}}{d^{1/8}}+ \frac{N^2}{d})$ gradient queri… ▽ More

    Submitted 29 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

  34. arXiv:2102.03013  [pdf, other

    cs.LG cs.CR

    Fast and Memory Efficient Differentially Private-SGD via JL Projections

    Authors: Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat

    Abstract: Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large scale neural networks. This algorithm requires computation of per-sample gradients norms which is extremely slow and memory intensive in practice. In this paper, we present a new framework to design differentially private optimizers called DP-SGD-JL and DP-Adam-… ▽ More

    Submitted 5 February, 2021; originally announced February 2021.

  35. arXiv:2101.08993  [pdf

    eess.IV cs.CV

    Automatic Volumetric Segmentation of Additive Manufacturing Defects with 3D U-Net

    Authors: Vivian Wen Hui Wong, Max Ferguson, Kincho H. Law, Yung-Tsun Tina Lee, Paul Witherell

    Abstract: Segmentation of additive manufacturing (AM) defects in X-ray Computed Tomography (XCT) images is challenging, due to the poor contrast, small sizes and variation in appearance of defects. Automatic segmentation can, however, provide quality control for additive manufacturing. Over recent years, three-dimensional convolutional neural networks (3D CNNs) have performed well in the volumetric segmenta… ▽ More

    Submitted 22 January, 2021; originally announced January 2021.

    Comments: Accepted by AAAI 2020 Spring Symposia

    Journal ref: AAAI 2020 Spring Symposia, Stanford, CA, USA, Mar 23-25, 2020

  36. arXiv:2101.05719  [pdf, ps, other

    cs.DS math.OC

    Minimum Cost Flows, MDPs, and $\ell_1$-Regression in Nearly Linear Time for Dense Instances

    Authors: Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, Di Wang

    Abstract: In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized method which solves the problem in $\tilde{O}(m+n^{1.5})$ time. This improves upon the previous best runtime… ▽ More

    Submitted 21 August, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  37. arXiv:2011.05365  [pdf, other

    cs.DS math.OC

    A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path

    Authors: Sally Dong, Yin Tat Lee, Guanghao Ye

    Abstract: Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter tractable algorithms in various communities including combinatorics, integer-linear programming, and numerical analysis. Many NP-hard problems are known to be solvable in $\widetilde{O}(n \cdot 2^{O(\mathrm{tw})})$ time, where $\mathrm{tw}$ is the treewidth of the input graph. Analogously, many problems… ▽ More

    Submitted 13 September, 2023; v1 submitted 10 November, 2020; originally announced November 2020.

  38. arXiv:2010.03106  [pdf, ps, other

    cs.DS cs.LG math.OC stat.CO stat.ML

    Structured Logconcave Sampling with a Restricted Gaussian Oracle

    Authors: Yin Tat Lee, Ruoqi Shen, Kevin Tian

    Abstract: We give algorithms for sampling several structured logconcave families to high accuracy. We further develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to improve dependences on problem conditioning. A key ingredient in our framework is the notion of a "restricted Gaussian oracle" (RGO) for… ▽ More

    Submitted 22 October, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: 58 pages. The results of Section 5 of this paper, as well as an empirical evaluation, appeared earlier as arXiv:2006.05976. This version fixes an error in the proof of Theorem 1, see Section 1.4

  39. arXiv:2009.10217  [pdf, ps, other

    cs.DS math.OC

    A Faster Interior Point Method for Semidefinite Programming

    Authors: Haotian Jiang, Tarun Kathuria, Yin Tat Lee, Swati Padmanabhan, Zhao Song

    Abstract: Semidefinite programs (SDPs) are a fundamental class of optimization problems with important recent applications in approximation algorithms, quantum complexity, robust learning, algorithmic rounding, and adversarial deep learning. This paper presents a faster interior point method to solve generic SDPs with variable size $n \times n$ and $m$ constraints in time \begin{align*} \widetilde{O}(\sqrt{… ▽ More

    Submitted 21 September, 2020; originally announced September 2020.

    Comments: FOCS 2020

  40. arXiv:2008.02146  [pdf, other

    cs.DS cs.CC math.FA

    Reducing Isotropy and Volume to KLS: An $O(n^3ψ^2)$ Volume Algorithm

    Authors: He Jia, Aditi Laddha, Yin Tat Lee, Santosh S. Vempala

    Abstract: We show that the volume of a convex body in ${\bf R}^{n}$ in the general membership oracle model can be computed to within relative error $\varepsilon$ using $\widetilde{O}(n^{3}ψ^{2} + n^{3}/\varepsilon^{2})$ oracle queries, where $ψ$ is the KLS constant. With the current bound of $ψ=\widetilde{O}(1)$, this gives an $\widetilde{O}(n^{3}/\varepsilon^{2})$ algorithm, improving on the Lovász-Vempala… ▽ More

    Submitted 3 September, 2022; v1 submitted 5 August, 2020; originally announced August 2020.

    Comments: 23 pages, 1 figure; updated with current KLS bound and resulting complexity

  41. arXiv:2006.05976  [pdf, other

    cs.LG cs.DS math.NA math.OC stat.ML

    Composite Logconcave Sampling with a Restricted Gaussian Oracle

    Authors: Ruoqi Shen, Kevin Tian, Yin Tat Lee

    Abstract: We consider sampling from composite densities on $\mathbb{R}^d$ of the form $dπ(x) \propto \exp(-f(x) - g(x))dx$ for well-conditioned $f$ and convex (but possibly non-smooth) $g$, a family generalizing restrictions to a convex set, through the abstraction of a restricted Gaussian oracle. For $f$ with condition number $κ$, our algorithm runs in $O \left(κ^2 d \log^2\tfrac{κd}ε\right)$ iterations, e… ▽ More

    Submitted 10 June, 2020; originally announced June 2020.

  42. arXiv:2006.02855  [pdf, ps, other

    cs.LG stat.ML

    Network size and weights size for memorization with two-layers neural networks

    Authors: Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer

    Abstract: In 1988, Eric B. Baum showed that two-layers neural networks with threshold activation function can perfectly memorize the binary labels of $n$ points in general position in $\mathbb{R}^d$ using only $\ulcorner n/d \urcorner$ neurons. We observe that with ReLU networks, using four times as many neurons one can fit arbitrary real labels. Moreover, for approximate memorization up to error $ε$, the n… ▽ More

    Submitted 3 November, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: 27 pages

  43. arXiv:2004.04250  [pdf, other

    cs.DS cs.LG stat.ML

    An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications

    Authors: Haotian Jiang, Yin Tat Lee, Zhao Song, Sam Chiu-wai Wong

    Abstract: Given a separation oracle for a convex set $K \subset \mathbb{R}^n$ that is contained in a box of radius $R$, the goal is to either compute a point in $K$ or prove that $K$ does not contain a ball of radius $ε$. We propose a new cutting plane algorithm that uses an optimal $O(n \log (κ))$ evaluations of the oracle and an additional $O(n^2)$ time per evaluation, where $κ= nR/ε$. $\bullet$ This im… ▽ More

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: STOC 2020

  44. arXiv:2003.08078  [pdf, other

    math.OC cs.DS

    Acceleration with a Ball Optimization Oracle

    Authors: Yair Carmon, Arun Jambulapati, Qijia Jiang, Yujia Jin, Yin Tat Lee, Aaron Sidford, Kevin Tian

    Abstract: Consider an oracle which takes a point $x$ and returns the minimizer of a convex function $f$ in an $\ell_2$ ball of radius $r$ around $x$. It is straightforward to show that roughly $r^{-1}\log\frac{1}ε$ calls to the oracle suffice to find an $ε$-approximate minimizer of $f$ in an $\ell_2$ unit ball. Perhaps surprisingly, this is not optimal: we design an accelerated algorithm which attains an… ▽ More

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: 37 pages

  45. arXiv:2002.04830  [pdf, ps, other

    cs.DS math.OC

    Positive Semidefinite Programming: Mixed, Parallel, and Width-Independent

    Authors: Arun Jambulapati, Yin Tat Lee, Jerry Li, Swati Padmanabhan, Kevin Tian

    Abstract: We give the first approximation algorithm for mixed packing and covering semidefinite programs (SDPs) with polylogarithmic dependence on width. Mixed packing and covering SDPs constitute a fundamental algorithmic primitive with recent applications in combinatorial optimization, robust learning, and quantum complexity. The current approximate solvers for positive semidefinite programming can handle… ▽ More

    Submitted 12 July, 2021; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: There is an error in this manuscript. This version notes the source of the error on the first page

  46. arXiv:2002.04121  [pdf, ps, other

    cs.LG cs.DS math.OC stat.CO stat.ML

    Logsmooth Gradient Concentration and Tighter Runtimes for Metropolized Hamiltonian Monte Carlo

    Authors: Yin Tat Lee, Ruoqi Shen, Kevin Tian

    Abstract: We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean. This removes a barrier in the prior state-of-the-art analysis for the well-studied Metropolized Hamiltonian Monte Carlo (HMC) algorithm for sampling from a strongly logconcave distribution. We correspondingly demonstrate that Metropolized HMC mix… ▽ More

    Submitted 13 June, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: 31 pages. v2 propagates changes from COLT 2020 camera-ready

  47. arXiv:2002.02304  [pdf, other

    cs.DS math.OC

    Solving Tall Dense Linear Programs in Nearly Linear Time

    Authors: Jan van den Brand, Yin Tat Lee, Aaron Sidford, Zhao Song

    Abstract: In this paper we provide an $\tilde{O}(nd+d^{3})$ time randomized algorithm for solving linear programs with $d$ variables and $n$ constraints with high probability. To obtain this result we provide a robust, primal-dual $\tilde{O}(\sqrt{d})$-iteration interior point method inspired by the methods of Lee and Sidford (2014, 2019) and show how to efficiently implement this method using new data-stru… ▽ More

    Submitted 21 August, 2021; v1 submitted 6 February, 2020; originally announced February 2020.

  48. arXiv:1911.10765  [pdf, other

    cs.DS cs.DM

    Faster Matroid Intersection

    Authors: Deeparnab Chakrabarty, Yin Tat Lee, Aaron Sidford, Sahil Singla, Sam Chiu-wai Wong

    Abstract: In this paper we consider the classic matroid intersection problem: given two matroids $\M_{1}=(V,\I_{1})$ and $\M_{2}=(V,\I_{2})$ defined over a common ground set $V$, compute a set $S\in\I_{1}\cap\I_{2}$ of largest possible cardinality, denoted by $r$. We consider this problem both in the setting where each $\M_{i}$ is accessed through an independence oracle, i.e. a routine which returns whether… ▽ More

    Submitted 25 November, 2019; originally announced November 2019.

    Comments: 38 pages. Preliminary version appeared in FOCS 2019

  49. arXiv:1911.05656  [pdf, other

    cs.DS

    Strong Self-Concordance and Sampling

    Authors: Aditi Laddha, Yin Tat Lee, Santosh Vempala

    Abstract: Motivated by the Dikin walk, we develop aspects of an interior-point theory for sampling in high dimension. Specifically, we introduce a symmetric parameter and the notion of strong self-concordance. These properties imply that the corresponding Dikin walk mixes in $\tilde{O}(n\barν)$ steps from a warm start in a convex body in $\mathbb{R}^{n}$ using a strongly self-concordant barrier with symmetr… ▽ More

    Submitted 9 July, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

  50. arXiv:1911.00612  [pdf, other

    cs.CG cs.DS

    Computing Circle Packing Representations of Planar Graphs

    Authors: Sally Dong, Yin Tat Lee, Kent Quanrud

    Abstract: The Circle Packing Theorem states that every planar graph can be represented as the tangency graph of a family of internally-disjoint circles. A well-known generalization is the Primal-Dual Circle Packing Theorem for 3-connected planar graphs. The existence of these representations has widespread applications in theoretical computer science and mathematics; however, the algorithmic aspect has rece… ▽ More

    Submitted 1 November, 2019; originally announced November 2019.

    Comments: 19 pages, 10 figures. SODA 2020