Skip to main content

Showing 1–32 of 32 results for author: Livni, R

  1. arXiv:2406.15916  [pdf, ps, other

    cs.LG cs.CR stat.ML

    Credit Attribution and Stable Compression

    Authors: Roi Livni, Shay Moran, Kobbi Nissim, Chirag Pabbaraju

    Abstract: Credit attribution is crucial across various fields. In academic research, proper citation acknowledges prior work and establishes original contributions. Similarly, in generative models, such as those trained on existing artworks or music, it is important to ensure that any generated content influenced by these works appropriately credits the original creators. We study credit attribution by ma… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 15 pages, 1 figure

  2. arXiv:2404.04931  [pdf, other

    cs.LG math.OC

    The Sample Complexity of Gradient Descent in Stochastic Convex Optimization

    Authors: Roi Livni

    Abstract: We analyze the sample complexity of full-batch Gradient Descent (GD) in the setup of non-smooth Stochastic Convex Optimization. We show that the generalization error of GD, with common choice of hyper-parameters, can be $\tilde Θ(d/m + 1/\sqrt{m})$, where $d$ is the dimension and $m$ is the sample size. This matches the sample complexity of \emph{worst-case} empirical risk minimizers. That means t… ▽ More

    Submitted 10 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  3. arXiv:2403.17691  [pdf, other

    cs.CV cs.CL

    Not All Similarities Are Created Equal: Leveraging Data-Driven Biases to Inform GenAI Copyright Disputes

    Authors: Uri Hacohen, Adi Haviv, Shahar Sarfaty, Bruria Friedman, Niva Elkin-Koren, Roi Livni, Amit H Bermano

    Abstract: The advent of Generative Artificial Intelligence (GenAI) models, including GitHub Copilot, OpenAI GPT, and Stable Diffusion, has revolutionized content creation, enabling non-professionals to produce high-quality content across various domains. This transformative technology has led to a surge of synthetic content and sparked legal disputes over copyright infringement. To address these challenges,… ▽ More

    Submitted 7 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Presented at ACM CSLAW 2024

  4. arXiv:2402.09327  [pdf, other

    cs.LG

    Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization

    Authors: Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni, Daniel M. Roy

    Abstract: In this work, we investigate the interplay between memorization and learning in the context of \emph{stochastic convex optimization} (SCO). We define memorization via the information a learning algorithm reveals about its training data points. We then quantify this information using the framework of conditional mutual information (CMI) proposed by Steinke and Zakynthinou (2020). Our main result is… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 44 Pages

  5. arXiv:2311.05398  [pdf, other

    cs.LG stat.ML

    The Sample Complexity Of ERMs In Stochastic Convex Optimization

    Authors: Daniel Carmon, Roi Livni, Amir Yehudayoff

    Abstract: Stochastic convex optimization is one of the most well-studied models for learning in modern machine learning. Nevertheless, a central fundamental question in this setup remained unresolved: "How many data points must be observed so that any empirical risk minimizer (ERM) shows good performance on the true population?" This question was proposed by Feldman (2016), who proved that… ▽ More

    Submitted 9 November, 2023; originally announced November 2023.

  6. arXiv:2305.14822  [pdf, other

    cs.LG cs.CR

    Can Copyright be Reduced to Privacy?

    Authors: Niva Elkin-Koren, Uri Hacohen, Roi Livni, Shay Moran

    Abstract: There is a growing concern that generative AI models will generate outputs closely resembling the copyrighted materials for which they are trained. This worry has intensified as the quality and complexity of generative models have immensely improved, and the availability of extensive datasets containing copyrighted material has expanded. Researchers are actively exploring strategies to mitigate th… ▽ More

    Submitted 24 March, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

  7. arXiv:2302.04925  [pdf, ps, other

    cs.LG stat.ML

    Information Theoretic Lower Bounds for Information Theoretic Upper Bounds

    Authors: Roi Livni

    Abstract: We examine the relationship between the mutual information between the output model and the empirical sample and the generalization of the algorithm in the context of stochastic convex optimization. Despite increasing interest in information-theoretic generalization bounds, it is uncertain if these bounds can provide insight into the exceptional performance of various learning algorithms. Our stud… ▽ More

    Submitted 14 January, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

  8. arXiv:2206.03098  [pdf, ps, other

    cs.LG

    Better Best of Both Worlds Bounds for Bandits with Switching Costs

    Authors: Idan Amir, Guy Azov, Tomer Koren, Roi Livni

    Abstract: We study best-of-both-worlds algorithms for bandits with switching cost, recently addressed by Rouyer, Seldin and Cesa-Bianchi, 2021. We introduce a surprisingly simple and effective algorithm that simultaneously achieves minimax optimal regret bound of $\mathcal{O}(T^{2/3})$ in the oblivious adversarial setting and a bound of $\mathcal{O}(\min\{\log (T)/Δ^2,T^{2/3}\})$ in the stochastically-const… ▽ More

    Submitted 2 November, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

  9. arXiv:2204.08809  [pdf, ps, other

    cs.LG stat.ML

    Making Progress Based on False Discoveries

    Authors: Roi Livni

    Abstract: The study of adaptive data analysis examines how many statistical queries can be answered accurately using a fixed dataset while avoiding false discoveries (statistically inaccurate answers). In this paper, we tackle a question that precedes the field of study: Is data only valuable when it provides accurate answers to statistical queries? To answer this question, we use Stochastic Convex Optimiza… ▽ More

    Submitted 7 February, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

  10. arXiv:2202.13361  [pdf, other

    cs.LG math.OC stat.ML

    Benign Underfitting of Stochastic Gradient Descent

    Authors: Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman

    Abstract: We study to what extent may stochastic gradient descent (SGD) be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to training data. We consider the fundamental stochastic convex optimization framework, where (one pass, without-replacement) SGD is classically known to minimize the population risk at rate $O(1/\sqrt n)$, and prove that, su… ▽ More

    Submitted 12 January, 2023; v1 submitted 27 February, 2022; originally announced February 2022.

  11. arXiv:2202.13328  [pdf, ps, other

    cs.LG math.OC

    Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization

    Authors: Idan Amir, Roi Livni, Nathan Srebro

    Abstract: We consider linear prediction with a convex Lipschitz loss, or more generally, stochastic convex optimization problems of generalized linear form, i.e.~where each instantaneous loss is a scalar convex function of a linear function. We show that in this setting, early stopped Gradient Descent (GD), without any explicit regularization or projection, ensures excess error at most $ε$ (compared to the… ▽ More

    Submitted 30 October, 2022; v1 submitted 27 February, 2022; originally announced February 2022.

  12. arXiv:2107.00469  [pdf, ps, other

    math.OC cs.LG

    Never Go Full Batch (in Stochastic Convex Optimization)

    Authors: Idan Amir, Yair Carmon, Tomer Koren, Roi Livni

    Abstract: We study the generalization performance of $\text{full-batch}$ optimization algorithms for stochastic convex optimization: these are first-order methods that only access the exact gradient of the empirical risk (rather than gradients with respect to individual data points), that include a wide range of algorithms such as gradient descent, mirror descent, and their regularized and/or accelerated va… ▽ More

    Submitted 29 June, 2021; originally announced July 2021.

  13. arXiv:2106.13513  [pdf, ps, other

    cs.LG cs.CR cs.DS

    Littlestone Classes are Privately Online Learnable

    Authors: Noah Golowich, Roi Livni

    Abstract: We consider the problem of online classification under a privacy constraint. In this setting a learner observes sequentially a stream of labelled examples $(x_t, y_t)$, for $1 \leq t \leq T$, and returns at each iteration $t$ a hypothesis $h_t$ which is used to predict the label of each new example $x_t$. The learner's performance is measured by her regret against a known hypothesis class… ▽ More

    Submitted 25 June, 2021; originally announced June 2021.

  14. arXiv:2102.01646  [pdf, ps, other

    cs.LG cs.DS cs.GT stat.ML

    Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games

    Authors: Steve Hanneke, Roi Livni, Shay Moran

    Abstract: Which classes can be learned properly in the online model? -- that is, by an algorithm that at each round uses a predictor from the concept class. While there are simple and natural cases where improper learning is necessary, it is natural to ask how complex must the improper predictors be in such cases. Can one always achieve nearly optimal mistake/regret bounds using "simple" predictors? In th… ▽ More

    Submitted 2 February, 2021; originally announced February 2021.

  15. arXiv:2102.01117  [pdf, ps, other

    cs.LG stat.ML

    SGD Generalizes Better Than GD (And Regularization Doesn't Help)

    Authors: Idan Amir, Tomer Koren, Roi Livni

    Abstract: We give a new separation result between the generalization performance of stochastic gradient descent (SGD) and of full-batch gradient descent (GD) in the fundamental stochastic convex optimization model. While for SGD it is well-known that $O(1/ε^2)$ iterations suffice for obtaining a solution with $ε$ excess expected risk, we show that with the same number of steps GD may overfit and emit a solu… ▽ More

    Submitted 29 June, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Journal ref: Conference on Learning Theory 2021

  16. arXiv:2006.13508  [pdf, ps, other

    cs.LG stat.ML

    A Limitation of the PAC-Bayes Framework

    Authors: Roi Livni, Shay Moran

    Abstract: PAC-Bayes is a useful framework for deriving generalization bounds which was introduced by McAllester ('98). This framework has the flexibility of deriving distribution- and algorithm-dependent bounds, which are often tighter than VC-related uniform convergence bounds. In this manuscript we present a limitation for the PAC-Bayes framework. We demonstrate an easy learning task that is not amenable… ▽ More

    Submitted 3 September, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

  17. arXiv:2003.06152  [pdf, other

    cs.LG stat.ML

    Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study

    Authors: Assaf Dauber, Meir Feder, Tomer Koren, Roi Livni

    Abstract: The notion of implicit bias, or implicit regularization, has been suggested as a means to explain the surprising generalization ability of modern-days overparameterized learning algorithms. This notion refers to the tendency of the optimization algorithm towards a certain structured solution that often generalizes well. Recently, several papers have studied implicit regularization and were able to… ▽ More

    Submitted 22 December, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

  18. arXiv:2003.00563  [pdf, other

    cs.LG cs.CR stat.ML

    An Equivalence Between Private Classification and Online Prediction

    Authors: Mark Bun, Roi Livni, Shay Moran

    Abstract: We prove that every concept class with finite Littlestone dimension can be learned by an (approximate) differentially-private algorithm. This answers an open question of Alon et al. (STOC 2019) who proved the converse statement (this question was also asked by Neel et al.~(FOCS 2019)). Together these two results yield an equivalence between online learnability and private PAC learnability. We in… ▽ More

    Submitted 22 June, 2021; v1 submitted 1 March, 2020; originally announced March 2020.

    Comments: An earlier version of this manuscript claimed an upper bound over the sample complexity that is exponential in the Littlestone dimension. The argument was erranous, and the current version contains a correction, which leads to double-exponential dependence in the Littlestone-dimension

  19. arXiv:2002.10286  [pdf, other

    cs.LG stat.ML

    Prediction with Corrupted Expert Advice

    Authors: Idan Amir, Idan Attias, Tomer Koren, Roi Livni, Yishay Mansour

    Abstract: We revisit the fundamental problem of prediction with expert advice, in a setting where the environment is benign and generates losses stochastically, but the feedback observed by the learner is subject to a moderate adversarial corruption. We prove that a variant of the classical Multiplicative Weights algorithm with decreasing step sizes achieves constant regret in this setting and performs opti… ▽ More

    Submitted 20 October, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: NeurIPS 2020 Camera Ready

    Journal ref: Conference on Neural Information Processing Systems 2020

  20. arXiv:1906.00264  [pdf, ps, other

    cs.LG stat.ML

    Graph-based Discriminators: Sample Complexity and Expressiveness

    Authors: Roi Livni, Yishay Mansour

    Abstract: A basic question in learning theory is to identify if two distributions are identical when we have access only to examples sampled from the distributions. This basic task is considered, for example, in the context of Generative Adversarial Networks (GANs), where a discriminator is trained to distinguish between a real-life distribution and a synthetic distribution. % Classically, we use a hypothes… ▽ More

    Submitted 1 June, 2019; originally announced June 2019.

  21. arXiv:1902.04782  [pdf, ps, other

    cs.LG stat.ML

    On the Expressive Power of Kernel Methods and the Efficiency of Kernel Learning by Association Schemes

    Authors: Pravesh K. Kothari, Roi Livni

    Abstract: We study the expressive power of kernel methods and the algorithmic feasibility of multiple kernel learning for a special rich class of kernels. Specifically, we define \emph{Euclidean kernels}, a diverse class that includes most, if not all, families of kernels studied in literature such as polynomial kernels and radial basis functions. We then describe the geometric and spectral structure of t… ▽ More

    Submitted 13 February, 2019; originally announced February 2019.

  22. arXiv:1902.03468  [pdf, ps, other

    cs.LG stat.ML

    Synthetic Data Generators: Sequential and Private

    Authors: Olivier Bousquet, Roi Livni, Shay Moran

    Abstract: We study the sample complexity of private synthetic data generation over an unbounded sized class of statistical queries, and show that any class that is privately proper PAC learnable admits a private synthetic data generator (perhaps non-efficient). Previous work on synthetic data generators focused on the case that the query class $\mathcal{D}$ is finite and obtained sample complexity bounds th… ▽ More

    Submitted 7 December, 2020; v1 submitted 9 February, 2019; originally announced February 2019.

  23. arXiv:1806.00949  [pdf, other

    cs.LG cs.AI cs.CR math.LO stat.ML

    Private PAC learning implies finite Littlestone dimension

    Authors: Noga Alon, Roi Livni, Maryanthe Malliaris, Shay Moran

    Abstract: We show that every approximately differentially private learning algorithm (possibly improper) for a class $H$ with Littlestone dimension~$d$ requires $Ω\bigl(\log^*(d)\bigr)$ examples. As a corollary it follows that the class of thresholds over $\mathbb{N}$ can not be learned in a private manner; this resolves open question due to [Bun et al., 2015, Feldman and Xiao, 2015]. We leave as an open qu… ▽ More

    Submitted 8 March, 2019; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: STOC camera-ready version

  24. arXiv:1711.05893  [pdf, other

    cs.LG cs.CC cs.IT

    On Communication Complexity of Classification Problems

    Authors: Daniel M. Kane, Roi Livni, Shay Moran, Amir Yehudayoff

    Abstract: This work studies distributed learning in the spirit of Yao's model of communication complexity: consider a two-party setting, where each of the players gets a list of labelled examples and they communicate in order to jointly perform some learning task. To naturally fit into the framework of learning theory, the players can send each other examples (as well as bits) where each example/bit costs o… ▽ More

    Submitted 23 April, 2018; v1 submitted 15 November, 2017; originally announced November 2017.

  25. arXiv:1710.08997  [pdf, ps, other

    cs.LG

    Multi-Armed Bandits with Metric Movement Costs

    Authors: Tomer Koren, Roi Livni, Yishay Mansour

    Abstract: We consider the non-stochastic Multi-Armed Bandit problem in a setting where there is a fixed and known metric on the action space that determines a cost for switching between any pair of actions. The loss of the online learner has two components: the first is the usual loss of the selected actions, and the second is an additional loss due to switching between actions. Our main contribution gives… ▽ More

    Submitted 24 October, 2017; originally announced October 2017.

  26. arXiv:1709.03871  [pdf, ps, other

    cs.LG

    Agnostic Learning by Refuting

    Authors: Pravesh K. Kothari, Roi Livni

    Abstract: The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of \emph{efficient} agnostic learning. We introduce \emph{refutation complexity}, a natural computational analog of Rademacher complexity of a Boolean concept class and show that it exactly characterizes the sample c… ▽ More

    Submitted 30 November, 2017; v1 submitted 12 September, 2017; originally announced September 2017.

  27. arXiv:1702.07444  [pdf, ps, other

    cs.LG cs.GT

    Bandits with Movement Costs and Adaptive Pricing

    Authors: Tomer Koren, Roi Livni, Yishay Mansour

    Abstract: We extend the model of Multi-armed Bandit with unit switching cost to incorporate a metric between the actions. We consider the case where the metric over the actions can be modeled by a complete binary tree, and the distance between two leaves is the size of the subtree of their least common ancestor, which abstracts the case that the actions are points on the continuous interval $[0,1]$ and the… ▽ More

    Submitted 23 February, 2017; originally announced February 2017.

  28. arXiv:1606.05316  [pdf, other

    cs.LG

    Learning Infinite-Layer Networks: Without the Kernel Trick

    Authors: Roi Livni, Daniel Carmon, Amir Globerson

    Abstract: Infinite--Layer Networks (ILN) have recently been proposed as an architecture that mimics neural networks while enjoying some of the advantages of kernel methods. ILN are networks that integrate over infinitely many nodes within a single hidden layer. It has been demonstrated by several authors that the problem of learning ILN can be reduced to the kernel trick, implying that whenever a certain in… ▽ More

    Submitted 28 July, 2017; v1 submitted 16 June, 2016; originally announced June 2016.

  29. arXiv:1603.06352  [pdf, ps, other

    cs.LG

    Online Learning with Low Rank Experts

    Authors: Elad Hazan, Tomer Koren, Roi Livni, Yishay Mansour

    Abstract: We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown $d$-dimensional subspace. We devise algorithms with regret bounds that are independent of the number of experts and depend only on the rank $d$. For the stochastic model we show a tight bound of $Θ(\sqrt{dT})$, and extend it to a setting of an ap… ▽ More

    Submitted 23 May, 2016; v1 submitted 21 March, 2016; originally announced March 2016.

  30. arXiv:1501.03273  [pdf, other

    cs.LG

    Classification with Low Rank and Missing Data

    Authors: Elad Hazan, Roi Livni, Yishay Mansour

    Abstract: We consider classification and regression tasks where we have missing data and assume that the (clean) data resides in a low rank subspace. Finding a hidden subspace is known to be computationally hard. Nevertheless, using a non-proper formulation we give an efficient agnostic algorithm that classifies as good as the best linear classifier coupled with the best low-dimensional subspace in which th… ▽ More

    Submitted 14 January, 2015; originally announced January 2015.

  31. arXiv:1410.1141  [pdf, other

    cs.LG cs.AI stat.ML

    On the Computational Efficiency of Training Neural Networks

    Authors: Roi Livni, Shai Shalev-Shwartz, Ohad Shamir

    Abstract: It is well-known that neural networks are computationally hard to train. On the other hand, in practice, modern day neural networks are trained efficiently using SGD and a variety of tricks that include different activation functions (e.g. ReLU), over-specification (i.e., train networks which are larger than needed), and regularization. In this paper we revisit the computational complexity of trai… ▽ More

    Submitted 28 October, 2014; v1 submitted 5 October, 2014; originally announced October 2014.

    Comments: Section 2 is revised due to a mistake

  32. arXiv:1304.7045  [pdf, other

    cs.LG cs.AI stat.ML

    An Algorithm for Training Polynomial Networks

    Authors: Roi Livni, Shai Shalev-Shwartz, Ohad Shamir

    Abstract: We consider deep neural networks, in which the output of each node is a quadratic function of its inputs. Similar to other deep architectures, these networks can compactly represent any function on a finite training set. The main goal of this paper is the derivation of an efficient layer-by-layer algorithm for training such networks, which we denote as the \emph{Basis Learner}. The algorithm is a… ▽ More

    Submitted 20 February, 2014; v1 submitted 25 April, 2013; originally announced April 2013.