Skip to main content

Showing 1–35 of 35 results for author: Vakili, S

  1. arXiv:2406.15250  [pdf, ps, other

    cs.LG

    Open Problem: Order Optimal Regret Bounds for Kernel-Based Reinforcement Learning

    Authors: Sattar Vakili

    Abstract: Reinforcement Learning (RL) has shown great empirical success in various application domains. The theoretical aspects of the problem have been extensively studied over past decades, particularly under tabular and linear Markov Decision Process structures. Recently, non-linear function approximation using kernel-based prediction has gained traction. This approach is particularly interesting as it n… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Open problem track. Conference on Learning Theory (COLT 2024)

  2. arXiv:2312.09674  [pdf, ps, other

    cs.LG cs.MA stat.ML

    Optimal Regret Bounds for Collaborative Learning in Bandits

    Authors: Amitis Shidani, Sattar Vakili

    Abstract: We consider regret minimization in a general collaborative multi-agent multi-armed bandit model, in which each agent faces a finite set of arms and may communicate with other agents through a central controller. The optimal arm for each agent in this model is the arm with the largest expected mixed reward, where the mixed reward of each arm is a weighted average of its rewards across all agents, m… ▽ More

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Algorithmic Learning Theory (ALT) 2024

  3. arXiv:2311.04731  [pdf, other

    cs.LG stat.ML

    Robust Best-arm Identification in Linear Bandits

    Authors: Wei Wang, Sattar Vakili, Ilija Bogunovic

    Abstract: We study the robust best-arm identification problem (RBAI) in the case of linear rewards. The primary objective is to identify a near-optimal robust arm, which involves selecting arms at every round and assessing their robustness by exploring potential adversarial actions. This approach is particularly relevant when utilizing a simulator and seeking to identify a robust solution for real-world tra… ▽ More

    Submitted 8 November, 2023; originally announced November 2023.

  4. arXiv:2310.15351  [pdf, other

    cs.LG stat.ML

    Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency

    Authors: Sudeep Salgia, Sattar Vakili, Qing Zhao

    Abstract: We consider Bayesian optimization using Gaussian Process models, also referred to as kernel-based bandit optimization. We study the methodology of exploring the domain using random samples drawn from a distribution. We show that this random exploration approach achieves the optimal error rates. Our analysis is based on novel concentration bounds in an infinite dimensional Hilbert space established… ▽ More

    Submitted 2 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  5. arXiv:2310.10053  [pdf, ps, other

    cs.AR

    Fast and Low-Cost Approximate Multiplier for FPGAs using Dynamic Reconfiguration

    Authors: Shervin Vakili, Mobin Vaziri, Amirhossein Zarei, J. M. Pierre Langlois

    Abstract: Multipliers are widely-used arithmetic operators in digital signal processing and machine learning circuits. Due to their relatively high complexity, they can have high latency and be a significant source of power consumption. One strategy to alleviate these limitations is to use approximate computing. This paper thus introduces an original FPGA-based approximate multiplier specifically optimized… ▽ More

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 5 figures, 3 tables

  6. arXiv:2310.01609  [pdf, ps, other

    stat.ML cs.LG

    Adversarial Contextual Bandits Go Kernelized

    Authors: Gergely Neu, Julia Olkhovskaya, Sattar Vakili

    Abstract: We study a generalization of the problem of online learning in adversarial linear contextual bandits by incorporating loss functions that belong to a reproducing kernel Hilbert space, which allows for a more flexible modeling of complex decision-making scenarios. We propose a computationally efficient algorithm that makes use of a new optimistically biased estimator for the loss functions and achi… ▽ More

    Submitted 2 October, 2023; originally announced October 2023.

  7. arXiv:2308.05583  [pdf, other

    cs.AI cs.CE cs.NI stat.ML

    Generative Diffusion Models for Radio Wireless Channel Modelling and Sampling

    Authors: Ushnish Sengupta, Chinkuo Jao, Alberto Bernacchia, Sattar Vakili, Da-shan Shiu

    Abstract: Channel modelling is essential to designing modern wireless communication systems. The increasing complexity of channel modelling and the cost of collecting high-quality wireless channel data have become major challenges. In this paper, we propose a diffusion model based channel sampling approach for rapidly synthesizing channel realizations from limited data. We use a diffusion model with a U Net… ▽ More

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: 2023 IEEE Global Communications Conference

  8. arXiv:2306.07745  [pdf, other

    cs.LG cs.AI stat.ML

    Kernelized Reinforcement Learning with Order Optimal Regret Bounds

    Authors: Sattar Vakili, Julia Olkhovskaya

    Abstract: Reinforcement learning (RL) has shown empirical success in various real world settings with complex models and large state-action spaces. The existing analytical results, however, typically focus on settings with a small number of state-actions or simple models such as linearly modeled state-action value functions. To derive RL policies that efficiently handle large state-action spaces with more g… ▽ More

    Submitted 14 March, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS), 2023. In the previous version, we utilized Lemma C.1 from Yang et al., 2020a to bound the RKHS norm of the kernel ridge predictor. In the current version, this is proven in Lemma 5

  9. arXiv:2306.00501  [pdf, other

    cs.CV cs.AI cs.LG

    Image generation with shortest path diffusion

    Authors: Ayan Das, Stathi Fotiadis, Anil Batra, Farhang Nabiei, FengTing Liao, Sattar Vakili, Da-shan Shiu, Alberto Bernacchia

    Abstract: The field of image generation has made significant progress thanks to the introduction of Diffusion Models, which learn to progressively reverse a given image corruption. Recently, a few studies introduced alternative ways of corrupting images in Diffusion Models, with an emphasis on blurring. However, these studies are purely empirical and it remains unclear what is the optimal procedure for corr… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: AD and SF contributed equally

  10. arXiv:2302.08436  [pdf, other

    stat.ML cs.LG

    Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow

    Authors: Victor Picheny, Joel Berkeley, Henry B. Moss, Hrvoje Stojic, Uri Granta, Sebastian W. Ober, Artem Artemev, Khurram Ghani, Alexander Goodall, Andrei Paleyes, Sattar Vakili, Sergio Pascual-Diaz, Stratis Markou, Jixiang Qing, Nasrulloh R. B. S Loka, Ivo Couckuyt

    Abstract: We present Trieste, an open-source Python package for Bayesian optimization and active learning benefiting from the scalability and efficiency of TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based models within sequential decision-making loops, e.g. Gaussian processes from GPflow or GPflux, or neural networks from Keras. This modular mindset is central to the package and… ▽ More

    Submitted 16 February, 2023; originally announced February 2023.

  11. arXiv:2302.00727  [pdf, ps, other

    cs.LG cs.AI stat.ML

    Sample Complexity of Kernel-Based Q-Learning

    Authors: Sing-Yuan Yeh, Fu-Chieh Chang, Chang-Wei Yueh, Pei-Yuan Wu, Alberto Bernacchia, Sattar Vakili

    Abstract: Modern reinforcement learning (RL) often faces an enormous state-action space. Existing analytical results are typically for settings with a small number of state-actions, or simple models such as linearly modeled Q-functions. To derive statistically efficient RL policies handling large state-action spaces, with more general Q-functions, some recent works have considered nonlinear function approxi… ▽ More

    Submitted 1 February, 2023; originally announced February 2023.

  12. arXiv:2302.00392  [pdf, other

    stat.ML cs.AI cs.LG

    Delayed Feedback in Kernel Bandits

    Authors: Sattar Vakili, Danyal Ahmed, Alberto Bernacchia, Ciara Pike-Burke

    Abstract: Black box optimisation of an unknown function from expensive and noisy evaluations is a ubiquitous problem in machine learning, academic research and industrial production. An abstraction of the problem can be formulated as a kernel based bandit problem (also known as Bayesian optimisation), where a learner aims at optimising a kernelized function through sequential noisy observations. The existin… ▽ More

    Submitted 1 February, 2023; originally announced February 2023.

  13. arXiv:2207.07948  [pdf, other

    stat.ML cs.LG

    Collaborative Learning in Kernel-based Bandits for Distributed Users

    Authors: Sudeep Salgia, Sattar Vakili, Qing Zhao

    Abstract: We study collaborative learning among distributed clients facilitated by a central server. Each client is interested in maximizing a personalized objective function that is a weighted sum of its local objective and a global objective. Each client has direct access to random bandit feedback on its local objective, but only has a partial view of the global objective and relies on information exchang… ▽ More

    Submitted 17 April, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

  14. arXiv:2206.00121  [pdf, ps, other

    cs.LG stat.ML

    Near-Optimal Collaborative Learning in Bandits

    Authors: Clémence Réda, Sattar Vakili, Emilie Kaufmann

    Abstract: This paper introduces a general multi-agent bandit model in which each agent is facing a finite set of arms and may communicate with other agents through a central controller in order to identify, in pure exploration, or play, in regret minimization, its optimal arm. The twist is that the optimal arm for each agent is the arm with largest expected mixed reward, where the mixed reward of an arm is… ▽ More

    Submitted 28 October, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

  15. arXiv:2206.00099  [pdf, other

    stat.ML cs.LG

    Provably and Practically Efficient Neural Contextual Bandits

    Authors: Sudeep Salgia, Sattar Vakili, Qing Zhao

    Abstract: We consider the neural contextual bandit problem. In contrast to the existing work which primarily focuses on ReLU neural nets, we consider a general set of smooth activation functions. Under this more general setting, (i) we derive non-asymptotic error bounds on the difference between an overparameterized neural net and its corresponding neural tangent kernel, (ii) we propose an algorithm with a… ▽ More

    Submitted 31 May, 2022; originally announced June 2022.

  16. arXiv:2202.04005  [pdf, ps, other

    cs.LG stat.ML

    Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

    Authors: Sattar Vakili, Jonathan Scarlett, Da-shan Shiu, Alberto Bernacchia

    Abstract: Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization. It is well known that a major downside for kernel-based models is the high computational cost; given a dataset of $n$ samples, the cost grows as $\mathcal{O}(n^3)$. Existing sparse approximation methods can yield a significant reduction in the… ▽ More

    Submitted 18 June, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: International Conference on Machine Learning (ICML) 2022

  17. arXiv:2110.15458  [pdf, ps, other

    stat.ML cs.LG math.ST

    Open Problem: Tight Online Confidence Intervals for RKHS Elements

    Authors: Sattar Vakili, Jonathan Scarlett, Tara Javidi

    Abstract: Confidence intervals are a crucial building block in the analysis of various online learning problems. The analysis of kernel based bandit and reinforcement learning problems utilize confidence intervals applicable to the elements of a reproducing kernel Hilbert space (RKHS). However, the existing confidence bounds do not appear to be tight, resulting in suboptimal regret bounds. In fact, the exis… ▽ More

    Submitted 28 October, 2021; originally announced October 2021.

  18. arXiv:2109.06099  [pdf, other

    cs.LG stat.ML

    Uniform Generalization Bounds for Overparameterized Neural Networks

    Authors: Sattar Vakili, Michael Bromberg, Jezabel Garcia, Da-shan Shiu, Alberto Bernacchia

    Abstract: An interesting observation in artificial neural networks is their favorable generalization error despite typically being extremely overparameterized. It is well known that the classical statistical learning methods often result in vacuous generalization errors in the case of overparameterized neural networks. Adopting the recently developed Neural Tangent (NT) kernel theory, we prove uniform gener… ▽ More

    Submitted 11 October, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

  19. arXiv:2108.09262  [pdf, other

    stat.ML cs.LG

    Optimal Order Simple Regret for Gaussian Process Bandits

    Authors: Sattar Vakili, Nacime Bouziani, Sepehr Jalali, Alberto Bernacchia, Da-shan Shiu

    Abstract: Consider the sequential optimization of a continuous, possibly non-convex, and expensive to evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). The state of the art analysis of several learning algorithms shows a significant gap between the lower and upper bounds on the simple regret performance. W… ▽ More

    Submitted 20 August, 2021; originally announced August 2021.

  20. arXiv:2010.13997  [pdf, other

    stat.ML cs.LG

    A Domain-Shrinking based Bayesian Optimization Algorithm with Order-Optimal Regret Performance

    Authors: Sudeep Salgia, Sattar Vakili, Qing Zhao

    Abstract: We consider sequential optimization of an unknown function in a reproducing kernel Hilbert space. We propose a Gaussian process-based algorithm and establish its order-optimal regret performance (up to a poly-logarithmic factor). This is the first GP-based algorithm with an order-optimal regret guarantee. The proposed algorithm is rooted in the methodology of domain shrinking realized through a se… ▽ More

    Submitted 29 October, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: Accepted to NeurIPS 2021

  21. arXiv:2010.00627  [pdf

    cs.AR

    CARLA: A Convolution Accelerator with a Reconfigurable and Low-Energy Architecture

    Authors: Mehdi Ahmadi, Shervin Vakili, J. M. Pierre Langlois

    Abstract: Convolutional Neural Networks (CNNs) have proven to be extremely accurate for image recognition, even outperforming human recognition capability. When deployed on battery-powered mobile devices, efficient computer architectures are required to enable fast and energy-efficient computation of costly convolution operations. Despite recent advances in hardware accelerator design for CNNs, two major pr… ▽ More

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: 12 pages

  22. arXiv:2009.06966  [pdf, ps, other

    stat.ML cs.IT cs.LG

    On Information Gain and Regret Bounds in Gaussian Process Bandits

    Authors: Sattar Vakili, Kia Khezeli, Victor Picheny

    Abstract: Consider the sequential optimization of an expensive to evaluate and possibly non-convex objective function $f$ from noisy feedback, that can be considered as a continuum-armed bandit problem. Upper bounds on the regret performance of several learning algorithms (GP-UCB, GP-TS, and their variants) are known under both a Bayesian (when $f$ is a sample from a Gaussian process (GP)) and a frequentist… ▽ More

    Submitted 9 March, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

  23. arXiv:2006.05356  [pdf, other

    stat.ML cs.LG

    Scalable Thompson Sampling using Sparse Gaussian Process Models

    Authors: Sattar Vakili, Henry Moss, Artem Artemev, Vincent Dutordoir, Victor Picheny

    Abstract: Thompson Sampling (TS) from Gaussian Process (GP) models is a powerful tool for the optimization of black-box functions. Although TS enjoys strong theoretical guarantees and convincing empirical performance, it incurs a large computational overhead that scales polynomially with the optimization budget. Recently, scalable TS methods based on sparse GP models have been proposed to increase the scope… ▽ More

    Submitted 5 November, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

  24. arXiv:2003.05482  [pdf, other

    stat.ML cs.LG math.OC

    Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization

    Authors: Sudeep Salgia, Qing Zhao, Sattar Vakili

    Abstract: A framework based on iterative coordinate minimization (CM) is developed for stochastic convex optimization. Given that exact coordinate minimization is impossible due to the unknown stochastic nature of the objective function, the crux of the proposed optimization algorithm is an optimal control of the minimization precision in each iteration. We establish the optimal precision control and the re… ▽ More

    Submitted 11 March, 2020; originally announced March 2020.

  25. arXiv:2003.04125  [pdf, other

    stat.ML cs.LG stat.ME

    Amortized variance reduction for doubly stochastic objectives

    Authors: Ayman Boustati, Sattar Vakili, James Hensman, ST John

    Abstract: Approximate inference in complex probabilistic models such as deep Gaussian processes requires the optimisation of doubly stochastic objective functions. These objectives incorporate randomness both from mini-batch subsampling of the data and from Monte Carlo estimation of expectations. If the gradient variance is high, the stochastic optimisation problem becomes difficult with a slow rate of conv… ▽ More

    Submitted 9 March, 2020; originally announced March 2020.

  26. arXiv:2002.07711  [pdf

    eess.SP cs.AR

    An Energy-Efficient Accelerator Architecture with Serial Accumulation Dataflow for Deep CNNs

    Authors: Mehdi Ahmadi, Shervin Vakili, J. M. Pierre Langlois

    Abstract: Convolutional Neural Networks (CNNs) have shown outstanding accuracy for many vision tasks during recent years. When deploying CNNs on portable devices and embedded systems, however, the large number of parameters and computations result in long processing time and low battery life. An important factor in designing CNN hardware accelerators is to efficiently map the convolution computation onto ha… ▽ More

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: 4 pages

  27. arXiv:2002.05096  [pdf, ps, other

    stat.ML cs.LG

    Regret Bounds for Noise-Free Kernel-Based Bandits

    Authors: Sattar Vakili

    Abstract: Kernel-based bandit is an extensively studied black-box optimization problem, in which the objective function is assumed to live in a known reproducing kernel Hilbert space. While nearly optimal regret bounds (up to logarithmic factors) are established in the noisy setting, surprisingly, less is known about the noise-free setting (when the exact values of the underlying function is accessible with… ▽ More

    Submitted 24 June, 2022; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: Conference on Learning Theory (COLT) 2022

  28. arXiv:1912.02493  [pdf, other

    stat.ML cs.LG math.OC

    Ordinal Bayesian Optimisation

    Authors: Victor Picheny, Sattar Vakili, Artem Artemev

    Abstract: Bayesian optimisation is a powerful tool to solve expensive black-box problems, but fails when the stationary assumption made on the objective function is strongly violated, which is the case in particular for ill-conditioned or discontinuous objectives. We tackle this problem by proposing a new Bayesian optimisation framework that only considers the ordering of variables, both in the input and ou… ▽ More

    Submitted 5 December, 2019; originally announced December 2019.

  29. arXiv:1905.06821  [pdf, other

    stat.ML cs.LG

    Adaptive Sensor Placement for Continuous Spaces

    Authors: James A Grant, Alexis Boukouvalas, Ryan-Rhys Griffiths, David S Leslie, Sattar Vakili, Enrique Munoz de Cote

    Abstract: We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events. We present a new formulation of the problem as a continuum-armed bandit problem with feedback in the form of partial observations of realisations of an inhomogeneous Poisson process. We design a solution method by combining Thompson sampling with nonparametric inference via increasing… ▽ More

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: 13 pages, accepted to ICML 2019

  30. arXiv:1901.05947  [pdf, other

    stat.ML cs.LG math.OC

    Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

    Authors: Sattar Vakili, Sudeep Salgia, Qing Zhao

    Abstract: Online minimization of an unknown convex function over the interval $[0,1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing the distribution of the random gradients, a learning algorithm sequentially chooses query points with the objective of minimizing regret defined as the expected… ▽ More

    Submitted 20 February, 2020; v1 submitted 17 January, 2019; originally announced January 2019.

  31. arXiv:1807.09089  [pdf, other

    stat.ML cs.LG

    Decision Variance in Online Learning

    Authors: Sattar Vakili, Alexis Boukouvalas, Qing Zhao

    Abstract: Online learning has traditionally focused on the expected rewards. In this paper, a risk-averse online learning problem under the performance measure of the mean-variance of the rewards is studied. Both the bandit and full information settings are considered. The performance of several existing policies is analyzed, and new fundamental limitations on risk-averse learning is established. In particu… ▽ More

    Submitted 14 March, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

  32. arXiv:1802.04339  [pdf, ps, other

    cs.LG

    Multi-Armed Bandits on Partially Revealed Unit Interval Graphs

    Authors: Xiao Xu, Sattar Vakili, Qing Zhao, Ananthram Swami

    Abstract: A stochastic multi-armed bandit problem with side information on the similarity and dissimilarity across different arms is considered. The action space of the problem can be represented by a unit interval graph (UIG) where each node represents an arm and the presence (absence) of an edge between two nodes indicates similarity (dissimilarity) between their mean rewards. Two settings of complete and… ▽ More

    Submitted 31 August, 2019; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: Parts of the work have been presented at the 36th IEEE Military Communication Conference (MILCOM), October, 2017 and the 52nd Asilomar Conference on Signals, Systems and Computers, October, 2018

  33. arXiv:1709.03573  [pdf, ps, other

    cs.LG

    Anomaly Detection in Hierarchical Data Streams under Unknown Models

    Authors: Sattar Vakili, Qing Zhao, Chang Liu, Chen-Nee Chuah

    Abstract: We consider the problem of detecting a few targets among a large number of hierarchical data streams. The data streams are modeled as random processes with unknown and potentially heavy-tailed distributions. The objective is an active inference strategy that determines, sequentially, which data stream to collect samples from in order to minimize the sample complexity under a reliability constraint… ▽ More

    Submitted 11 September, 2017; originally announced September 2017.

  34. Risk-Averse Multi-Armed Bandit Problems under Mean-Variance Measure

    Authors: Sattar Vakili, Qing Zhao

    Abstract: The multi-armed bandit problems have been studied mainly under the measure of expected total reward accrued over a horizon of length $T$. In this paper, we address the issue of risk in multi-armed bandit problems and develop parallel results under the measure of mean-variance, a commonly adopted risk measure in economics and mathematical finance. We show that the model-specific regret and the mode… ▽ More

    Submitted 14 August, 2017; v1 submitted 18 April, 2016; originally announced April 2016.

  35. arXiv:1106.6104  [pdf, ps, other

    math.OC cs.LG eess.SY math.PR math.ST

    Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems

    Authors: Sattar Vakili, Keqin Liu, Qing Zhao

    Abstract: In the Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward models. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. An approach based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) is developed for constructing sequential arm selection policies. It is shown that for all ligh… ▽ More

    Submitted 9 March, 2013; v1 submitted 29 June, 2011; originally announced June 2011.

    Comments: 22 pages, 2 figures

    MSC Class: 62L05; 93E35 (Primary); 60G40 (Secondary)