Skip to main content

Showing 1–23 of 23 results for author: Ruszczyński, A

  1. arXiv:2405.10815  [pdf, other

    math.OC cs.LG stat.ML

    A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization

    Authors: Andrzej Ruszczyński, Shangzhe Yang

    Abstract: We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling, reinforceme… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

    MSC Class: 90C15; 49J52; 60-08

  2. arXiv:2312.01432  [pdf, other

    cs.LG math.OC

    Fast Dual Subgradient Optimization of the Integrated Transportation Distance Between Stochastic Kernels

    Authors: Zhengqi Lin, Andrzej Ruszczynski

    Abstract: A generalization of the Wasserstein metric, the integrated transportation distance, establishes a novel distance between probability kernels of Markov systems. This metric serves as the foundation for an efficient approximation technique, enabling the replacement of the original system's kernel with a kernel with a discrete support of limited cardinality. To facilitate practical implementation, we… ▽ More

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2311.06645

  3. arXiv:2312.00946  [pdf, other

    math.OC

    Risk-Averse Control of Markov Systems with Value Function Learning

    Authors: Andrzej Ruszczynski, Shangzhe Yang

    Abstract: We consider a control problem for a finite-state Markov system whose performance is evaluated by a coherent Markov risk measure. For each policy, the risk of a state is approximated by a function of its features, thus leading to a lower-dimensional policy evaluation problem, which involves non-differentiable stochastic operators. We introduce mini-batch transition risk mappings, which are particul… ▽ More

    Submitted 1 December, 2023; originally announced December 2023.

  4. An Integrated Transportation Distance Between Kernels and Approximate Dynamic Risk Evaluation in Markov Systems

    Authors: Zhengqi Lin, Andrzej Ruszczynski

    Abstract: We introduce a distance between kernels based on the Wasserstein distances between their values, study its properties, and prove that it is a metric on an appropriately defined space of kernels. We also relate it to various modes of convergence in the space of kernels. Then we consider the problem of approximating solutions to forward--backward systems, where the forward part is a Markov system de… ▽ More

    Submitted 11 November, 2023; originally announced November 2023.

  5. arXiv:2301.06619  [pdf, other

    math.OC math.ST

    Distributionally Robust Learning with Weakly Convex Losses: Convergence Rates and Finite-Sample Guarantees

    Authors: Landi Zhu, Mert Gürbüzbalaban, Andrzej Ruszczyński

    Abstract: We consider a distributionally robust stochastic optimization problem and formulate it as a stochastic two-level composition optimization problem with the use of the mean--semideviation risk measure. In this setting, we consider a single time-scale algorithm, involving two versions of the inner function value tracking: linearized tracking of a continuously differentiable loss function, and SPIDER… ▽ More

    Submitted 9 June, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

  6. arXiv:2206.09235  [pdf, ps, other

    math.OC math.PR q-fin.PM

    Risk Filtering and Risk-Averse Control of Markovian Systems Subject to Model Uncertainty

    Authors: Tomasz R. Bielecki, Igor Cialenco, Andrzej Ruszczyński

    Abstract: We consider a Markov decision process subject to model uncertainty in a Bayesian framework, where we assume that the state process is observed but its law is unknown to the observer. In addition, while the state process and the controls are observed at time $t$, the actual cost that may depend on the unknown parameter is not known at time $t$. The controller optimizes total cost by using a family… ▽ More

    Submitted 18 June, 2022; originally announced June 2022.

  7. arXiv:2006.04873  [pdf, other

    math.OC cs.LG math.ST

    A Stochastic Subgradient Method for Distributionally Robust Non-Convex Learning

    Authors: Mert Gürbüzbalaban, Andrzej Ruszczyński, Landi Zhu

    Abstract: We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to uncertainty in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses semi-deviation risk for quantifying uncertainty, allowing us to compute solutions th… ▽ More

    Submitted 7 June, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

  8. arXiv:2003.00780  [pdf, other

    math.OC cs.LG

    Risk-Averse Learning by Temporal Difference Methods

    Authors: Umit Kose, Andrzej Ruszczynski

    Abstract: We consider reinforcement learning with performance evaluated by a dynamic risk measure. We construct a projected risk-averse dynamic programming equation and study its properties. Then we propose risk-averse counterparts of the methods of temporal differences and we prove their convergence with probability one. We also perform an empirical study on a complex transportation problem.

    Submitted 2 March, 2020; originally announced March 2020.

    MSC Class: 49L20; 62L20; 90C39

  9. arXiv:2001.10669  [pdf, ps, other

    math.OC

    A Stochastic Subgradient Method for Nonsmooth Nonconvex Multi-Level Composition Optimization

    Authors: Andrzej Ruszczynski

    Abstract: We propose a single time-scale stochastic subgradient method for constrained optimization of a composition of several nonsmooth and nonconvex functions. The functions are assumed to be locally Lipschitz and differentiable in a generalized sense. Only stochastic estimates of the values and generalized derivatives of the functions are used. The method is parameter-free. We prove convergence with pro… ▽ More

    Submitted 18 December, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

    MSC Class: 90C15; 49J52; 62L20

  10. arXiv:2001.06924  [pdf, ps, other

    math.OC

    Subregular Recourse in Nonlinear Multistage Stochastic Optimization

    Authors: Darinka Dentcheva, Andrzej Ruszczynski

    Abstract: We consider nonlinear multistage stochastic optimization problems in the spaces of integrable functions. We allow for nonlinear dynamics and general objective functionals, including dynamic risk measures. We study causal operators describing the dynamics of the system and derive the Clarke subdifferential for a penalty function involving such operators. Then we introduce the concept of subregular… ▽ More

    Submitted 19 January, 2020; originally announced January 2020.

    MSC Class: 49K27; 90C15

  11. arXiv:1912.07580  [pdf, other

    math.OC

    Convergence of a Stochastic Subgradient Method with Averaging for Nonsmooth Nonconvex Constrained Optimization

    Authors: Andrzej Ruszczynski

    Abstract: We prove convergence of a single time-scale stochastic subgradient method with subgradient averaging for constrained problems with a nonsmooth and nonconvex objective function having the property of generalized differentiability. As a tool of our analysis, we also prove a chain rule on a path for such functions.

    Submitted 16 December, 2019; originally announced December 2019.

    MSC Class: 90C15; 90C48

  12. arXiv:1812.01094  [pdf, ps, other

    math.OC

    A Single Time-Scale Stochastic Approximation Method for Nested Stochastic Optimization

    Authors: Saeed Ghadimi, Andrzej Ruszczyński, Mengdi Wang

    Abstract: We study constrained nested stochastic optimization problems in which the objective function is a composition of two smooth functions whose exact values and derivatives are not available. We propose a single time-scale stochastic approximation algorithm, which we call the Nested Averaged Stochastic Approximation (NASA), to find an approximate stationary point of the problem. The algorithm has two… ▽ More

    Submitted 6 September, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  13. arXiv:1807.02300  [pdf, ps, other

    math.OC

    Risk Forms: Representation, Disintegration, and Application to Partially Observable Two-Stage Systems

    Authors: Darinka Dentcheva, Andrzej Ruszczynski

    Abstract: We introduce the concept of a risk form, which is a real functional of two arguments: a measurable function on a Polish space and a measure on that space. We generalize the duality theory and the Kusuoka representation to this setting. For a risk form acting on a product of Polish spaces, we define marginal and conditional forms and we prove a disintegration formula, which represents a risk form a… ▽ More

    Submitted 17 November, 2018; v1 submitted 6 July, 2018; originally announced July 2018.

    MSC Class: 90C15; 90C48

  14. arXiv:1701.08453  [pdf, ps, other

    math.OC

    Time-Consistent Risk Measures for Continuous-Time Markov Chains

    Authors: Darinka Dentcheva, Andrzej Ruszczynski

    Abstract: We develop an approach to time-consistent risk evaluation of continuous-time processes in Markov systems. Our analysis is based on dual representation of coherent risk measures, differentiability concepts for multivalued mappings, and a refined concept of time consistency. We prove that the risk measures are defined by a family of risk evaluation functionals (transition risk mappings), which depen… ▽ More

    Submitted 29 January, 2017; originally announced January 2017.

  15. arXiv:1701.06234  [pdf, other

    math.OC q-fin.MF

    A Dual Method For Backward Stochastic Differential Equations with Application to Risk Valuation

    Authors: Andrzej Ruszczynski, Jianing Yao

    Abstract: We propose a numerical recipe for risk evaluation defined by a backward stochastic differential equation. Using dual representation of the risk measure, we convert the risk valuation to a stochastic control problem where the control is a certain Radon-Nikodym derivative process. By exploring the maximum principle, we show that a piecewise-constant dual control provides a good approximation on a sh… ▽ More

    Submitted 20 August, 2020; v1 submitted 22 January, 2017; originally announced January 2017.

    Journal ref: ESAIM: COCV, 2020

  16. arXiv:1609.00842  [pdf, ps, other

    math.OC

    Rate of Convergence of the Bundle Method

    Authors: Yu Du, Andrzej Ruszczynski

    Abstract: We prove that the bundle method for nonsmooth optimization achieves solution accuracy $\varepsilon$ in at most $\mathcal{O}\big(\ln(1/\varepsilon)/\varepsilon\big)$ iterations, if the function is strongly convex. The result is true for the versions of the method with multiple cuts and with cut aggregation.

    Submitted 3 September, 2016; originally announced September 2016.

    Comments: arXiv admin note: text overlap with arXiv:1511.01716

  17. arXiv:1511.01716  [pdf, ps, other

    math.OC

    Selective Linearization For Multi-Block Convex Optimization

    Authors: Yu Du, Xiaodong Lin, Andrzej Ruszczynski

    Abstract: We consider the problem of minimizing a sum of several convex non-smooth functions. We introduce a new algorithm called the selective linearization method, which iteratively linearizes all but one of the functions and employs simple proximal steps. The algorithm is a form of multiple operator splitting in which the order of processing partial functions is not fixed, but rather determined in the co… ▽ More

    Submitted 12 August, 2016; v1 submitted 5 November, 2015; originally announced November 2015.

  18. arXiv:1508.05316  [pdf, ps, other

    math.OC

    Discrete-Time Approximation of Risk-Averse Control Problems for Diffusion Processes

    Authors: Andrzej Ruszczynski, Jianing Yao

    Abstract: We consider optimal control problems for diffusion processes, where the objective functional is defined by a time-consistent dynamic risk measure. We focus on coherent risk measures defined by $g$-evaluations. For such problems, we construct a family of time and space perturbed systems with piecewise-constant control functions. We obtain a regularized optimal value function by a special mollificat… ▽ More

    Submitted 19 August, 2016; v1 submitted 21 August, 2015; originally announced August 2015.

  19. arXiv:1504.02658  [pdf, ps, other

    math.ST

    Statistical Estimation of Composite Risk Functionals and Risk Optimization Problems

    Authors: Darinka Dentcheva, Spiridon Penev, Andrzej Ruszczynski

    Abstract: We address the statistical estimation of composite functionals which may be nonlinear in the probability measure. Our study is motivated by the need to estimate coherent measures of risk, which become increasingly popular in finance, insurance, and other areas associated with optimization under uncertainty and risk. We establish central limit formulae for composite risk functionals. Furthermore, w… ▽ More

    Submitted 10 April, 2015; originally announced April 2015.

  20. arXiv:1411.2675  [pdf, ps, other

    math.OC q-fin.PM

    Process-Based Risk Measures and Risk-Averse Control of Discrete-Time Systems

    Authors: Jingnan Fan, Andrzej Ruszczynski

    Abstract: For controlled discrete-time stochastic processes we introduce a new class of dynamic risk measures, which we call process-based. Their main features are that they measure risk of processes that are functions of the history of a base process. We introduce a new concept of conditional stochastic time consistency and we derive the structure of process-based risk measures enjoying this property. We s… ▽ More

    Submitted 29 November, 2016; v1 submitted 10 November, 2014; originally announced November 2014.

    MSC Class: 90C40; 49L20

  21. arXiv:1211.4469  [pdf, ps, other

    math.FA math.OC

    Common Mathematical Foundations of Expected Utility and Dual Utility Theories

    Authors: Darinka Dentcheva, Andrzej Ruszczynski

    Abstract: We show that the main results of the expected utility and dual utility theories can be derived in a unified way from two fundamental mathematical ideas: the separation principle of convex analysis, and integral representations of continuous linear functionals from functional analysis. Our analysis reveals the dual character of utility functions. We also derive new integral representations of dual… ▽ More

    Submitted 19 November, 2012; originally announced November 2012.

  22. arXiv:1203.5437  [pdf, other

    math.OC

    Risk-Averse Control of Undiscounted Transient Markov Models

    Authors: Ozlem Cavus, Andrzej Ruszczynski

    Abstract: We use Markov risk measures to formulate a risk-averse version of the undiscounted total cost problem for a transient controlled Markov process. We derive risk-averse dynamic programming equations and we show that a randomized policy may be strictly better than deterministic policies, when risk measures are employed. We illustrate the results on an optimal stopping problem and an organ transplant… ▽ More

    Submitted 22 March, 2014; v1 submitted 24 March, 2012; originally announced March 2012.

    MSC Class: 90C39; 90C40 (Primary) 91B30 (Secondary)

  23. arXiv:1201.0306  [pdf, other

    stat.CO

    Alternating Linearization for Structured Regularization Problems

    Authors: Xiaodong Lin, Minh Pham, Andrzej Ruszczynski

    Abstract: We adapt the alternating linearization method for proximal decomposition to structured regularization problems, in particular, to the generalized lasso problems. The method is related to two well-known operator splitting methods, the Douglas--Rachford and the Peaceman--Rachford method, but it has descent properties with respect to the objective function. This is achieved by employing a special upd… ▽ More

    Submitted 24 March, 2014; v1 submitted 31 December, 2011; originally announced January 2012.