-
A Survey of Classical And Quantum Sequence Models
Authors:
I-Chi Chen,
Harshdeep Singh,
V L Anukruti,
Brian Quanz,
Kavitha Yogaraj
Abstract:
Our primary objective is to conduct a brief survey of various classical and quantum neural net sequence models, which includes self-attention and recurrent neural networks, with a focus on recent quantum approaches proposed to work with near-term quantum devices, while exploring some basic enhancements for these quantum models. We re-implement a key representative set of these existing methods, ad…
▽ More
Our primary objective is to conduct a brief survey of various classical and quantum neural net sequence models, which includes self-attention and recurrent neural networks, with a focus on recent quantum approaches proposed to work with near-term quantum devices, while exploring some basic enhancements for these quantum models. We re-implement a key representative set of these existing methods, adapting an image classification approach using quantum self-attention to create a quantum hybrid transformer that works for text and image classification, and applying quantum self-attention and quantum recurrent neural networks to natural language processing tasks. We also explore different encoding techniques and introduce positional encoding into quantum self-attention neural networks leading to improved accuracy and faster convergence in text and image classification experiments. This paper also performs a comparative analysis of classical self-attention models and their quantum counterparts, helping shed light on the differences in these models and their performance.
△ Less
Submitted 15 December, 2023;
originally announced December 2023.
-
An Optimistic-Robust Approach for Dynamic Positioning of Omnichannel Inventories
Authors:
Pavithra Harsha,
Shivaram Subramanian,
Ali Koc,
Mahesh Ramakrishna,
Brian Quanz,
Dhruv Shah,
Chandra Narayanaswami
Abstract:
We introduce a new class of data-driven and distribution-free optimistic-robust bimodal inventory optimization (BIO) strategy to effectively allocate inventory across a retail chain to meet time-varying, uncertain omnichannel demand. While prior Robust optimization (RO) methods emphasize the downside, i.e., worst-case adversarial demand, BIO also considers the upside to remain resilient like RO wh…
▽ More
We introduce a new class of data-driven and distribution-free optimistic-robust bimodal inventory optimization (BIO) strategy to effectively allocate inventory across a retail chain to meet time-varying, uncertain omnichannel demand. While prior Robust optimization (RO) methods emphasize the downside, i.e., worst-case adversarial demand, BIO also considers the upside to remain resilient like RO while also reaping the rewards of improved average-case performance by overcoming the presence of endogenous outliers. This bimodal strategy is particularly valuable for balancing the tradeoff between lost sales at the store and the costs of cross-channel e-commerce fulfillment, which is at the core of our inventory optimization model. These factors are asymmetric due to the heterogenous behavior of the channels, with a bias towards the former in terms of lost-sales cost and a dependence on network effects for the latter. We provide structural insights about the BIO solution and how it can be tuned to achieve a preferred tradeoff between robustness and the average-case. Our experiments show that significant benefits can be achieved by rethinking traditional approaches to inventory management, which are siloed by channel and location. Using a real-world dataset from a large American omnichannel retail chain, a business value assessment during a peak period indicates over a 15% profitability gain for BIO over RO and other baselines while also preserving the (practical) worst case performance.
△ Less
Submitted 17 October, 2023;
originally announced October 2023.
-
Hierarchical Proxy Modeling for Improved HPO in Time Series Forecasting
Authors:
Arindam Jati,
Vijay Ekambaram,
Shaonli Pal,
Brian Quanz,
Wesley M. Gifford,
Pavithra Harsha,
Stuart Siegel,
Sumanta Mukherjee,
Chandra Narayanaswami
Abstract:
Selecting the right set of hyperparameters is crucial in time series forecasting. The classical temporal cross-validation framework for hyperparameter optimization (HPO) often leads to poor test performance because of a possible mismatch between validation and test periods. To address this test-validation mismatch, we propose a novel technique, H-Pro to drive HPO via test proxies by exploiting dat…
▽ More
Selecting the right set of hyperparameters is crucial in time series forecasting. The classical temporal cross-validation framework for hyperparameter optimization (HPO) often leads to poor test performance because of a possible mismatch between validation and test periods. To address this test-validation mismatch, we propose a novel technique, H-Pro to drive HPO via test proxies by exploiting data hierarchies often associated with time series datasets. Since higher-level aggregated time series often show less irregularity and better predictability as compared to the lowest-level time series which can be sparse and intermittent, we optimize the hyperparameters of the lowest-level base-forecaster by leveraging the proxy forecasts for the test period generated from the forecasters at higher levels. H-Pro can be applied on any off-the-shelf machine learning model to perform HPO. We validate the efficacy of our technique with extensive empirical evaluation on five publicly available hierarchical forecasting datasets. Our approach outperforms existing state-of-the-art methods in Tourism, Wiki, and Traffic datasets, and achieves competitive result in Tourism-L dataset, without any model-specific enhancements. Moreover, our method outperforms the winning method of the M5 forecast accuracy competition.
△ Less
Submitted 2 November, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Towards Creativity Characterization of Generative Models via Group-based Subset Scanning
Authors:
Celia Cintas,
Payel Das,
Brian Quanz,
Girmaw Abebe Tadesse,
Skyler Speakman,
Pin-Yu Chen
Abstract:
Deep generative models, such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), have been employed widely in computational creativity research. However, such models discourage out-of-distribution generation to avoid spurious sample generation, thereby limiting their creativity. Thus, incorporating research on human creativity into generative deep learning techniques pre…
▽ More
Deep generative models, such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), have been employed widely in computational creativity research. However, such models discourage out-of-distribution generation to avoid spurious sample generation, thereby limiting their creativity. Thus, incorporating research on human creativity into generative deep learning techniques presents an opportunity to make their outputs more compelling and human-like. As we see the emergence of generative models directed toward creativity research, a need for machine learning-based surrogate metrics to characterize creative output from these models is imperative. We propose group-based subset scanning to identify, quantify, and characterize creative processes by detecting a subset of anomalous node-activations in the hidden layers of the generative models. Our experiments on the standard image benchmarks, and their "creatively generated" variants, reveal that the proposed subset scores distribution is more useful for detecting creative processes in the activation space rather than the pixel space. Further, we found that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets. The node activations highlighted during the creative decoding process are different from those responsible for the normal sample generation. Lastly, we assess if the images from the subsets selected by our method were also found creative by human evaluators, presenting a link between creativity perception in humans and node activations within deep neural nets.
△ Less
Submitted 26 May, 2022; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Deep Policy Iteration with Integer Programming for Inventory Management
Authors:
Pavithra Harsha,
Ashish Jagmohan,
Jayant R. Kalagnanam,
Brian Quanz,
Divya Singhvi
Abstract:
We present a Reinforcement Learning (RL) based framework for optimizing long-term discounted reward problems with large combinatorial action space and state dependent constraints. These characteristics are common to many operations management problems, e.g., network inventory replenishment, where managers have to deal with uncertain demand, lost sales, and capacity constraints that results in more…
▽ More
We present a Reinforcement Learning (RL) based framework for optimizing long-term discounted reward problems with large combinatorial action space and state dependent constraints. These characteristics are common to many operations management problems, e.g., network inventory replenishment, where managers have to deal with uncertain demand, lost sales, and capacity constraints that results in more complex feasible action spaces. Our proposed Programmable Actor Reinforcement Learning (PARL) uses a deep-policy iteration method that leverages neural networks (NNs) to approximate the value function and combines it with mathematical programming (MP) and sample average approximation (SAA) to solve the per-step-action optimally while accounting for combinatorial action spaces and state-dependent constraint sets.
We show how the proposed methodology can be applied to complex inventory replenishment problems where analytical solutions are intractable. We also benchmark the proposed algorithm against state-of-the-art RL algorithms and commonly used replenishment heuristics and find that the proposed algorithm considerably outperforms existing methods by as much as 14.7\% on average in various supply chain settings.
This improvement in performance of PARL over benchmark algorithms can be attributed to better inventory cost management, especially in inventory constrained settings. Furthermore, in a simpler back order setting where the optimal solution is tractable, we find that the RL based policy also converges to the optimal policy. Finally, to make RL algorithms more accessible for inventory management researchers, we also discuss a modular Python library developed that can be used to test the performance of RL algorithms with various supply chain structures. This library can spur future research in developing practical and near-optimal algorithms for inventory management problems.
△ Less
Submitted 14 October, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
Learning to shortcut and shortlist order fulfillment deciding
Authors:
Brian Quanz,
Ajay Deshpande,
Dahai Xing,
Xuan Liu
Abstract:
With the increase of order fulfillment options and business objectives taken into consideration in the deciding process, order fulfillment deciding is becoming more and more complex. For example, with the advent of ship from store retailers now have many more fulfillment nodes to consider, and it is now common to take into account many and varied business goals in making fulfillment decisions. Wit…
▽ More
With the increase of order fulfillment options and business objectives taken into consideration in the deciding process, order fulfillment deciding is becoming more and more complex. For example, with the advent of ship from store retailers now have many more fulfillment nodes to consider, and it is now common to take into account many and varied business goals in making fulfillment decisions. With increasing complexity, efficiency of the deciding process can become a real concern. Finding the optimal fulfillment assignments among all possible ones may be too costly to do for every order especially during peak times. In this work, we explore the possibility of exploiting regularity in the fulfillment decision process to reduce the burden on the deciding system. By using data mining we aim to find patterns in past fulfillment decisions that can be used to efficiently predict most likely assignments for future decisions. Essentially, those assignments that can be predicted with high confidence can be used to shortcut, or bypass, the expensive deciding process, or else a set of most likely assignments can be used for shortlisting -- sending a much smaller set of candidates for consideration by the fulfillment deciding system.
△ Less
Submitted 4 October, 2021;
originally announced October 2021.
-
Predicting Deep Neural Network Generalization with Perturbation Response Curves
Authors:
Yair Schiff,
Brian Quanz,
Payel Das,
Pin-Yu Chen
Abstract:
The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of prediction tasks. However, despite these successes, the recent Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition suggests that there is a need for more robust and efficient measures of network generalization. In this work, we propose a new framework for evaluating the gener…
▽ More
The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of prediction tasks. However, despite these successes, the recent Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition suggests that there is a need for more robust and efficient measures of network generalization. In this work, we propose a new framework for evaluating the generalization capabilities of trained networks. We use perturbation response (PR) curves that capture the accuracy change of a given network as a function of varying levels of training sample perturbation. From these PR curves, we derive novel statistics that capture generalization capability. Specifically, we introduce two new measures for accurately predicting generalization gaps: the Gi-score and Pal-score, which are inspired by the Gini coefficient and Palma ratio (measures of income inequality), that accurately predict generalization gaps. Using our framework applied to intra and inter-class sample mixup, we attain better predictive scores than the current state-of-the-art measures on a majority of tasks in the PGDL competition. In addition, we show that our framework and the proposed statistics can be used to capture to what extent a trained network is invariant to a given parametric input transformation, such as rotation or translation. Therefore, these generalization gap prediction statistics also provide a useful means for selecting optimal network architectures and hyperparameters that are invariant to a certain perturbation.
△ Less
Submitted 26 October, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Gi and Pal Scores: Deep Neural Network Generalization Statistics
Authors:
Yair Schiff,
Brian Quanz,
Payel Das,
Pin-Yu Chen
Abstract:
The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of regression, classification, and control tasks. However, despite these successes, the field lacks strong theoretical error bounds and consistent measures of network generalization and learned invariances. In this work, we introduce two new measures, the Gi-score and Pal-score, that capture a deep ne…
▽ More
The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of regression, classification, and control tasks. However, despite these successes, the field lacks strong theoretical error bounds and consistent measures of network generalization and learned invariances. In this work, we introduce two new measures, the Gi-score and Pal-score, that capture a deep neural network's generalization capabilities. Inspired by the Gini coefficient and Palma ratio, measures of income inequality, our statistics are robust measures of a network's invariance to perturbations that accurately predict generalization gaps, i.e., the difference between accuracy on training and test sets.
△ Less
Submitted 9 June, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Towards creativity characterization of generative models via group-based subset scanning
Authors:
Celia Cintas,
Payel Das,
Brian Quanz,
Skyler Speakman,
Victor Akinwande,
Pin-Yu Chen
Abstract:
Deep generative models, such as Variational Autoencoders (VAEs), have been employed widely in computational creativity research. However, such models discourage out-of-distribution generation to avoid spurious sample generation, limiting their creativity. Thus, incorporating research on human creativity into generative deep learning techniques presents an opportunity to make their outputs more com…
▽ More
Deep generative models, such as Variational Autoencoders (VAEs), have been employed widely in computational creativity research. However, such models discourage out-of-distribution generation to avoid spurious sample generation, limiting their creativity. Thus, incorporating research on human creativity into generative deep learning techniques presents an opportunity to make their outputs more compelling and human-like. As we see the emergence of generative models directed to creativity research, a need for machine learning-based surrogate metrics to characterize creative output from these models is imperative. We propose group-based subset scanning to quantify, detect, and characterize creative processes by detecting a subset of anomalous node-activations in the hidden layers of generative models. Our experiments on original, typically decoded, and "creatively decoded" (Das et al 2020) image datasets reveal that the proposed subset scores distribution is more useful for detecting creative processes in the activation space rather than the pixel space. Further, we found that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets. The node activations highlighted during the creative decoding process are different from those responsible for normal sample generation.
△ Less
Submitted 26 May, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Temporal Latent Auto-Encoder: A Method for Probabilistic Multivariate Time Series Forecasting
Authors:
Nam Nguyen,
Brian Quanz
Abstract:
Probabilistic forecasting of high dimensional multivariate time series is a notoriously challenging task, both in terms of computational burden and distribution modeling. Most previous work either makes simple distribution assumptions or abandons modeling cross-series correlations. A promising line of work exploits scalable matrix factorization for latent-space forecasting, but is limited to linea…
▽ More
Probabilistic forecasting of high dimensional multivariate time series is a notoriously challenging task, both in terms of computational burden and distribution modeling. Most previous work either makes simple distribution assumptions or abandons modeling cross-series correlations. A promising line of work exploits scalable matrix factorization for latent-space forecasting, but is limited to linear embeddings, unable to model distributions, and not trainable end-to-end when using deep learning forecasting. We introduce a novel temporal latent auto-encoder method which enables nonlinear factorization of multivariate time series, learned end-to-end with a temporal deep learning latent space forecast model. By imposing a probabilistic latent space model, complex distributions of the input series are modeled via the decoder. Extensive experiments demonstrate that our model achieves state-of-the-art performance on many popular multivariate datasets, with gains sometimes as high as $50\%$ for several standard metrics.
△ Less
Submitted 25 January, 2021;
originally announced January 2021.
-
Practical application improvement to Quantum SVM: theory to practice
Authors:
Jae-Eun Park,
Brian Quanz,
Steve Wood,
Heather Higgins,
Ray Harishankar
Abstract:
Quantum machine learning (QML) has emerged as an important area for Quantum applications, although useful QML applications would require many qubits. Therefore our paper is aimed at exploring the successful application of the Quantum Support Vector Machine (QSVM) algorithm while balancing several practical and technical considerations under the Noisy Intermediate-Scale Quantum (NISQ) assumption. F…
▽ More
Quantum machine learning (QML) has emerged as an important area for Quantum applications, although useful QML applications would require many qubits. Therefore our paper is aimed at exploring the successful application of the Quantum Support Vector Machine (QSVM) algorithm while balancing several practical and technical considerations under the Noisy Intermediate-Scale Quantum (NISQ) assumption. For the quantum SVM under NISQ, we use quantum feature maps to translate data into quantum states and build the SVM kernel out of these quantum states, and further compare with classical SVM with radial basis function (RBF) kernels. As data sets are more complex or abstracted in some sense, classical SVM with classical kernels leads to less accuracy compared to QSVM, as classical SVM with typical classical kernels cannot easily separate different class data. Similarly, QSVM should be able to provide competitive performance over a broader range of data sets including ``simpler'' data cases in which smoother decision boundaries are required to avoid any model variance issues (i.e., overfitting). To bridge the gap between ``classical-looking'' decision boundaries and complex quantum decision boundaries, we propose to utilize general shallow unitary transformations to create feature maps with rotation factors to define a tunable quantum kernel, and added regularization to smooth the separating hyperplane model. We show in experiments that this allows QSVM to perform equally to SVM regardless of the complexity of the data sets and outperform in some commonly used reference data sets.
△ Less
Submitted 14 December, 2020;
originally announced December 2020.
-
Machine learning based co-creative design framework
Authors:
Brian Quanz,
Wei Sun,
Ajay Deshpande,
Dhruv Shah,
Jae-eun Park
Abstract:
We propose a flexible, co-creative framework bringing together multiple machine learning techniques to assist human users to efficiently produce effective creative designs. We demonstrate its potential with a perfume bottle design case study, including human evaluation and quantitative and qualitative analyses.
We propose a flexible, co-creative framework bringing together multiple machine learning techniques to assist human users to efficiently produce effective creative designs. We demonstrate its potential with a perfume bottle design case study, including human evaluation and quantitative and qualitative analyses.
△ Less
Submitted 23 January, 2020;
originally announced January 2020.
-
Toward A Neuro-inspired Creative Decoder
Authors:
Payel Das,
Brian Quanz,
Pin-Yu Chen,
Jae-wook Ahn,
Dhruv Shah
Abstract:
Creativity, a process that generates novel and meaningful ideas, involves increased association between task-positive (control) and task-negative (default) networks in the human brain. Inspired by this seminal finding, in this study we propose a creative decoder within a deep generative framework, which involves direct modulation of the neuronal activation pattern after sampling from the learned l…
▽ More
Creativity, a process that generates novel and meaningful ideas, involves increased association between task-positive (control) and task-negative (default) networks in the human brain. Inspired by this seminal finding, in this study we propose a creative decoder within a deep generative framework, which involves direct modulation of the neuronal activation pattern after sampling from the learned latent space. The proposed approach is fully unsupervised and can be used off-the-shelf. Several novelty metrics and human evaluation were used to evaluate the creative capacity of the deep decoder. Our experiments on different image datasets (MNIST, FMNIST, MNIST+FMNIST, WikiArt and CelebA) reveal that atypical co-activation of highly activated and weakly activated neurons in a deep decoder promotes generation of novel and meaningful artifacts.
△ Less
Submitted 22 April, 2020; v1 submitted 6 February, 2019;
originally announced February 2019.