-
Cauchy-Schwarz Regularized Autoencoder
Authors:
Linh Tran,
Maja Pantic,
Marc Peter Deisenroth
Abstract:
Recent work in unsupervised learning has focused on efficient inference and learning in latent variable models. Training these models by maximizing the evidence (marginal likelihood) is typically intractable. Thus, a common approximation is to maximize the Evidence Lower BOund (ELBO) instead. Variational autoencoders (VAE) are a powerful and widely-used class of generative models that optimize the ELBO efficiently for large datasets. However, the VAE's default Gaussian choice for the prior imposes a strong constraint on its ability to represent the true posterior, thereby degrading overall performance. A Gaussian mixture model (GMM) would be a richer prior, but cannot be handled efficiently within the VAE framework because of the intractability of the Kullback-Leibler divergence for GMMs. We deviate from the common VAE framework in favor of one with an analytical solution for a Gaussian mixture prior. To perform efficient inference for GMM priors, we introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs. This new objective allows us to incorporate richer, multi-modal priors into the autoencoding framework. We provide empirical studies on a range of datasets and show that our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
Submitted 12 February, 2021; v1 submitted 6 January, 2021;
originally announced January 2021.
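The abstract above notes that the Cauchy-Schwarz divergence can be computed analytically for GMMs. A minimal sketch of why, assuming the standard definition D_CS(p, q) = -log ∫pq + ½ log ∫p² + ½ log ∫q² and the Gaussian product identity ∫ N(x; m, S) N(x; n, T) dx = N(m; n, S + T). The function names and the (weights, means, covariances) tuple layout are hypothetical choices for illustration, not the authors' implementation or training objective.

```python
import numpy as np

def gauss_pdf(x, mean, cov):
    """Density of N(mean, cov) evaluated at x (1-D arrays, full covariance)."""
    d = x.shape[0]
    diff = x - mean
    sol = np.linalg.solve(cov, diff)
    logdet = np.linalg.slogdet(cov)[1]
    return np.exp(-0.5 * (d * np.log(2 * np.pi) + logdet + diff @ sol))

def cross_term(w1, mu1, cov1, w2, mu2, cov2):
    """Closed-form integral of the product of two Gaussian mixtures."""
    total = 0.0
    for a, m, S in zip(w1, mu1, cov1):
        for b, n, T in zip(w2, mu2, cov2):
            # Integral of N(x; m, S) * N(x; n, T) over x equals N(m; n, S + T).
            total += a * b * gauss_pdf(m, n, S + T)
    return total

def cs_divergence(p, q):
    """Cauchy-Schwarz divergence between mixtures p and q.

    Each mixture is a hypothetical (weights, means, covariances) tuple.
    """
    pq = cross_term(*p, *q)
    pp = cross_term(*p, *p)
    qq = cross_term(*q, *q)
    return -np.log(pq) + 0.5 * np.log(pp) + 0.5 * np.log(qq)
```

For example, with `p` a two-component mixture and `q` a single Gaussian, `cs_divergence(p, q)` is positive and symmetric in its arguments, while `cs_divergence(p, p)` is zero up to rounding.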
-
Dynamic Federated Learning-Based Economic Framework for Internet-of-Vehicles
Authors:
Yuris Mulya Saputra,
Dinh Thai Hoang,
Diep N. Nguyen,
Le-Nam Tran,
Shimin Gong,
Eryk Dutkiewicz
Abstract:
Federated learning (FL) can empower Internet-of-Vehicles (IoV) networks by leveraging smart vehicles (SVs) to participate in the learning process with minimum data exchanges and privacy disclosure. The collected data and learned knowledge can help the vehicular service provider (VSP) improve the global model accuracy, e.g., for road safety as well as better profits for both VSP and participating SVs. Nonetheless, there exist major challenges when implementing the FL in IoV networks, such as dynamic activities and diverse quality-of-information (QoI) from a large number of SVs, VSP's limited payment budget, and profit competition among SVs. In this paper, we propose a novel dynamic FL-based economic framework for an IoV network to address these challenges. Specifically, the VSP first implements an SV selection method to determine a set of the best SVs for the FL process according to the significance of their current locations and information history at each learning round. Then, each selected SV can collect on-road information and offer a payment contract to the VSP based on its collected QoI. For that, we develop a multi-principal one-agent contract-based policy to maximize the profits of the VSP and learning SVs under the VSP's limited payment budget and asymmetric information between the VSP and SVs. Through experimental results using real-world on-road datasets, we show that our framework can converge 57% faster (even with only 10% of active SVs in the network) and obtain much higher social welfare of the network (up to 27.2 times) compared with those of other baseline FL methods.
Submitted 11 March, 2021; v1 submitted 1 January, 2021;
originally announced January 2021.
-
Learning from What We Know: How to Perform Vulnerability Prediction using Noisy Historical Data
Authors:
Aayush Garg,
Renzo Degiovanni,
Matthieu Jimenez,
Maxime Cordy,
Mike Papadakis,
Yves Le Traon
Abstract:
Vulnerability prediction refers to the problem of identifying system components that are most likely to be vulnerable. Typically, this problem is tackled by training binary classifiers on historical data. Unfortunately, recent research has shown that such approaches underperform due to the following two reasons: a) the imbalanced nature of the problem, and b) the inherently noisy historical data, i.e., most vulnerabilities are discovered much later than they are introduced. This misleads classifiers as they learn to recognize actual vulnerable components as non-vulnerable. To tackle these issues, we propose TROVON, a technique that learns from known vulnerable components rather than from vulnerable and non-vulnerable components, as typically performed. We perform this by contrasting the known vulnerable, and their respective fixed components. This way, TROVON manages to learn from the things we know, i.e., vulnerabilities, hence reducing the effects of noisy and unbalanced data. We evaluate TROVON by comparing it with existing techniques on three security-critical open source systems, i.e., Linux Kernel, OpenSSL, and Wireshark, with historical vulnerabilities that have been reported in the National Vulnerability Database (NVD). Our evaluation demonstrates that the prediction capability of TROVON significantly outperforms existing vulnerability prediction techniques such as Software Metrics, Imports, Function Calls, Text Mining, Devign, LSTM, and LSTM-RF with an improvement of 40.84% in Matthews Correlation Coefficient (MCC) score under Clean Training Data Settings, and an improvement of 35.52% under Realistic Training Data Settings.
Submitted 25 July, 2022; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Influence-Driven Data Poisoning in Graph-Based Semi-Supervised Classifiers
Authors:
Adriano Franci,
Maxime Cordy,
Martin Gubri,
Mike Papadakis,
Yves Le Traon
Abstract:
Graph-based Semi-Supervised Learning (GSSL) is a practical solution to learn from a limited amount of labelled data together with a vast amount of unlabelled data. However, due to their reliance on the known labels to infer the unknown labels, these algorithms are sensitive to data quality. It is therefore essential to study the potential threats related to the labelled data, more specifically, label poisoning. In this paper, we propose a novel data poisoning method which efficiently approximates the result of label inference to identify the inputs which, if poisoned, would produce the highest number of incorrectly inferred labels. We extensively evaluate our approach on three classification problems under 24 different experimental settings each. Compared to the state of the art, our influence-driven attack produces an average increase in error rate that is 50% higher, while being faster by multiple orders of magnitude. Moreover, our method can inform engineers of inputs that deserve investigation (relabelling them) before training the learning model. We show that relabelling one-third of the poisoned inputs (selected based on their influence) reduces the poisoning effect by 50%.
Submitted 11 May, 2022; v1 submitted 14 December, 2020;
originally announced December 2020.
-
IBIR: Bug Report driven Fault Injection
Authors:
Ahmed Khanfir,
Anil Koyuncu,
Mike Papadakis,
Maxime Cordy,
Tegawendé F. Bissyandé,
Jacques Klein,
Yves Le Traon
Abstract:
Much research on software engineering and software testing relies on experimental studies based on fault injection. Fault injection, however, is not often relevant to emulate real-world software faults since it "blindly" injects large numbers of faults. It remains indeed challenging to inject few but realistic faults that target a particular functionality in a program. In this work, we introduce IBIR, a fault injection tool that addresses this challenge by exploring change patterns associated with user-reported faults. To inject realistic faults, we create mutants by retargeting a bug report driven automated program repair system, i.e., reversing its code transformation templates. IBIR is further appealing in practice since it requires deep knowledge of neither the code nor the tests, but just of the program's relevant bug reports. Thus, our approach focuses the fault injection on the feature targeted by the bug report. We assess IBIR by considering the Defects4J dataset. Experimental results show that our approach outperforms the fault injection performed by traditional mutation testing in terms of semantic similarity with the original bug, when applied at either system or class levels of granularity, and provides better, statistically significant, estimations of test effectiveness (fault detection). Additionally, when injecting 100 faults, IBIR injects faults that couple with the real ones in 36% of the cases, while mutants from mutation testing do so in less than 1% of the cases. Overall, IBIR targets real functionality and injects realistic and diverse faults.
Submitted 11 December, 2020;
originally announced December 2020.
-
On the Secrecy Capacity of MIMO Wiretap Channels: Convex Reformulation and Efficient Numerical Methods
Authors:
Anshu Mukherjee,
Björn Ottersten,
Le-Nam Tran
Abstract:
This paper presents novel numerical approaches to finding the secrecy capacity of the multiple-input multiple-output (MIMO) wiretap channel subject to multiple linear transmit covariance constraints, including sum power constraint, per antenna power constraints and interference power constraint. An analytical solution to this problem is not known and existing numerical solutions suffer from slow convergence rate and/or high per-iteration complexity. Deriving computationally efficient solutions to the secrecy capacity problem is challenging since the secrecy rate is expressed as a difference of convex functions (DC) of the transmit covariance matrix, for which its convexity is only known for some special cases. In this paper we propose two low-complexity methods to compute the secrecy capacity along with a convex reformulation for degraded channels. In the first method we capitalize on the accelerated DC algorithm which requires solving a sequence of convex subproblems, for which we propose an efficient iterative algorithm where each iteration admits a closed-form solution. In the second method, we rely on the concave-convex equivalent reformulation of the secrecy capacity problem which allows us to derive the so-called partial best response algorithm to obtain an optimal solution. Notably, each iteration of the second method can also be done in closed form. The simulation results demonstrate a faster convergence rate of our methods compared to other known solutions. We carry out extensive numerical experiments to evaluate the impact of various parameters on the achieved secrecy capacity.
Submitted 8 July, 2021; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Optimization of RIS-aided MIMO Systems via the Cutoff Rate
Authors:
Nemanja Stefan Perović,
Le-Nam Tran,
Marco Di Renzo,
Mark F. Flanagan
Abstract:
The main difficulty concerning optimizing the mutual information (MI) in reconfigurable intelligent surface (RIS)-aided communication systems with discrete signaling is the inability to formulate this optimization problem in an analytically tractable manner. Therefore, we propose to use the cutoff rate (CR) as a more tractable metric for optimizing the MI and introduce two optimization methods to maximize the CR, assuming perfect knowledge of the channel state information (CSI). The first method is based on the projected gradient method (PGM), while the second method is derived from the principles of successive convex approximation (SCA). Simulation results show that the proposed optimization methods significantly enhance the CR and the corresponding MI.
Submitted 3 September, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
FlexiRepair: Transparent Program Repair with Generic Patches
Authors:
Anil Koyuncu,
Tegawendé F. Bissyandé,
Jacques Klein,
Yves Le Traon
Abstract:
Template-based program repair research is in need of a common ground to express fix patterns in a standard and reusable manner. We propose to build on the concept of generic patch (also known as semantic patch), which is widely used in the Linux community to automate code evolution. We advocate that generic patches could provide at the same time a unified representation and a specification for fix patterns. Generic patches are indeed formally defined, and there exists a robust, industry-adapted, and extensible engine that processes generic patches to perform control-flow code matching and automatically generates concrete patches based on the specified change operations. In this paper, we present the design and implementation of a repair framework, FLEXIREPAIR, that explores generic patches as the core concept. In particular, we show how concretely generic patches can be inferred and applied in a pipeline of Automated Program Repair (APR). With FLEXIREPAIR, we address an urgent challenge in the template-based APR community to separate implementation details from actual scientific contributions by providing an open, transparent and flexible repair pipeline on top of which all advancements in terms of efficiency, efficacy and usability can be measured and assessed rigorously. Furthermore, because the underlying tools and concepts have already been accepted by a wide practitioner community, we expect FLEXIREPAIR's adoption by industry to be facilitated. Preliminary experiments with a prototype FLEXIREPAIR on the IntroClass and CodeFlaws benchmarks suggest that it already constitutes a solid baseline with comparable performance to some of the state of the art.
Submitted 26 November, 2020;
originally announced November 2020.
-
A Low-Complexity Approach for Max-Min Fairness in Uplink Cell-Free Massive MIMO
Authors:
Muhammad Farooq,
Hien Quoc Ngo,
Le Nam Tran
Abstract:
We consider the problem of max-min fairness for uplink cell-free massive multiple-input multiple-output, which is a potential technology for beyond 5G networks. More specifically, we aim to maximize the minimum spectral efficiency of all users subject to the per-user power constraint, assuming a linear receive combining technique at access points. The considered problem can be further divided into two subproblems: the receiver filter coefficient design and the power control problem. While the receiver coefficient design turns out to be a generalized eigenvalue problem, and thus admits a closed-form solution, the power control problem is numerically troublesome. To solve the power control problem, existing approaches rely on geometric programming (GP), which is not suitable for large-scale systems. To overcome the high-complexity issue of the GP method, we first reformulate the power control problem into a convex program, and then apply a smoothing technique in combination with an accelerated projected gradient method to solve it. The simulation results demonstrate that the proposed solution can achieve almost the same objective but in much less time than the existing GP-based method.
Submitted 10 November, 2020;
originally announced November 2020.
-
Efficient and Transferable Adversarial Examples from Bayesian Neural Networks
Authors:
Martin Gubri,
Maxime Cordy,
Mike Papadakis,
Yves Le Traon,
Koushik Sen
Abstract:
An established way to improve the transferability of black-box evasion attacks is to craft the adversarial examples on an ensemble-based surrogate to increase diversity. We argue that transferability is fundamentally related to uncertainty. Based on a state-of-the-art Bayesian Deep Learning technique, we propose a new method to efficiently build a surrogate by sampling approximately from the posterior distribution of neural network weights, which represents the belief about the value of each parameter. Our extensive experiments on ImageNet, CIFAR-10 and MNIST show that our approach improves the success rates of four state-of-the-art attacks significantly (up to 83.2 percentage points), in both intra-architecture and inter-architecture transferability. On ImageNet, our approach can reach 94% of success rate while reducing training computations from 11.6 to 2.4 exaflops, compared to an ensemble of independently trained DNNs. Our vanilla surrogate achieves 87.5% of the time higher transferability than three test-time techniques designed for this purpose. Our work demonstrates that the way to train a surrogate has been overlooked, although it is an important element of transfer-based attacks. We are, therefore, the first to review the effectiveness of several training methods in increasing transferability. We provide new directions to better understand the transferability phenomenon and offer a simple but strong baseline for future work.
Submitted 18 June, 2022; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Utility Maximization for Large-Scale Cell-Free Massive MIMO Downlink
Authors:
Muhammad Farooq,
Hien Quoc Ngo,
Een-Kee Hong,
Le-Nam Tran
Abstract:
We consider the system-wide utility maximization problem in the downlink of a cell-free massive multiple-input multiple-output (MIMO) system whereby a very large number of access points (APs) simultaneously serve a group of users. Specifically, four fundamental problems with increasing order of user fairness are of interest: (i) to maximize the average spectral efficiency (SE), (ii) to maximize the proportional fairness, (iii) to maximize the harmonic-rate of all users, and lastly (iv) to maximize the minimum SE of all users, subject to a sum power constraint at each AP. As the considered problems are non-convex, existing solutions normally rely on successive convex approximation to find a sub-optimal solution. More specifically, these known methods use off-the-shelf convex solvers, which basically implement an interior-point algorithm, to solve the derived convex problems. The main issue of such methods is that their complexity does not scale favorably with the problem size, limiting previous studies to cell-free massive MIMO of moderate scales. Thus, the potential of cell-free massive MIMO has not been fully understood. To address this issue, we propose a unified framework based on an accelerated projected gradient method to solve the considered problems. Particularly, the proposed solution is found in closed-form expressions and only requires the first order oracle of the objective, rather than the Hessian matrix as in known solutions, and thus is much more memory efficient. Numerical results demonstrate that our proposed solution achieves the same utility performance but with far less run-time, compared to other second-order methods. Simulation results for large-scale cell-free massive MIMO show that the four utility functions can deliver nearly uniform service to all users. In other words, user fairness is not a great concern in large-scale cell-free massive MIMO.
Submitted 20 September, 2020; v1 submitted 15 September, 2020;
originally announced September 2020.
-
Achievable Rate Optimization for MIMO Systems with Reconfigurable Intelligent Surfaces
Authors:
Nemanja Stefan Perović,
Le-Nam Tran,
Marco Di Renzo,
Mark F. Flanagan
Abstract:
Reconfigurable intelligent surfaces (RISs) represent a radical new technology that can shape the radio wave propagation in wireless communication systems and offers a great variety of possible performance and implementation gains. Motivated by this, in this paper we study the achievable rate optimization for a multi-stream multiple-input multiple-output (MIMO) system equipped with an RIS, and formulate a joint optimization problem of the covariance matrix of the transmitted signal and the RIS elements. To solve this problem, we propose an iterative optimization algorithm that is based on the projected gradient method (PGM). We derive the step size that guarantees the convergence of the proposed algorithm and we define a backtracking line search to improve its convergence rate. Furthermore, we introduce the total free space path loss (FSPL) ratio of the indirect and direct links as a first-order measure of the applicability of an RIS in the considered communication system. Simulation results show that the proposed PGM achieves the same achievable rate as a state-of-the-art benchmark scheme, but with a significantly lower computational complexity. In addition, it is demonstrated that the RIS application is particularly suitable to increase the achievable rate in an indoor environment, as in this case even a small number of RIS elements is sufficient to provide a substantial achievable rate gain.
Submitted 3 September, 2021; v1 submitted 21 August, 2020;
originally announced August 2020.
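The abstract above describes a projected gradient method (PGM) with a backtracking line search. A minimal generic sketch of that technique on a toy problem, assuming a Euclidean-ball feasible set as a stand-in for a power constraint. It is not the paper's algorithm, which jointly optimizes a transmit covariance matrix and the RIS elements. All function and parameter names are hypothetical.

```python
import numpy as np

def project_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def projected_gradient_ascent(f, grad, x0, radius, t0=1.0, shrink=0.5,
                              iters=200, tol=1e-10):
    """Maximize f over the ball by projected gradient steps with backtracking."""
    x = project_ball(np.asarray(x0, dtype=float), radius)
    for _ in range(iters):
        g = grad(x)
        t = t0
        # Shrink the step until the usual sufficient-increase condition holds;
        # the loop is capped so rounding effects cannot stall it.
        for _ in range(50):
            x_new = project_ball(x + t * g, radius)
            step = x_new - x
            if f(x_new) >= f(x) + g @ step - (step @ step) / (2 * t):
                break
            t *= shrink
        x = x_new
        if np.linalg.norm(step) < tol:
            break
    return x
```

For instance, maximizing `f(x) = -||x - c||²` with `c = [3, 4]` over the unit ball should return a point close to `[0.6, 0.8]`, the projection of `c` onto the ball.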
-
Directed hypergraph neural network
Authors:
Loc Hoang Tran,
Linh Hoang Tran
Abstract:
To deal with irregular data structures, graph convolutional neural networks have been developed by many data scientists. However, data scientists have concentrated primarily on developing deep neural network methods for undirected graphs. In this paper, we present a novel neural network method for directed hypergraphs. In other words, we develop not only a novel directed hypergraph neural network method but also a novel directed hypergraph based semi-supervised learning method. These methods are employed to solve the node classification task. The two datasets used in the experiments are the Cora and CiteSeer datasets. Among the classic directed graph based semi-supervised learning method, the novel directed hypergraph based semi-supervised learning method, and the novel directed hypergraph neural network method used to solve this node classification task, we find that the novel directed hypergraph neural network achieves the highest accuracies.
Submitted 3 September, 2022; v1 submitted 8 August, 2020;
originally announced August 2020.
-
On the Efficiency of Test Suite based Program Repair: A Systematic Assessment of 16 Automated Repair Systems for Java Programs
Authors:
Kui Liu,
Shangwen Wang,
Anil Koyuncu,
Kisub Kim,
Tegawendé F. Bissyandé,
Dongsun Kim,
Peng Wu,
Jacques Klein,
Xiaoguang Mao,
Yves Le Traon
Abstract:
Test-based automated program repair has been a prolific field of research in software engineering in the last decade. Many approaches have indeed been proposed, which leverage test suites as a weak, but affordable, approximation to program specifications. Although the literature regularly sets new records on the number of benchmark bugs that can be fixed, several studies increasingly raise concerns about the limitations and biases of state-of-the-art approaches. For example, the correctness of generated patches has been questioned in a number of studies, while other researchers pointed out that evaluation schemes may be misleading with respect to the processing of fault localization results. Nevertheless, there is little work addressing the efficiency of patch generation, with regard to the practicality of program repair. In this paper, we fill this gap in the literature by providing an extensive review of the efficiency of test suite based program repair. Our objective is to assess the number of generated patch candidates, since this information is correlated to (1) the strategy to traverse the search space efficiently in order to select sensical repair attempts, (2) the strategy to minimize the test effort for identifying a plausible patch, and (3) the strategy to prioritize the generation of a correct patch. To that end, we perform a large-scale empirical study on the efficiency, in terms of the quantity of generated patch candidates, of 16 open-source repair tools for Java programs. The experiments are carefully conducted under the same fault localization configurations to limit biases.
Submitted 3 August, 2020;
originally announced August 2020.
-
Improving excited state potential energy surfaces via optimal orbital shapes
Authors:
Lan Nguyen Tran,
Eric Neuscamman
Abstract:
We demonstrate that, rather than resorting to high-cost dynamic correlation methods, qualitative failures in excited-state potential energy surface predictions can often be remedied at no additional cost by ensuring that optimal molecular orbitals are used for each individual excited state. This approach also avoids the weighting choices required by state-averaging and dynamic weighting and obviates their need for expensive wave function response calculations when relaxing excited state geometries. Although multi-state approaches are of course preferred near conical intersections, other features of excited-state potential energy surfaces can benefit significantly from our single state approach. In three different systems, including a double bond dissociation, a biologically relevant amino hydrogen dissociation, and an amino-to-ring intramolecular charge transfer, we show that state-specific orbitals offer qualitative improvements over the state-averaged status quo.
Submitted 16 June, 2020;
originally announced June 2020.
-
Data-driven Simulation and Optimization for Covid-19 Exit Strategies
Authors:
Salah Ghamizi,
Renaud Rwemalika,
Lisa Veiber,
Maxime Cordy,
Tegawende F. Bissyande,
Mike Papadakis,
Jacques Klein,
Yves Le Traon
Abstract:
The rapid spread of the Coronavirus SARS-2 is a major challenge that led almost all governments worldwide to take drastic measures to respond to the tragedy. Chief among those measures is the massive lockdown of entire countries and cities, which beyond its global economic impact has created some deep social and psychological tensions within populations. While the adopted mitigation measures (including the lockdown) have generally proven useful, policymakers are now facing a critical question: how and when to lift the mitigation measures? A carefully-planned exit strategy is indeed necessary to recover from the pandemic without risking a new outbreak. Classically, exit strategies rely on mathematical modeling to predict the effect of public health interventions. Such models are unfortunately known to be sensitive to some key parameters, which are usually set based on rules-of-thumb. In this paper, we propose to augment epidemiological forecasting with actual data-driven models that will learn to fine-tune predictions for different contexts (e.g., per country). We have therefore built a pandemic simulation and forecasting toolkit that combines a deep learning estimation of the epidemiological parameters of the disease in order to predict the cases and deaths, and a genetic algorithm component searching for optimal trade-offs/policies between constraints and objectives set by decision-makers. Replaying pandemic evolution in various countries, we experimentally show that our approach yields predictions with much lower error rates than pure epidemiological models in 75% of the cases and achieves a 95% R2 score when the learning is transferred and tested on unseen countries. When used for forecasting, this approach provides actionable insights into the impact of individual measures and strategies.
Submitted 12 June, 2020;
originally announced June 2020.
-
Noncoherent Joint Transmission Beamforming for Dense Small Cell Networks: Global Optimality, Efficient Solution and Distributed Implementation
Authors:
Quang-Doanh Vu,
Le-Nam Tran,
Markku Juntti
Abstract:
We investigate the coordinated multi-point noncoherent joint transmission (JT) in dense small cell networks. The goal is to design beamforming vectors for macro cell and small cell base stations (BSs) such that the weighted sum rate of the system is maximized, subject to a total transmit power constraint at individual BSs. The optimization problem is inherently nonconvex and intractable, making it difficult to explore the full potential performance of the scheme. To this end, we first propose an algorithm to find a globally optimal solution based on the generic monotonic branch-reduce-and-bound optimization framework. Then, for a more computationally efficient method, we adopt the inner approximation (InAp) technique to efficiently derive a locally optimal solution, which is numerically shown to achieve near-optimal performance. In addition, for decentralized networks such as those comprising multi-access edge computing servers, we develop an algorithm based on the alternating direction method of multipliers, which distributively implements the InAp-based solution. Our main conclusion is that the noncoherent JT is a promising transmission scheme for dense small cell networks, since it can exploit the densification gain, outperforms the coordinated beamforming, and is amenable to distributed implementation.
Submitted 25 May, 2020;
originally announced May 2020.
-
XACs-DyPol: Towards an XACML-based Access Control Model for Dynamic Security Policy
Authors:
Tran Khanh Dang,
Xuan Son Ha,
Luong Khiem Tran
Abstract:
Authorization and access control play an essential role in protecting sensitive information from malicious users. The system relies on security policies to determine whether an access request is allowed. However, of late, the growing popularity of big data has created new challenges for security policy management, such as dynamic policies and policy updates at run time. Applications of dynamic policies have brought many benefits to modern domains. To the best of our knowledge, there are no previous studies focusing on solving authorization problems in dynamic policy environments. In this article, we focus on analyzing and classifying when a policy update occurs, and provide a pragmatic solution for such dynamic policies. The contribution of this work is twofold: a novel solution for managing policy changes even when the access request has been granted, and an XACML-based implementation to empirically evaluate the proposed solution. The experimental results compare the newly introduced XACs-DyPol framework with Balana (an open source framework supporting XACML 3.0). The datasets are XACML 3.0-based policies, including three samples of real-world policy sets. According to the comparison results, our XACs-DyPol framework performs better than Balana for all updates in the dynamic security policy cases. In particular, our proposed solution outperforms Balana by an order of magnitude when the policy structure includes complex policy sets, policies, and rules, or complicated comparison expressions containing greater-than and less-than functions.
Submitted 10 April, 2020;
originally announced May 2020.
-
The Medical Scribe: Corpus Development and Model Performance Analyses
Authors:
Izhak Shafran,
Nan Du,
Linh Tran,
Amanda Perry,
Lauren Keyes,
Mark Knichel,
Ashley Domin,
Lei Huang,
Yuhui Chen,
Gang Li,
Mingqiu Wang,
Laurent El Shafey,
Hagen Soltau,
Justin S. Paul
Abstract:
There is a growing interest in creating tools to assist in clinical note generation using the audio of provider-patient encounters. Motivated by this goal and with the help of providers and medical scribes, we developed an annotation scheme to extract relevant clinical concepts. We used this annotation scheme to label a corpus of about 6k clinical encounters. This was used to train a state-of-the-art tagging model. We report ontologies, labeling results, model performances, and detailed analyses of the results. Our results show that the entities related to medications can be extracted with a relatively high accuracy of 0.90 F-score, followed by symptoms at 0.72 F-score, and conditions at 0.57 F-score. In our task, we not only identify where the symptoms are mentioned but also map them to canonical forms as they appear in the clinical notes. Of the different types of errors, in about 19-38% of the cases, we find that the model output was correct, and about 17-32% of the errors do not impact the clinical note. Taken together, the models developed in this work are more useful than the F-scores reflect, making it a promising approach for practical applications.
Submitted 11 March, 2020;
originally announced March 2020.
-
FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Authors:
Tai Vu,
Leon Tran
Abstract:
Reinforcement learning is one of the most popular approaches for automated game playing. This method allows an agent to estimate the expected utility of its state in order to make optimal actions in an unknown environment. We seek to apply reinforcement learning algorithms to the game Flappy Bird. We implement SARSA and Q-Learning with some modifications such as $ε$-greedy policy, discretization and backward updates. We find that SARSA and Q-Learning outperform the baseline, regularly achieving scores of 1400+, with the highest in-game score of 2069.
Submitted 8 April, 2020; v1 submitted 21 March, 2020;
originally announced March 2020.
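The abstract above names SARSA and Q-Learning with an $ε$-greedy policy. As a rough illustration only, the textbook tabular forms of these updates can be sketched as follows. This is not the authors' implementation: the state and action labels, the dictionary-based Q-table, and the default values of `alpha` and `gamma` are assumptions, and the discretization and backward updates mentioned in the abstract are not shown.

```python
import random

def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def q_learning_update(q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (reward + gamma * best_next - old)

def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    old = q.get((s, a), 0.0)
    next_value = q.get((s_next, a_next), 0.0)
    q[(s, a)] = old + alpha * (reward + gamma * next_value - old)
```

The only difference between the two updates is the bootstrap term: Q-Learning uses the maximum over next actions, while SARSA uses the value of the action the $ε$-greedy policy actually selected.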
-
Undulation instabilities in cholesteric liquid crystals induced by anchoring transitions
Authors:
Maxim O. Lavrentovich,
Lisa Tran
Abstract:
Cholesteric liquid crystals (CLCs) have a characteristic length scale given by the pitch of the twisted stacking of their constituent rod-like molecules. Under homeotropic anchoring conditions where the molecules prefer to orient perpendicular to an interface, cholesteric interfaces exhibit striped phases with stripe widths commensurate with the pitch. Conversely, planar anchoring conditions have the molecules remain in the plane of the interface so that the CLC twists perpendicular to it. Recent work [L. Tran et al. Phys. Rev. X 7, 041029 (2017)] shows that varying the anchoring conditions dramatically rearranges the CLC stripe pattern, exchanging defects in the stripe pattern with defects in the molecular orientation of the liquid crystal molecules. We show with experiments and numerical simulations that the CLC stripes also undergo an undulation instability when we transition from homeotropic to planar anchoring conditions and vice versa. The undulation can be interpreted as a transient relaxation of the CLC resulting from a strain in the cholesteric layers due to a tilting pitch axis, with properties analogous to the classic Helfrich-Hurault instability. We focus on CLC shells in particular and show that the spherical topology of the shell also plays an important role in shaping the undulations.
Submitted 10 February, 2020;
originally announced February 2020.
-
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
Authors:
Jakub Swiatkowski,
Kevin Roth,
Bastiaan S. Veeling,
Linh Tran,
Joshua V. Dillon,
Jasper Snoek,
Stephan Mandt,
Tim Salimans,
Rodolphe Jenatton,
Sebastian Nowozin
Abstract:
Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational distribution to a more compact parameterization. For a variety of deep Bayesian neural networks trained using Gaussian mean-field variational inference, we find that the posterior standard deviations consistently exhibit strong low-rank structure after convergence. This means that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance. Furthermore, we find that such factorized parameterizations improve the signal-to-noise ratio of stochastic gradient estimates of the variational lower bound, resulting in faster convergence.
Submitted 5 July, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
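To make the compactness claim in the abstract above concrete, the following sketch shows how a rank-$k$ factorization of a layer's posterior standard deviations reduces the number of variational parameters. It is an illustration under assumptions, not the paper's code: the function names are hypothetical, the factors are taken to be positive so that every standard deviation stays positive, and the layer shape in the usage note is made up.

```python
def tied_stddevs(u, v):
    """Multiply u (m x k) by v (k x n) to get the m x n matrix of standard deviations."""
    k = len(v)
    n = len(v[0])
    return [[sum(row[r] * v[r][c] for r in range(k)) for c in range(n)]
            for row in u]

def parameter_counts(m, n, k):
    """Standard-deviation parameters: full mean-field (m*n) vs rank-k (k*(m+n))."""
    return m * n, k * (m + n)
```

For a hypothetical 512 x 256 weight matrix with k = 2, `parameter_counts(512, 256, 2)` gives 131072 standard-deviation parameters for the full mean-field posterior and 1536 for the factorized one.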
-
What You See is What it Means! Semantic Representation Learning of Code based on Visualization and Transfer Learning
Authors:
Patrick Keller,
Laura Plein,
Tegawendé F. Bissyandé,
Jacques Klein,
Yves Le Traon
Abstract:
Recent successes in training word embeddings for NLP tasks have encouraged a wave of research on representation learning for source code, which builds on similar NLP methods. The overall objective is then to produce code embeddings that capture the maximum of program semantics. State-of-the-art approaches invariably rely on a syntactic representation (i.e., raw lexical tokens, abstract syntax trees, or intermediate representation tokens) to generate embeddings, which are criticized in the literature as non-robust or non-generalizable. In this work, we investigate a novel embedding approach based on the intuition that source code has visual patterns of semantics. We further use these patterns to address the outstanding challenge of identifying semantic code clones. We propose the WYSIWIM ("What You See Is What It Means") approach where visual representations of source code are fed into powerful pre-trained image classification neural networks from the field of computer vision to benefit from the practical advantages of transfer learning. We evaluate the proposed embedding approach on two variations of the task of semantic code clone identification: code clone detection (a binary classification problem), and code classification (a multi-class classification problem). We show with experiments on the BigCloneBench (Java) and Open Judge (C) datasets that although simple, our WYSIWIM approach performs as effectively as state-of-the-art approaches such as ASTNN or TBCNN. We further explore the influence of different steps in our approach, such as the choice of visual representations or the classification algorithm, to eventually discuss the promises and limitations of this research direction.
Submitted 7 February, 2020;
originally announced February 2020.
-
How Good is the Bayes Posterior in Deep Neural Networks Really?
Authors:
Florian Wenzel,
Kevin Roth,
Bastiaan S. Veeling,
Jakub Świątkowski,
Linh Tran,
Stephan Mandt,
Jasper Snoek,
Tim Salimans,
Rodolphe Jenatton,
Sebastian Nowozin
Abstract:
During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as a heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: If the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.
Submitted 2 July, 2020; v1 submitted 6 February, 2020;
originally announced February 2020.
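For readers unfamiliar with the term in the abstract above, a "cold posterior" is commonly written as a tempered distribution. The notation below is an assumption for illustration (the abstract itself gives no formula): $\theta$ denotes the network weights, $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{n}$ the data, $U$ the posterior energy, and $T$ the temperature.

$$
p_T(\theta \mid \mathcal{D}) \propto \exp\!\left(-\frac{U(\theta)}{T}\right),
\qquad
U(\theta) = -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) - \log p(\theta).
$$

Setting $T = 1$ recovers the Bayes posterior, while $T < 1$ raises the likelihood (and here also the prior) to the power $1/T > 1$, which is one way to read the statement that a cold posterior overcounts evidence.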
-
Learning to Catch Security Patches
Authors:
Arthur D. Sawadogo,
Tegawendé F. Bissyandé,
Naouel Moha,
Kevin Allix,
Jacques Klein,
Li Li,
Yves Le Traon
Abstract:
Timely patching is paramount to safeguard users and maintainers against dire consequences of malicious attacks. In practice, patching is prioritized following the nature of the code change that is committed in the code repository. When such a change is labeled as being security-relevant, i.e., as fixing a vulnerability, maintainers rapidly spread the change and users are notified about the need to update to a new version of the library or of the application. Unfortunately, oftentimes, some security-relevant changes go unnoticed as they represent silent fixes of vulnerabilities. In this paper, we propose a Co-Training-based approach to catch security patches as part of an automatic monitoring service of code repositories. Leveraging different classes of features, we empirically show that such automation is feasible and can yield a precision of over 90% in identifying security patches, with an unprecedented recall of over 80%. Beyond such a benchmarking with ground truth data which demonstrates an improvement over the state-of-the-art, we confirmed that our approach can help catch security patches that were not reported as such.
Submitted 24 January, 2020;
originally announced January 2020.
-
Hydra: Preserving Ensemble Diversity for Model Distillation
Authors:
Linh Tran,
Bastiaan S. Veeling,
Kevin Roth,
Jakub Swiatkowski,
Joshua V. Dillon,
Jasper Snoek,
Stephan Mandt,
Tim Salimans,
Sebastian Nowozin,
Rodolphe Jenatton
Abstract:
Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling ensembles into a single compact model, reducing the computational and memory burden of the ensemble while trying to preserve its predictive behavior. Most existing distillation formulations summarize the ensemble by capturing its average predictions. As a result, the diversity of the ensemble predictions, stemming from each member, is lost. Thus, the distilled model cannot provide a measure of uncertainty comparable to that of the original ensemble. To retain more faithfully the diversity of the ensemble, we propose a distillation method based on a single multi-headed neural network, which we refer to as Hydra. The shared body network learns a joint feature representation that enables each head to capture the predictive behavior of each ensemble member. We demonstrate that with a slight increase in parameter count, Hydra improves distillation performance on classification and regression settings while capturing the uncertainty behavior of the original ensemble over both in-domain and out-of-distribution tasks.
Submitted 19 March, 2021; v1 submitted 14 January, 2020;
originally announced January 2020.
-
Killing Stubborn Mutants with Symbolic Execution
Authors:
Thierry Titcheu Chekam,
Mike Papadakis,
Maxime Cordy,
Yves Le Traon
Abstract:
We introduce SeMu, a Dynamic Symbolic Execution technique that generates test inputs capable of killing stubborn mutants (killable mutants that remain undetected after a reasonable amount of testing). SeMu aims at mutant propagation (triggering erroneous states to the program output) by incrementally searching for divergent program behaviours between the original and the mutant versions. We model the mutant killing problem as a symbolic execution search within a specific area in the programs' symbolic tree. In this framework, the search area is defined and controlled by parameters that allow scalable and cost-effective mutant killing. We integrated SeMu into KLEE and experimented with Coreutils (a benchmark frequently used in symbolic execution studies). Our results show that our modelling plays an important role in mutant killing. Perhaps more importantly, our results also show that, within a two-hour time limit, SeMu kills 37% of the stubborn mutants, where KLEE kills none and where the mutant infection strategy (strategy suggested by previous research) kills 17%.
Submitted 9 January, 2020;
originally announced January 2020.
-
Swelling cholesteric liquid crystal shells to direct colloids at the interface
Authors:
Lisa Tran,
Kyle J. M. Bishop
Abstract:
Cholesteric liquid crystals can exhibit spatial patterns in molecular alignment at interfaces that can be exploited for particle assembly. These patterns emerge from the competition between bulk and surface energies, tunable with the system geometry. In this work, we use the osmotic swelling of cholesteric double emulsions to assemble colloidal particles through a pathway-dependent process. Particles can be repositioned from a surface-mediated to an elasticity-mediated state through dynamically thinning the cholesteric shell at a rate comparable to that of colloidal adsorption. By tuning the balance between surface and bulk energies with the system geometry, colloidal assemblies on the cholesteric interface can be molded by the underlying elastic field to form linear aggregates. The transition of adsorbed particles from surface regions with homeotropic anchoring to defect regions is accompanied by a reduction in particle mobility. The arrested assemblies subsequently map out and stabilize topological defects. These results demonstrate the kinetic arrest of interfacial particles within definable patterns by regulating the energetic frustration within cholesterics. This work highlights the importance of kinetic pathways for particle assembly in liquid crystals, of relevance to optical and energy applications.
Submitted 4 January, 2020;
originally announced January 2020.
-
Excitons bound by photon exchange
Authors:
Erika Cortese,
Linh Tran,
Jean-Michel Manceau,
Adel Bousseksou,
Iacopo Carusotto,
Giorgio Biasiol,
Raffaele Colombelli,
Simone De Liberato
Abstract:
In contrast to interband excitons in undoped quantum wells, doped quantum wells do not display sharp resonances due to excitonic bound states. In these systems the effective Coulomb interaction between electrons and holes typically only leads to a depolarization shift of the single-electron intersubband transitions. Non-perturbative light-matter interaction in solid-state devices has been investigated as a pathway to tune optoelectronic properties of materials. A recent theoretical work [Cortese et al., Optica 6, 354 (2019)] predicted that, when the doped quantum wells are embedded in a photonic cavity, emission-reabsorption processes of cavity photons can generate an effective attractive interaction which binds electrons and holes together, leading to the creation of an intraband bound exciton. Spectroscopically, this bound state manifests itself as a novel discrete resonance which appears below the ionisation threshold only when the coupling between light and matter is increased above a critical value. Here we report the first experimental observation of such a bound state using doped GaAs/AlGaAs quantum wells embedded in metal-metal resonators whose confinement is high enough to permit operation in strong coupling. Our result provides the first evidence of bound states of charged particles kept together not by Coulomb interaction, but by the exchange of transverse photons. Light-matter coupling can thus be used as a novel tool in quantum material engineering, tuning electronic properties of semiconductor heterostructures beyond those permitted by mere crystal structures, with direct applications to mid-infrared optoelectronics.
Submitted 12 December, 2019;
originally announced December 2019.
-
Adversarial Embedding: A robust and elusive Steganography and Watermarking technique
Authors:
Salah Ghamizi,
Maxime Cordy,
Mike Papadakis,
Yves Le Traon
Abstract:
We propose adversarial embedding, a new steganography and watermarking technique that embeds secret information within images. The key idea of our method is to use deep neural networks for image classification and adversarial attacks to embed secret information within images. Thus, we use the attacks to embed an encoding of the message within images and the related deep neural network outputs to extract it. The key properties of adversarial attacks (invisible perturbations, nontransferability, resilience to tampering) offer guarantees regarding the confidentiality and the integrity of the hidden messages. We empirically evaluate adversarial embedding using more than 100 models and 1,000 messages. Our results confirm that our embedding passes unnoticed by both humans and steganalysis methods, impedes illicit retrieval of the message (less than 13% recovery rate when the interceptor has some knowledge about our model), and is resilient to soft and (to some extent) aggressive image tampering (up to 100% recovery rate under JPEG compression). We further develop our method by proposing a new type of adversarial attack which improves the embedding density (amount of hidden information) of our method to up to 10 bits per pixel.
Submitted 14 November, 2019;
originally announced December 2019.
-
Reaching a Consensus on Random Networks: The Power of Few
Authors:
Linh Tran,
Van Vu
Abstract:
A community of $n$ individuals splits into two camps, Red and Blue. The individuals are connected by a social network, which influences their colors. Every day, each person changes his/her color according to the majority among his/her neighbors. Red (Blue) wins if everyone in the community becomes Red (Blue) at some point.
We study this process when the underlying network is the random Erdős-Rényi graph $G(n, p)$. With a balanced initial state ($n/2$ people in each camp), it is clear that each color wins with the same probability.
Our study reveals that for any constants $p$ and $\varepsilon$, there is a constant $C$ such that if one camp has $n/2 + C$ individuals, then it wins with probability at least $1 - \varepsilon$. The surprising key fact here is that $C$ does not depend on $n$, the population of the community. When $p = 1/2$ and $\varepsilon = 0.1$, one can set $C$ as small as 6. If the aim of the process is to choose a candidate, then this means it takes only $6$ "defectors" to win an election unanimously with overwhelming odds.
Submitted 18 May, 2020; v1 submitted 22 November, 2019;
originally announced November 2019.
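The majority process described in the abstract above is easy to simulate. The sketch below is an illustration under stated assumptions rather than the authors' setup: updates are taken to be synchronous, a person with a tied neighborhood is assumed to keep the current color (the abstract does not state a tie rule), and the function names and the `extra` parameter (playing the role of $C$) are hypothetical.

```python
import random

def random_graph(n, p, rng):
    """Sample G(n, p) as a list of neighbor sets."""
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def majority_step(colors, neighbors):
    """One synchronous day; colors are 1 (Red) or 0 (Blue)."""
    new_colors = []
    for v, nbrs in enumerate(neighbors):
        red = sum(colors[u] for u in nbrs)
        blue = len(nbrs) - red
        if red > blue:
            new_colors.append(1)
        elif blue > red:
            new_colors.append(0)
        else:
            new_colors.append(colors[v])  # assumed tie rule: keep current color
    return new_colors

def run(n, p, extra, days, seed=0):
    """Start with n // 2 + extra Reds and return the number of Reds at the end."""
    rng = random.Random(seed)
    neighbors = random_graph(n, p, rng)
    reds = n // 2 + extra
    colors = [1] * reds + [0] * (n - reds)
    for _ in range(days):
        colors = majority_step(colors, neighbors)
        if sum(colors) in (0, n):  # one camp has won
            break
    return sum(colors)
```

Calling `run` over many seeds with `p = 0.5` and a small `extra` gives an empirical estimate of how often Red wins, which can be compared informally with the bound quoted in the abstract.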
-
Direct comparison of many-body methods for realistic electronic Hamiltonians
Authors:
Kiel T. Williams,
Yuan Yao,
Jia Li,
Li Chen,
Hao Shi,
Mario Motta,
Chunyao Niu,
Ushnish Ray,
Sheng Guo,
Robert J. Anderson,
Junhao Li,
Lan Nguyen Tran,
Chia-Nan Yeh,
Bastien Mussard,
Sandeep Sharma,
Fabien Bruneval,
Mark van Schilfgaarde,
George H. Booth,
Garnet Kin-Lic Chan,
Shiwei Zhang,
Emanuel Gull,
Dominika Zgid,
Andrew Millis,
Cyrus J. Umrigar,
Lucas K. Wagner
Abstract:
A large collaboration carefully benchmarks 20 first principles many-body electronic structure methods on a test set of 7 transition metal atoms, and their ions and monoxides. Good agreement is attained between the 3 systematically converged methods, resulting in experiment-free reference values. These reference values are used to assess the accuracy of modern emerging and scalable approaches to the many-electron problem. The most accurate methods obtain energies indistinguishable from experimental results, with the agreement mainly limited by the experimental uncertainties. Comparison between methods enables a unique perspective on calculations of many-body systems of electrons.
Submitted 5 October, 2019; v1 submitted 30 September, 2019;
originally announced October 2019.
-
To Detect Irregular Trade Behaviors In Stock Market By Using Graph Based Ranking Methods
Authors:
Loc Tran,
Linh Tran
Abstract:
Detecting irregular trade behaviors in the stock market is an important problem in the machine learning field. These irregular trade behaviors are obviously illegal. To detect these irregular trade behaviors in the stock market, data scientists normally employ supervised learning techniques. In this paper, we employ three graph Laplacian based semi-supervised ranking methods to solve the irregular trade behavior detection problem. Experimental results show that the un-normalized and symmetric normalized graph Laplacian based semi-supervised ranking methods outperform the random walk Laplacian based semi-supervised ranking method.
Submitted 3 September, 2019;
originally announced September 2019.
-
On Learning Disentangled Representations for Gait Recognition
Authors:
Ziyuan Zhang,
Luan Tran,
Feng Liu,
Xiaoming Liu
Abstract:
Gait, the walking pattern of individuals, is one of the important biometrics modalities. Most of the existing gait recognition methods take silhouettes or articulated body models as gait features. These methods suffer from degraded recognition performance when handling confounding variables, such as clothing, carrying and viewing angle. To remedy this issue, we propose a novel AutoEncoder framework, GaitNet, to explicitly disentangle appearance, canonical and pose features from RGB imagery. The LSTM integrates pose features over time as a dynamic gait feature while canonical features are averaged as a static gait feature. Both of them are utilized as classification features. In addition, we collect a Frontal-View Gait (FVG) dataset to focus on gait recognition from frontal-view walking, which is a challenging problem since it contains minimal gait cues compared to other views. FVG also includes other important variations, e.g., walking speed, carrying, and clothing. With extensive experiments on CASIA-B, USF, and FVG datasets, our method demonstrates superior performance to the SOTA quantitatively, the ability of feature disentanglement qualitatively, and promising computational efficiency. We further compare our GaitNet with state-of-the-art face recognition to demonstrate the advantages of gait biometrics identification under certain scenarios, e.g., long distance/lower resolutions, cross viewing angles.
Submitted 5 September, 2019;
originally announced September 2019.
-
PageRank algorithm for Directed Hypergraph
Authors:
Loc Tran,
Tho Quan,
An Mai
Abstract:
Over the last two decades, the World Wide Web's link structure has commonly been modeled as a directed graph. In this paper, we model the World Wide Web's link structure as a directed hypergraph. Moreover, we develop the PageRank algorithm for this directed hypergraph. Due to the lack of World Wide Web directed hypergraph datasets, we apply the PageRank algorithm to a metabolic network, which is itself a directed hypergraph. The experiments show that our novel PageRank algorithm is successfully applied to this metabolic network.
Submitted 6 September, 2022; v1 submitted 29 August, 2019;
originally announced September 2019.
-
Solve fraud detection problem by using graph based learning methods
Authors:
Loc Tran,
Tuan Tran,
Linh Tran,
An Mai
Abstract:
Detecting fraudulent credit card transactions is an important problem in machine learning. Detecting these transactions helps reduce significant losses for credit card holders and banks. To detect fraudulent credit card transactions, data scientists normally employ unsupervised and supervised learning techniques. In this paper, we employ graph p-Laplacian based semi-supervised learning methods combined with undersampling techniques such as Cluster Centroids to solve the credit card fraud detection problem. Experimental results show that the graph p-Laplacian semi-supervised learning methods outperform the current state-of-the-art graph Laplacian based semi-supervised learning method (p=2).
Submitted 29 August, 2019;
originally announced August 2019.
-
Learning to Infer Entities, Properties and their Relations from Clinical Conversations
Authors:
Nan Du,
Mingqiu Wang,
Linh Tran,
Gang Li,
Izhak Shafran
Abstract:
Recently we proposed the Span Attribute Tagging (SAT) Model (Du et al., 2019) to infer clinical entities (e.g., symptoms) and their properties (e.g., duration). It tackles the challenge of large label space and limited training data using a hierarchical two-stage approach that identifies the span of interest in a tagging step and assigns labels to the span in a classification step.
We extend the SAT model to jointly infer not only entities and their properties but also relations between them. Most relation extraction models restrict inferring relations between tokens within a few neighboring sentences, mainly to avoid high computational complexity. In contrast, our proposed Relation-SAT (R-SAT) model is computationally efficient and can infer relations over the entire conversation, spanning an average duration of 10 minutes.
We evaluate our model on a corpus of clinical conversations. When the entities are given, the R-SAT outperforms baselines in identifying relations between symptoms and their properties by about 32% (0.82 vs 0.62 F-score) and by about 50% (0.60 vs 0.41 F-score) on medications and their properties. On the more difficult task of jointly inferring entities and relations, the R-SAT model achieves a performance of 0.34 and 0.45 for symptoms and medications respectively, which is significantly better than 0.18 and 0.35 for the baseline model. The contributions of different components of the model are quantified using ablation analysis.
Submitted 30 August, 2019;
originally announced August 2019.
-
Search and Rescue under the Forest Canopy using Multiple UAVs
Authors:
Yulun Tian,
Katherine Liu,
Kyel Ok,
Loc Tran,
Danette Allen,
Nicholas Roy,
Jonathan P. How
Abstract:
We present a multi-robot system for GPS-denied search and rescue under the forest canopy. Forests are particularly challenging environments for collaborative exploration and mapping, in large part due to the existence of severe perceptual aliasing which hinders reliable loop closure detection for mutual localization and map fusion. Our proposed system features unmanned aerial vehicles (UAVs) that perform onboard sensing, estimation, and planning. When communication is available, each UAV transmits compressed tree-based submaps to a central ground station for collaborative simultaneous localization and mapping (CSLAM). To overcome high measurement noise and perceptual aliasing, we use the local configuration of a group of trees as a distinctive feature for robust loop closure detection. Furthermore, we propose a novel procedure based on cycle consistent multiway matching to recover from incorrect pairwise data associations. The returned global data association is guaranteed to be cycle consistent, and is shown to improve both precision and recall compared to the input pairwise associations. The proposed multi-UAV system is validated both in simulation and during real-world collaborative exploration missions at NASA Langley Research Center.
Submitted 7 June, 2020; v1 submitted 28 August, 2019;
originally announced August 2019.
-
iFixR: Bug Report driven Program Repair
Authors:
Anil Koyuncu,
Kui Liu,
Tegawendé F. Bissyandé,
Dongsun Kim,
Martin Monperrus,
Jacques Klein,
Yves Le Traon
Abstract:
Issue tracking systems are commonly used in modern software development for collecting feedback from users and developers. An ultimate automation target of software maintenance is then the systematization of patch generation for user-reported bugs. Although this ambition is aligned with the momentum of automated program repair, the literature has, so far, mostly focused on generate-and-validate setups where fault localization and patch generation are driven by a well-defined test suite. On the one hand, however, the common (yet strong) assumption on the existence of relevant test cases does not hold in practice for most development settings: many bugs are reported without the available test suite being able to reveal them. On the other hand, for many projects, the number of bug reports generally outstrips the resources available to triage them. Towards increasing the adoption of patch generation tools by practitioners, we investigate a new repair pipeline, iFixR, driven by bug reports: (1) bug reports are fed to an IR-based fault localizer; (2) patches are generated from fix patterns and validated via regression testing; (3) a prioritized list of generated patches is proposed to developers. We evaluate iFixR on the Defects4J dataset, which we enriched (i.e., faults are linked to bug reports) and carefully-reorganized (i.e., the timeline of test-cases is naturally split). iFixR generates genuine/plausible patches for 21/44 Defects4J faults with its IR-based fault localizer. iFixR accurately places a genuine/plausible patch among its top-5 recommendation for 8/13 of these faults (without using future test cases in generation-and-validation).
Submitted 12 July, 2019;
originally announced July 2019.
-
Quaternion Collaborative Filtering for Recommendation
Authors:
Shuai Zhang,
Lina Yao,
Lucas Vinh Tran,
Aston Zhang,
Yi Tay
Abstract:
This paper proposes Quaternion Collaborative Filtering (QCF), a novel representation learning method for recommendation. Our proposed QCF relies on and exploits computation with Quaternion algebra, benefiting from the expressiveness and rich representation learning capability of Hamilton products. Quaternion representations, based on hypercomplex numbers, enable rich inter-latent dependencies between imaginary components. This encourages intricate relations to be captured when learning user-item interactions, serving as a strong inductive bias as compared with the real-space inner product. All in all, we conduct extensive experiments on six real-world datasets, demonstrating the effectiveness of Quaternion algebra in recommender systems. The results exhibit that QCF outperforms a wide spectrum of strong neural baselines on all datasets. Ablative experiments confirm the effectiveness of Hamilton-based composition over multi-embedding composition in real space.
Submitted 6 June, 2019;
originally announced June 2019.
-
Extracting Symptoms and their Status from Clinical Conversations
Authors:
Nan Du,
Kai Chen,
Anjuli Kannan,
Linh Tran,
Yuhui Chen,
Izhak Shafran
Abstract:
This paper describes novel models tailored for a new application, that of extracting the symptoms mentioned in clinical conversations along with their status. Lack of any publicly available corpus in this privacy-sensitive domain led us to develop our own corpus, consisting of about 3K conversations annotated by professional medical scribes. We propose two novel deep learning approaches to infer the symptom names and their status: (1) a new hierarchical span-attribute tagging (SAT) model, trained using curriculum learning, and (2) a variant of sequence-to-sequence model which decodes the symptoms and their status from a few speaker turns within a sliding window over the conversation. This task stems from a realistic application of assisting medical providers in capturing symptoms mentioned by patients from their clinical conversations. To reflect this application, we define multiple metrics. From inter-rater agreement, we find that the task is inherently difficult. We conduct comprehensive evaluations on several contrasting conditions and observe that the performance of the models ranges from an F-score of 0.5 to 0.8 depending on the condition. Our analysis not only reveals the inherent challenges of the task, but also provides useful directions to improve the models.
Submitted 5 June, 2019;
originally announced June 2019.
-
On Estimating Maximum Sum Rate of MIMO Systems with Successive Zero-Forcing Dirty Paper Coding and Per-antenna Power Constraint
Authors:
Thuy M. Pham,
Ronan Farrell,
Le-Nam Tran
Abstract:
In this paper, we study the sum rate maximization for successive zero-forcing dirty-paper coding (SZFDPC) with per-antenna power constraint (PAPC). Although SZFDPC is a low-complexity alternative to the optimal dirty paper coding (DPC), efficient algorithms to compute its sum rate are still open problems, especially under practical PAPC. The existing solution to the considered problem is computationally inefficient because it employs a high-complexity interior-point method. In this study, we propose two new low-complexity approaches to this important problem. More specifically, the first algorithm achieves the optimal solution by transforming the original problem in the broadcast channel into an equivalent problem in the multiple access channel; the resulting problem is then solved by alternating optimization together with successive convex approximation. We also derive a suboptimal solution based on machine learning to which simple linear regressions are applicable. The approaches are analyzed and validated extensively to demonstrate their superiority over the existing approach.
Submitted 14 May, 2019;
originally announced May 2019.
-
Test Selection for Deep Learning Systems
Authors:
Wei Ma,
Mike Papadakis,
Anestis Tsakmalis,
Maxime Cordy,
Yves Le Traon
Abstract:
Testing of deep learning models is challenging due to the excessive number and complexity of computations involved. As a result, test data selection is performed manually and in an ad hoc way. This raises the question of how we can automatically select candidate test data to test deep learning models. Recent research has focused on adapting test selection metrics from code-based software testing (such as coverage) to deep learning. However, deep learning models have different attributes from code such as spread of computations across the entire network reflecting training data properties, balance of neuron weights and redundancy (use of many more neurons than needed). Such differences make code-based metrics inappropriate to select data that can challenge the models (can trigger misclassification). We thus propose a set of test selection metrics based on the notion of model uncertainty (model confidence on specific inputs). Intuitively, the more uncertain we are about a candidate sample, the more likely it is that this sample triggers a misclassification. Similarly, the samples for which we are the most uncertain, are the most informative and should be used to improve the model by retraining. We evaluate these metrics on two widely-used image classification problems involving real and artificial (adversarial) data. We show that uncertainty-based metrics have a strong ability to select data that are misclassified and lead to major improvement in classification accuracy during retraining: up to 80% more gain than random selection and other state-of-the-art metrics on one dataset and up to 29% on the other.
Submitted 30 April, 2019;
originally announced April 2019.
-
Tensor Sparse PCA and Face Recognition: A Novel Approach
Authors:
Loc Hoang Tran,
Linh Hoang Tran
Abstract:
Face recognition is an important field in machine learning and pattern recognition research. It has many applications in the military, finance, and public security, to name a few. In this paper, the combination of tensor sparse PCA with the nearest-neighbor method (and with the kernel ridge regression method) is proposed and applied to a face dataset. Experimental results show that the combination of tensor sparse PCA with any classification system does not always reach the best accuracy performance measures. However, the accuracy of the combination of the sparse PCA method and one specific classification system is always better than the accuracy of the combination of the PCA method and one specific classification system, and is always better than the accuracy of the classification system itself.
Submitted 11 August, 2020; v1 submitted 11 April, 2019;
originally announced April 2019.
-
Tunable cloaking of Mexican-hat confined states in bilayer silicene
Authors:
Le Bin Ho,
Lan Nguyen Tran
Abstract:
We present the ballistic quantum transport of a p-n-p bilayer silicene junction in the presence of spin-orbit coupling and electric field using a four-band model in combination with the transfer-matrix approach. A Mexican-hat shape of the low-energy spectrum is observed, similar to bilayer graphene under an interlayer bias. We show that while bilayer silicene shares some physics with bilayer graphene, it has many intriguing phenomena that have not been reported for the latter. First, the confined state produces a significantly non-zero transmission in the Mexican hat. Second, cloaking of the Mexican-hat confined state is found. Third, we observe that the Mexican-hat cloaking results in a strong oscillation of conductance when the incident energy is below the potential height. Finally, unlike monolayer silicene, the conductance at large interlayer distances increases with the rise of electric field when the incident energy is above the potential height.
Submitted 16 April, 2019;
originally announced April 2019.
-
Tracking excited states in wave function optimization using density matrices and variational principles
Authors:
Lan Nguyen Tran,
Jacqueline A. R. Shea,
Eric Neuscamman
Abstract:
We present a method for finding individual excited states' energy stationary points in complete active space self-consistent field theory that is compatible with standard optimization methods and highly effective at overcoming difficulties due to root flipping and near-degeneracies. Inspired by both the maximum overlap method and recent progress in excited state variational principles, our approach combines these ideas in order to track individual excited states throughout the orbital optimization process. In a series of tests involving root flipping, near-degeneracies, charge transfers, and double excitations, we show that this approach is more effective for state-specific optimization than either the naive selection of roots based on energy ordering or a more direct generalization of the maximum overlap method. Furthermore, we provide evidence that this state-specific approach improves the performance of complete active space perturbation theory. With a simple implementation, a low cost, and compatibility with large active space methods, the approach is designed to be useful in a wide range of excited state investigations.
Submitted 10 April, 2019;
originally announced April 2019.
-
Towards High-fidelity Nonlinear 3D Face Morphable Model
Authors:
Luan Tran,
Feng Liu,
Xiaoming Liu
Abstract:
Embedding 3D morphable basis functions into deep neural networks opens great potential for models with better representation power. However, to faithfully learn those models from an image collection, it requires strong regularization to overcome ambiguities involved in the learning process. This critically prevents us from learning high fidelity face models which are needed to represent face images in high level of details. To address this problem, this paper presents a novel approach to learn additional proxies as means to side-step strong regularizations, as well as, leverages to promote detailed shape/albedo. To ease the learning, we also propose to use a dual-pathway network, a carefully-designed architecture that brings a balance between global and local-based models. By improving the nonlinear 3D morphable model in both learning objective and network architecture, we present a model which is superior in capturing higher level of details than the linear or its precedent nonlinear counterparts. As a result, our model achieves state-of-the-art performance on 3D face reconstruction by solely optimizing latent representations.
Submitted 9 April, 2019;
originally announced April 2019.
-
Gait Recognition via Disentangled Representation Learning
Authors:
Ziyuan Zhang,
Luan Tran,
Xi Yin,
Yousef Atoum,
Xiaoming Liu,
Jian Wan,
Nanxin Wang
Abstract:
Gait, the walking pattern of individuals, is one of the most important biometrics modalities. Most of the existing gait recognition methods take silhouettes or articulated body models as the gait features. These methods suffer from degraded recognition performance when handling confounding variables, such as clothing, carrying and view angle. To remedy this issue, we propose a novel AutoEncoder framework to explicitly disentangle pose and appearance features from RGB imagery; the LSTM-based integration of pose features over time produces the gait feature. In addition, we collect a Frontal-View Gait (FVG) dataset to focus on gait recognition from frontal-view walking, which is a challenging problem since it contains minimal gait cues compared to other views. FVG also includes other important variations, e.g., walking speed, carrying, and clothing. With extensive experiments on CASIA-B, USF and FVG datasets, our method demonstrates superior performance to the state of the art quantitatively, the ability of feature disentanglement qualitatively, and promising computational efficiency.
Submitted 9 April, 2019;
originally announced April 2019.
-
Automated Search for Configurations of Deep Neural Network Architectures
Authors:
Salah Ghamizi,
Maxime Cordy,
Mike Papadakis,
Yves Le Traon
Abstract:
Deep Neural Networks (DNNs) are intensively used to solve a wide variety of complex problems. Although powerful, such systems require manual configuration and tuning. To this end, we view DNNs as configurable systems and propose an end-to-end framework that allows the configuration, evaluation and automated search for DNN architectures. Therefore, our contribution is threefold. First, we model the variability of DNN architectures with a Feature Model (FM) that generalizes over existing architectures. Each valid configuration of the FM corresponds to a valid DNN model that can be built and trained. Second, we implement, on top of Tensorflow, an automated procedure to deploy, train and evaluate the performance of a configured model. Third, we propose a method to search for configurations and demonstrate that it leads to good DNN models. We evaluate our method by applying it on image classification tasks (MNIST, CIFAR-10) and show that, with limited amount of computation and training, our method can identify high-performing architectures (with high accuracy). We also demonstrate that we outperform existing state-of-the-art architectures handcrafted by ML researchers. Our FM and framework have been released and are publicly available to support replication and future research.
Submitted 9 April, 2019;
originally announced April 2019.
-
Characterisation of different stages of hadronic showers using the CALICE Si-W ECAL physics prototype
Authors:
CALICE Collaboration,
G. Eigen,
T. Price,
N. K. Watson,
A. Winter,
Y. Do,
A. Khan,
D. Kim,
G. C. Blazey,
A. Dyshkant,
K. Francis,
V. Zutshi,
K. Kawagoe,
Y. Miura,
R. Mori,
I. Sekiya,
T. Suehara,
T. Yoshioka,
J. Apostolakis,
J. Giraud,
D. Grondin,
J. -Y. Hostachy,
O. Bach,
V. Bocharnikov,
E. Brianne
, et al. (81 additional authors not shown)
Abstract:
A detailed investigation of hadronic interactions is performed using $π^-$-mesons with energies in the range 2--10 GeV incident on a high granularity silicon-tungsten electromagnetic calorimeter. The data were recorded at FNAL in 2008. The region in which the $π^-$-mesons interact with the detector material and the produced secondary particles are characterised using a novel track-finding algorithm that reconstructs tracks within hadronic showers in a calorimeter in the absence of a magnetic field. The principle of carrying out detector monitoring and calibration using secondary tracks is also demonstrated.
Submitted 18 September, 2019; v1 submitted 16 February, 2019;
originally announced February 2019.