CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only

Authors: Junhee Cho, Jihoon Kim, Daseul Bae, Jinho Choo, Youngjune Gwon, Yeong-Dae Kwon

Abstract: Software robots have long been deployed in Robotic Process Automation (RPA) to automate mundane and repetitive computer tasks. The advent of Large Language Models (LLMs) with advanced reasoning capabilities has set the stage for these agents to now undertake more complex and even previously unseen tasks. However, the LLM-based automation techniques in recent literature frequently rely on HTML source code for input, limiting their application to web environments. Moreover, the information contained in HTML code is often inaccurate or incomplete, making the agent less reliable for practical applications. We propose an LLM-based agent that functions solely on the basis of screenshots for recognizing environments, while leveraging in-context learning to eliminate the need for collecting large datasets of human demonstration. Our strategy, named Context-Aware Action Planning (CAAP) prompting, encourages the agent to meticulously review the context from various angles. Through our proposed methodology, we achieve a success rate of 94.4% on 67 types of MiniWoB++ problems, utilizing only 1.48 demonstrations per problem type. Our method offers the potential for broader applications, especially for tasks that require inter-application coordination on computers or smartphones, showcasing a significant advancement in the field of automation agents. Code and models are accessible at https://github.com/caap-agent/caap-agent.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 10 pages, 5 figures; (19 pages and 6 figures more in appendix)

arXiv:2403.03230 [pdf, other]

Large language models surpass human experts in predicting neuroscience results

Authors: Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, Kevin K. Nejad, Felipe Yáñez, Bati Yilmaz, Kangjoo Lee, Alexandra O. Cohen, Valentina Borghesani, Anton Pashkov, Daniele Marinazzo, Jonathan Nicholas, Alessandro Salatiello, Ilia Sucholutsky, Pasquale Minervini, Sepehr Razavi, Roberta Rocca, Elkhan Yusifov, Tereza Okalova, Nianlong Gu, Martin Ferianc, Mikail Khona, Kaustubh R. Patil, Pui-Shee Lee, Rui Mata, et al. (14 additional authors not shown)

Abstract: Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. To evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs were confident in their predictions, they were more likely to be correct, which presages a future where humans and LLMs team together to make discoveries. Our approach is not neuroscience-specific and is transferable to other knowledge-intensive endeavors.

Submitted 21 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2402.12984 [pdf, other]

Can GNN be Good Adapter for LLMs?

Authors: Xuanwen Huang, Kaiqiao Han, Yang Yang, Dezheng Bao, Quanjin Tao, Ziwei Chai, Qi Zhu

Abstract: Recently, large language models (LLMs) have demonstrated superior capabilities in understanding and zero-shot learning on textual data, promising significant advances for many text-related domains. In the graph domain, various real-world scenarios also involve textual data, where tasks and node features can be described by text. These text-attributed graphs (TAGs) have broad applications in social media, recommendation systems, etc. Thus, this paper explores how to utilize LLMs to model TAGs. Previous methods for TAG modeling are based on million-scale LMs. When scaled up to billion-scale LLMs, they face huge challenges in computational costs. Additionally, they ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. In terms of efficiency, the GNN adapter introduces only a few trainable parameters and can be trained with low computation costs. The entire framework is trained using auto-regression on node text (next token prediction). Once trained, GraphAdapter can be seamlessly fine-tuned with task-specific prompts for various downstream tasks. Through extensive experiments across multiple real-world TAGs, GraphAdapter based on Llama 2 gains an average improvement of approximately 5% in terms of node classification. Furthermore, GraphAdapter can also adapt to other language models, including RoBERTa and GPT-2. The promising results demonstrate that GNNs can serve as effective adapters for LLMs in TAG modeling.

Submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted by WWW'24

arXiv:2402.10213 [pdf, other]

Clustering Inductive Biases with Unrolled Networks

Authors: Jonathan Huml, Abiy Tasissa, Demba Ba

Abstract: The classical sparse coding (SC) model represents visual stimuli as a linear combination of a handful of learned basis functions that are Gabor-like when trained on natural image data. However, the Gabor-like filters learned by classical sparse coding far overpredict well-tuned simple cell receptive field profiles observed empirically. While neurons fire sparsely, neuronal populations are also organized in physical space by their sensitivity to certain features. In V1, this organization is a smooth progression of orientations along the cortical sheet. A number of subsequent models have either discarded the sparse dictionary learning framework entirely or have yet to take advantage of the surge in unrolled, neural dictionary learning architectures. A key missing theme of these updates is a stronger notion of structured sparsity. We propose an autoencoder architecture (WLSC) whose latent representations are implicitly, locally organized for spectral clustering through a Laplacian quadratic form of a bipartite graph, which generates a diverse set of artificial receptive fields that match primate data in V1 as faithfully as recent contrastive frameworks like Local Low Dimensionality (LLD) that discard sparse dictionary learning.
By unifying sparse and smooth coding in models of the early visual cortex through our autoencoder, we also show that our regularization can be interpreted as early-stage specialization of receptive fields to certain classes of stimuli; that is, we induce a weak clustering bias for later stages of cortex where functional and spatial segregation (i.e., topography) are known to occur. The results show an imperative for spatial regularization of both the receptive fields and firing rates to begin to describe feature disentanglement in V1 and beyond.

Submitted 29 November, 2023; originally announced February 2024.

arXiv:2311.00447 [pdf, other]

On the Opportunities of Green Computing: A Survey

Authors: You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, et al. (16 additional authors not shown)

Abstract: Artificial Intelligence (AI) has achieved significant advancements in technology and research over several decades of development, and is widely used in many areas including computer vision, natural language processing, time-series analysis, speech synthesis, etc. During the age of deep learning, especially with the rise of Large Language Models, much of researchers' attention has gone to pursuing new state-of-the-art (SOTA) results, resulting in ever-increasing model size and computational complexity. The need for high computing power brings higher carbon emissions and undermines research fairness by preventing small or medium-sized research institutions and companies with limited funding from participating in research. To tackle the challenges of computing resources and the environmental impact of AI, Green Computing has become a hot research topic. In this survey, we give a systematic overview of the technologies used in Green Computing. We propose a framework for Green Computing and divide it into four key components: (1) Measures of Greenness, (2) Energy-Efficient AI, (3) Energy-Efficient Computing Systems and (4) AI Use Cases for Sustainability. For each component, we discuss the research progress made and the commonly used techniques to optimize AI efficiency. We conclude that this new research direction has the potential to address the conflicts between resource constraints and AI development. We encourage more researchers to pay attention to this direction and make AI more environmentally friendly.

Submitted 8 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: 113 pages, 18 figures

arXiv:2310.00420 [pdf, other]

An Efficient Algorithm for Clustered Multi-Task Compressive Sensing

Authors: Alexander Lin, Demba Ba

Abstract: This paper considers clustered multi-task compressive sensing, a hierarchical model that solves multiple compressive sensing tasks by finding clusters of tasks that leverage shared information to mutually improve signal reconstruction. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. The main bottleneck involves repeated matrix inversion and log-determinant computation for multiple large covariance matrices. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices. Our approach combines Monte Carlo sampling with iterative linear solvers. Our experiments reveal that compared to the existing baseline, our algorithm can be up to thousands of times faster and an order of magnitude more memory-efficient.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2309.02848 [pdf, other]

Prompt-based Node Feature Extractor for Few-shot Learning on Text-Attributed Graphs

Authors: Xuanwen Huang, Kaiqiao Han, Dezheng Bao, Quanjin Tao, Zhisheng Zhang, Yang Yang, Qi Zhu

Abstract: Text-attributed Graphs (TAGs) are commonly found in the real world, such as social networks and citation networks, and consist of nodes represented by textual descriptions. Currently, mainstream machine learning methods on TAGs involve a two-stage modeling approach: (1) unsupervised node feature extraction with pre-trained language models (PLMs); and (2) supervised learning using Graph Neural Networks (GNNs). However, we observe that these representations, which have undergone large-scale pre-training, do not significantly improve performance with a limited amount of training samples. The main issue is that existing methods have not effectively integrated information from the graph and downstream tasks simultaneously. In this paper, we propose a novel framework called G-Prompt, which combines a graph adapter and task-specific prompts to extract node features. First, G-Prompt introduces a learnable GNN layer (i.e., an adapter) at the end of PLMs, which is fine-tuned to better capture the masked tokens considering graph neighborhood information. After the adapter is trained, G-Prompt incorporates task-specific prompts to obtain interpretable node representations for the downstream task. Our experimental results demonstrate that our proposed method outperforms current state-of-the-art (SOTA) methods on few-shot node classification. More importantly, in zero-shot settings, the G-Prompt embeddings can not only provide better task interpretability than vanilla PLMs but also achieve comparable performance with fully-supervised baselines.

Submitted 6 September, 2023; originally announced September 2023.

Comments: Under review

arXiv:2306.03249 [pdf, other]

Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models

Authors: Alexander Lin, Bahareh Tolooshams, Yves Atchadé, Demba Ba

Abstract: Latent Gaussian models have a rich history in statistics and machine learning, with applications ranging from factor analysis to compressed sensing to time series analysis. The classical method for maximizing the likelihood of these models is the expectation-maximization (EM) algorithm. For problems with high-dimensional latent variables and large datasets, EM scales poorly because it needs to invert as many large covariance matrices as the number of data points. We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversion. Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation. In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.

Submitted 5 June, 2023; originally announced June 2023.

Comments: 29 pages, 4 figures

Journal ref: International Conference on Machine Learning, 2023

arXiv:2305.18552 [pdf, other]

Learning Linear Groups in Neural Networks

Authors: Emmanouil Theodosis, Karim Helwani, Demba Ba

Abstract: Employing equivariance in neural networks leads to greater parameter efficiency and improved generalization performance through the encoding of domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the desired symmetries. We present a neural network architecture, Linear Group Networks (LGNs), for learning linear groups acting on the weight space of neural networks. Linear groups are desirable due to their inherent interpretability, as they can be represented as finite matrices. LGNs learn groups without any supervision or knowledge of the hidden symmetries in the data and the groups can be mapped to well known operations in machine learning. We use LGNs to learn groups on multiple datasets while considering different downstream tasks; we demonstrate that the linear group structure depends on both the data distribution and the considered task.

Submitted 29 May, 2023; originally announced May 2023.

arXiv:2302.11162 [pdf, other]

Sparse, Geometric Autoencoder Models of V1

Authors: Jonathan Huml, Abiy Tasissa, Demba Ba

Abstract: The classical sparse coding model represents visual stimuli as a linear combination of a handful of learned basis functions that are Gabor-like when trained on natural image data. However, the Gabor-like filters learned by classical sparse coding far overpredict well-tuned simple cell receptive field (SCRF) profiles. A number of subsequent models have either discarded the sparse dictionary learning framework entirely or have yet to take advantage of the surge in unrolled, neural dictionary learning architectures. A key missing theme of these updates is a stronger notion of structured sparsity. We propose an autoencoder architecture whose latent representations are implicitly, locally organized for spectral clustering, which begets artificial neurons better matched to observed primate data. The weighted-$\ell_1$ (WL) constraint in the autoencoder objective function maintains core ideas of the sparse coding framework, yet also offers a promising path to describe the differentiation of receptive fields in terms of a discriminative hierarchy in future work.

Submitted 22 February, 2023; originally announced February 2023.

Comments: Symmetry and Geometry in Neural Representations (NeurIPS) 2022

arXiv:2211.09238 [pdf, other]

Learning unfolded networks with a cyclic group structure

Authors: Emmanouil Theodosis, Demba Ba

Abstract: Deep neural networks lack straightforward ways to incorporate domain knowledge and are notoriously considered black boxes. Prior works attempted to inject domain knowledge into architectures implicitly through data augmentation. Building on recent advances on equivariant neural networks, we propose networks that explicitly encode domain knowledge, specifically equivariance with respect to rotations. By using unfolded architectures, a rich framework that originated from sparse coding and has theoretical guarantees, we present interpretable networks with sparse activations. The equivariant unfolded networks compete favorably with baselines, with only a fraction of their parameters, as showcased on (rotated) MNIST and CIFAR-10.

Submitted 16 November, 2022; originally announced November 2022.

Comments: Accepted as an extended abstract in NeurIPS Workshop on Symmetry and Geometry in Neural Representations

arXiv:2209.14165 [pdf, other]

doi 10.1109/TSP.2023.3278861

Unrolled Compressed Blind-Deconvolution

Authors: Bahareh Tolooshams, Satish Mulleti, Demba Ba, Yonina C. Eldar

Abstract: The problem of sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging. To reduce its computational and implementation cost, we propose a compression method that enables blind recovery from far fewer measurements than the full received signal in time. The proposed compression measures the signal through a filter followed by subsampling, allowing for a significant reduction in implementation cost. We derive theoretical guarantees for the identifiability and recovery of a sparse filter from compressed measurements. Our results allow for the design of a wide class of compression filters. We then propose a data-driven unrolled learning framework to learn the compression filter and solve the S-MBD problem. The encoder is a recurrent inference network that maps compressed measurements into an estimate of sparse filters. We demonstrate that our unrolled learning method is more robust to choices of source shapes and has better recovery performance compared to optimization-based methods. Finally, in data-limited applications (few-shot learning), we highlight the superior generalization capability of unrolled learning compared to conventional deep learning.

Submitted 18 May, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: Accepted to IEEE TSP

arXiv:2205.08290 [pdf, other]

Literature Review to Collect Conceptual Variables of Scenario Methods for Establishing a Conceptual Scenario Framework

Authors: Young-Min Baek, Esther Cho, Donghwan Shin, Doo-Hwan Bae

Abstract: Over recent decades, scenarios and scenario-based software/system engineering have been actively employed as essential tools to handle intricate problems, validate requirements, and support stakeholders' communication. However, despite the widespread use of scenarios, there have been several challenges for engineers to more willingly utilize scenario-based engineering approaches (i.e., scenario methods) in their projects. First, the term scenario has numerous published definitions, thus lacking in a well-established shared understanding of scenarios and scenario methods. Second, the conceptual basis for engineers developing or employing scenarios is missing. To establish shared understanding and to find common denominators of scenario methods, this study leverages well-defined metamodeling and conceptualization that systematically investigate the concepts under analysis and define core entities and their relations. By conducting a semi-systematic literature review, conceptual variables are collected and conceptualized as a conceptual meta-model. As a result, this study introduces scenario variables (SVs) that represent constructs/semantics of scenario descriptions, according to 4 levels of constructs of a scenario method. To evaluate the comprehensibility and applicability of the defined variables, we analyze five existing scenario methods and their instances in automated driving system (ADS) domains.
The results showed that our conceptual model and its constituent scenario variables adequately support the understanding of a scenario method and provide a means for comparative analysis between different scenario methods.

Submitted 17 May, 2022; originally announced May 2022.

Comments: 22 pages, 7 figures

MSC Class: 68M99; ACM Class: D.2.1

arXiv:2204.06799 [pdf, other]

Environment Imitation: Data-Driven Environment Model Generation Using Imitation Learning for Efficient CPS Goal Verification

Authors: Yong-Jun Shin, Donghwan Shin, Doo-Hwan Bae

Abstract: Cyber-Physical Systems (CPS) continuously interact with their physical environments through software controllers that observe the environments and determine actions. Engineers can verify to what extent the CPS under analysis can achieve given goals by analyzing its Field Operational Test (FOT) logs. However, it is challenging to repeat many FOTs to obtain statistically significant results due to their cost and risk in practice. To address this challenge, simulation-based verification can be a good alternative for efficient CPS goal verification, but it requires an accurate virtual environment model that can replace the real environment that interacts with the CPS in a closed loop. This paper proposes a novel data-driven approach that automatically generates the virtual environment model from a small amount of FOT logs. We formally define the environment model generation problem and solve it using Imitation Learning (IL) algorithms. In addition, we propose three specific use cases of our approach in the evolutionary CPS development. To validate our approach, we conduct a case study using a simplified autonomous vehicle with a lane-keeping system. The case study results show that our approach can generate accurate virtual environment models for CPS goal verification at a low cost through simulations.

Submitted 14 April, 2022; originally announced April 2022.

arXiv:2204.05746 [pdf, other]

doi 10.1109/TIFS.2023.3347894

BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis

Authors: Yuexin Xiang, Yuchen Lei, Ding Bao, Wei Ren, Tiantian Li, Qingqing Yang, Wenmao Liu, Tianqing Zhu, Kim-Kwang Raymond Choo

Abstract: Cryptocurrencies are no longer just the preferred option for cybercriminal activities on darknets, due to the increasing adoption in mainstream applications. This is partly due to the transparency associated with the underpinning ledgers, where any individual can access the record of a transaction on the public ledger. In this paper, we build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled data, which is the largest labeled Bitcoin address behavior dataset publicly available to our knowledge. We then use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost. The results show that the accuracy rates of these machine learning models for the multi-classification task on our proposed dataset are between 93.24% and 97.13%. We also analyze the proposed features and their relationships from the experiments, and propose a k-hop subgraph generation algorithm to extract a k-hop subgraph from the entire Bitcoin transaction graph constructed by the directed heterogeneous multigraph starting from a specific Bitcoin address node (e.g., a known transaction associated with a criminal investigation). In addition, we present an initial analysis of the behavior patterns of different types of Bitcoin addresses according to the extracted features.

Submitted 5 May, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

Comments: 14 pages, 4 figures

MSC Class: 68-11; ACM Class: H.2.8

Journal ref: in IEEE Transactions on Information Forensics and Security, vol. 19, pp. 2171-2185, 2024

arXiv:2202.12808 [pdf, other]

High-Dimensional Sparse Bayesian Learning without Covariance Matrices

Authors: Alexander Lin, Andrew H. Song, Berkin Bilgic, Demba Ba

Abstract: Sparse Bayesian learning (SBL) is a powerful framework for tackling the sparse coding problem. However, the most popular inference algorithms for SBL become too expensive for high-dimensional settings, due to the need to store and compute a large covariance matrix. We introduce a new inference scheme that avoids explicit construction of the covariance matrix by solving multiple linear systems in parallel to obtain the posterior moments for SBL. Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm. On several simulations, our method scales better than existing approaches in computation time and memory, especially for structured dictionaries capable of fast matrix-vector multiplication.

Submitted 25 February, 2022; originally announced February 2022.

Comments: 5 pages

Journal ref: IEEE ICASSP 2022
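
As a rough illustration of the idea in the abstract above (estimating the diagonal of an inverse matrix from linear solves instead of forming the matrix), the sketch below pairs a textbook conjugate gradient solver with a probe-vector diagonal estimator. It rests on assumptions not stated in the listing: Rademacher (random ±1) probe vectors, a symmetric positive-definite operator given only through a `matvec` callback, and the hypothetical names `conjugate_gradient` and `estimate_inverse_diagonal`. It is not the authors' implementation.

```python
import numpy as np


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive-definite A, given only matvec(v) = A v."""
    x = np.zeros_like(b)
    r = b - matvec(x)          # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def estimate_inverse_diagonal(matvec, n, num_probes=20, seed=0):
    """Estimate diag(A^{-1}) by averaging z * (A^{-1} z) over random +/-1 probes z.
    No n-by-n matrix is ever stored; each probe costs one linear solve."""
    rng = np.random.default_rng(seed)
    acc = np.zeros(n)
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        acc += z * conjugate_gradient(matvec, z)
    return acc / num_probes
```

The probe solves are independent of one another, which is what makes it natural to run them in parallel. When `matvec` is fast (for example, a structured dictionary), each solve stays cheap.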

arXiv:2110.04683 [pdf, other]

Mixture Model Auto-Encoders: Deep Clustering through Dictionary Learning

Authors: Alexander Lin, Andrew H. Song, Demba Ba

Abstract: State-of-the-art approaches for clustering high-dimensional data utilize deep auto-encoder architectures. Many of these networks require a large number of parameters and suffer from a lack of interpretability, due to the black-box nature of the auto-encoders. We introduce Mixture Model Auto-Encoders (MixMate), a novel architecture that clusters data by performing inference on a generative model. Derived from the perspective of sparse dictionary learning and mixture models, MixMate comprises several auto-encoders, each tasked with reconstructing data in a distinct cluster, while enforcing sparsity in the latent space. Through experiments on various image datasets, we show that MixMate achieves competitive performance compared to state-of-the-art deep clustering algorithms, while using orders of magnitude fewer parameters.

Submitted 25 February, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

Comments: 5 pages, 3 figures

Journal ref: IEEE ICASSP 2022

arXiv:2109.11066 [pdf, other]

A two-step machine learning approach for crop disease detection: an application of GAN and UAV technology

Authors: Aaditya Prasad, Nikhil Mehta, Matthew Horak, Wan D. Bae

Abstract: Automated plant diagnosis is a technology that promises large increases in cost-efficiency for agriculture. However, multiple problems reduce the effectiveness of drones, including the inverse relationship between resolution and speed and the lack of adequate labeled training data. This paper presents a two-step machine learning approach that analyzes low-fidelity and high-fidelity images in sequence, preserving efficiency as well as accuracy. Two data-generators are also used to minimize class imbalance in the high-fidelity dataset and to produce low-fidelity data that is representative of UAV images. The analysis of applications and methods is conducted on a database of high-fidelity apple tree images which are corrupted with class imbalance. The application begins by generating high-fidelity data using generative networks and then uses this novel data alongside the original high-fidelity data to produce low-fidelity images. A machine-learning identifier identifies plants and labels them as potentially diseased or not. A machine learning classifier is then given the potentially diseased plant images and returns actual diagnoses for these plants. The results show an accuracy of 96.3% for the high-fidelity system and a 75.5% confidence level for our low-fidelity system. Our drone technology shows promising results in accuracy when compared to labor-based methods of diagnosis.

Submitted 18 September, 2021; originally announced September 2021.

Comments: 13 pages, 5 figures. Preprint of an article submitted for consideration in the International Journal on Artificial Intelligence Tools, 2021, World Scientific Publishing Company, https://www.worldscientific.com/worldscinet/ijait

ACM Class: I.2.6; I.2.10

arXiv:2106.00058 [pdf, other]

Stable and Interpretable Unrolled Dictionary Learning

Authors: Bahareh Tolooshams, Demba Ba

Abstract: The dictionary learning problem, representing data as a combination of a few atoms, has long stood as a popular method for learning representations in statistics and signal processing. The most popular dictionary learning algorithm alternates between sparse coding and dictionary update steps, and a rich literature has studied its theoretical convergence. The success of dictionary learning relies on access to a "good" initial estimate of the dictionary and the ability of the sparse coding step to provide an unbiased estimate of the code. The growing popularity of unrolled sparse coding networks has led to the empirical finding that backpropagation through such networks performs dictionary learning. We offer the theoretical analysis of these empirical results through PUDLE, a Provable Unrolled Dictionary LEarning method. We provide conditions on the network initialization and data distribution sufficient to recover and preserve the support of the latent code. Additionally, we address two challenges; first, the vanilla unrolled sparse coding computes a biased code estimate, and second, gradients during backpropagated learning can become unstable. We show approaches to reduce the bias of the code estimate in the forward pass, and that of the dictionary estimate in the backward pass. We propose strategies to resolve the learning instability by tuning network parameters and modifying the loss function. Overall, we highlight the impact of loss, unrolling, and backpropagation on convergence. We complement our findings through synthetic and image denoising experiments. Finally, we demonstrate PUDLE's interpretability, a driving factor in designing deep networks based on iterative optimizations, by building a mathematical relation between network weights, its output, and the training set.

Submitted 2 August, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

Comments: Published in Transactions on Machine Learning Research (TMLR) (08/2022)
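
For readers unfamiliar with unrolled sparse coding, the sketch below shows the generic forward pass such networks are usually built from: a fixed number of ISTA (iterative soft-thresholding) steps, each playing the role of one layer. It is a standard textbook construction under assumed names (`soft_threshold`, `unrolled_ista`), not the PUDLE method itself. It also makes the bias mentioned in the abstract visible, because soft-thresholding shrinks every surviving coefficient toward zero.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def unrolled_ista(y, D, lam, num_layers=10):
    """Run `num_layers` ISTA steps for min_x 0.5 * ||y - D x||^2 + lam * ||x||_1.
    Each loop iteration corresponds to one layer of the unrolled network."""
    # Step size 1/L, with L the squared spectral norm of the dictionary D.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(num_layers):
        # Gradient step on the quadratic term, then shrinkage.
        x = soft_threshold(x + step * D.T @ (y - D @ x), step * lam)
    return x
```

With an identity dictionary and `lam = 1`, an input entry of 3 is coded as 2: the support is recovered, but the amplitude is underestimated by exactly the threshold.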

arXiv:2105.10439 [pdf, other]

doi 10.1109/TSP.2022.3186185

Covariance-Free Sparse Bayesian Learning

Authors: Alexander Lin, Andrew H. Song, Berkin Bilgic, Demba Ba

Abstract: Sparse Bayesian learning (SBL) is a powerful framework for tackling the sparse coding problem while also providing uncertainty quantification. The most popular inference algorithms for SBL exhibit prohibitively large computational costs for high-dimensional problems due to the need to maintain a large covariance matrix. To resolve this issue, we introduce a new method for accelerating SBL inference -- named covariance-free expectation maximization (CoFEM) -- that avoids explicit computation of the covariance matrix. CoFEM solves multiple linear systems to obtain unbiased estimates of the posterior statistics needed by SBL. This is accomplished by exploiting innovations from numerical linear algebra such as preconditioned conjugate gradient and a little-known diagonal estimation rule. For a large class of compressed sensing matrices, we provide theoretical justifications for why our method scales well in high-dimensional settings. Through simulations, we show that CoFEM can be up to thousands of times faster than existing baselines without sacrificing coding accuracy. Through applications to calcium imaging deconvolution and multi-contrast MRI reconstruction, we show that CoFEM enables SBL to tractably tackle high-dimensional sparse coding problems of practical interest.

Submitted 8 April, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

Comments: 13 pages

arXiv:2104.13894 [pdf, other]

Weighed $\ell_1$ on the simplex: Compressive sensing meets locality

Authors: Abiy Tasissa, Pranay Tankala, Demba Ba

Abstract: Sparse manifold learning algorithms combine techniques in manifold learning and sparse optimization to learn features that could be utilized for downstream tasks. The standard setting of compressive sensing cannot be immediately applied to this setup. Due to the intrinsic geometric structure of data, dictionary atoms might be redundant and do not satisfy the restricted isometry property or coherence condition. In addition, manifold learning emphasizes learning local geometry which is not reflected in a standard $\ell_1$ minimization problem. We propose weighted $\ell_0$ and weighted $\ell_1$ metrics that encourage representation via neighborhood atoms suited for dictionary based manifold learning. Assuming that the data is generated from Delaunay triangulation, we show the equivalence of weighted $\ell_1$ and weighted $\ell_0$. We discuss an optimization program that learns the dictionaries and sparse coefficients and demonstrate the utility of our regularization on synthetic and real datasets.

Submitted 28 April, 2021; originally announced April 2021.

Comments: arXiv admin note: text overlap with arXiv:2012.02134

arXiv:2104.00530 [pdf, other]

doi 10.1109/LSP.2021.3127471

Gaussian Process Convolutional Dictionary Learning

Authors: Andrew H. Song, Bahareh Tolooshams, Demba Ba

Abstract: Convolutional dictionary learning (CDL), the problem of estimating shift-invariant templates from data, is typically conducted in the absence of a prior/structure on the templates. In data-scarce or low signal-to-noise ratio (SNR) regimes, learned templates overfit the data and lack smoothness, which can affect the predictive performance of downstream tasks. To address this limitation, we propose GPCDL, a convolutional dictionary learning framework that enforces priors on templates using Gaussian Processes (GPs). With the focus on smoothness, we show theoretically that imposing a GP prior is equivalent to Wiener filtering the learned templates, thereby suppressing high-frequency components and promoting smoothness. We show that the algorithm is a simple extension of the classical iteratively reweighted least squares algorithm, independent of the choice of GP kernels. This property allows one to experiment flexibly with different smoothness assumptions. Through simulation, we show that GPCDL learns smooth dictionaries with better accuracy than the unregularized alternative across a range of SNRs. Through an application to neural spiking data, we show that GPCDL learns a more accurate and visually-interpretable smooth dictionary, leading to superior predictive performance compared to non-regularized CDL, as well as parametric alternatives.

Submitted 24 November, 2021; v1 submitted 28 March, 2021; originally announced April 2021.

Comments: IEEE Signal Processing Letters (2021)

arXiv:2102.07003 [pdf, other]

On the convergence of group-sparse autoencoders

Authors: Emmanouil Theodosis, Bahareh Tolooshams, Pranay Tankala, Abiy Tasissa, Demba Ba

Abstract: Recent approaches in the theoretical analysis of model-based deep learning architectures have studied the convergence of gradient descent in shallow ReLU networks that arise from generative models whose hidden layers are sparse. Motivated by the success of architectures that impose structured forms of sparsity, we introduce and study a group-sparse autoencoder that accounts for a variety of generative models, and utilizes a group-sparse ReLU activation function to force the non-zero units at a given layer to occur in blocks. For clustering models, inputs that result in the same group of active units belong to the same cluster. We proceed to analyze the gradient dynamics of a shallow instance of the proposed autoencoder, trained with data adhering to a group-sparse generative model. In this setting, we theoretically prove the convergence of the network parameters to a neighborhood of the generating matrix. We validate our model through numerical analysis and highlight the superior performance of networks with a group-sparse ReLU compared to networks that utilize traditional ReLUs, both in sparse coding and in parameter recovery tasks. We also provide real data experiments to corroborate the simulated results, and emphasize the clustering capabilities of structured sparsity models.

Submitted 21 January, 2022; v1 submitted 13 February, 2021; originally announced February 2021.

arXiv:2012.02134 [pdf, other]

K-Deep Simplex: Deep Manifold Learning via Local Dictionaries

Authors: Pranay Tankala, Abiy Tasissa, James M. Murphy, Demba Ba

Abstract: We propose K-Deep Simplex (KDS) which, given a set of data points, learns a dictionary comprising synthetic landmarks, along with representation coefficients supported on a simplex. KDS integrates manifold learning and sparse coding/dictionary learning: reconstruction term, as in classical dictionary learning, and a novel local weighted $\ell_1$ penalty that encourages each data point to represent itself as a convex combination of nearby landmarks. We solve the proposed optimization program using alternating minimization and design an efficient, interpretable autoencoder using algorithm unrolling. We theoretically analyze the proposed program by relating the weighted $\ell_1$ penalty in KDS to a weighted $\ell_0$ program. Assuming that the data are generated from a Delaunay triangulation, we prove the equivalence of the weighted $\ell_1$ and weighted $\ell_0$ programs. If the representation coefficients are given, we prove that the resulting dictionary is unique. Further, we show that low-dimensional representations can be efficiently obtained from the covariance of the coefficient matrix. We apply KDS to the unsupervised clustering problem and prove theoretical performance guarantees. Experiments show that the algorithm is highly efficient and performs competitively on synthetic and real data sets.

Submitted 14 January, 2023; v1 submitted 3 December, 2020; originally announced December 2020.

Comments: 14 pages, 8 figures

arXiv:2010.11391 [pdf, ps, other]

Unfolding Neural Networks for Compressive Multichannel Blind Deconvolution

Authors: Bahareh Tolooshams, Satish Mulleti, Demba Ba, Yonina C. Eldar

Abstract: We propose a learned-structured unfolding neural network for the problem of compressive sparse multichannel blind-deconvolution. In this problem, each channel's measurements are given as convolution of a common source signal and sparse filter. Unlike prior works where the compression is achieved either through random projections or by applying a fixed structured compression matrix, this paper proposes to learn the compression matrix from data. Given the full measurements, the proposed network is trained in an unsupervised fashion to learn the source and estimate sparse filters. Then, given the estimated source, we learn a structured compression operator while optimizing for signal reconstruction and sparse filter recovery. The efficient structure of the compression allows its practical hardware implementation. The proposed neural network is an autoencoder constructed based on an unfolding approach: upon training, the encoder maps the compressed measurements into an estimate of sparse filters using the compression operator and the source, and the linear convolutional decoder reconstructs the full measurements. We demonstrate that our method is superior to classical structured compressive sparse multichannel blind-deconvolution methods in terms of accuracy and speed of sparse filter recovery.

Submitted 11 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: Accepted to 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021)

arXiv:2006.09534 [pdf, other]

Towards improving discriminative reconstruction via simultaneous dense and sparse coding

Authors: Abiy Tasissa, Emmanouil Theodosis, Bahareh Tolooshams, Demba Ba

Abstract: Discriminative features extracted from the sparse coding model have been shown to perform well for classification. Recent deep learning architectures have further improved reconstruction in inverse problems by considering new dense priors learned from data. We propose a novel dense and sparse coding model that integrates both representation capability and discriminative features. The model studies the problem of recovering a dense vector $\mathbf{x}$ and a sparse vector $\mathbf{u}$ given measurements of the form $\mathbf{y} = \mathbf{A}\mathbf{x}+\mathbf{B}\mathbf{u}$. Our first analysis proposes a geometric condition based on the minimal angle between spanning subspaces corresponding to the matrices $\mathbf{A}$ and $\mathbf{B}$ that guarantees unique solution to the model. The second analysis shows that, under mild assumptions, a convex program recovers the dense and sparse components. We validate the effectiveness of the model on simulated data and propose a dense and sparse autoencoder (DenSaE) tailored to learning the dictionaries from the dense and sparse model. We demonstrate that (i) DenSaE denoises natural images better than architectures derived from the sparse coding model ($\mathbf{B}\mathbf{u}$), (ii) in the presence of noise, training the biases in the latter amounts to implicitly learning the $\mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}$ model, (iii) $\mathbf{A}$ and $\mathbf{B}$ capture low- and high-frequency contents, respectively, and (iv) compared to the sparse coding model, DenSaE offers a balance between discriminative power and representation.

Submitted 13 December, 2022; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 24 pages

arXiv:2005.02325 [pdf]

Digraph of Senegal's local languages: issues, challenges and prospects of their transliteration

Authors: Elhadji Mamadou Nguer, Diop Sokhna Bao, Yacoub Ahmed Fall, Mouhamadou Khoule

Abstract: The local languages in Senegal, like those of West African countries in general, are written based on two alphabets: supplemented Arabic alphabet (called Ajami) and Latin alphabet. Each writing has its own applications. Ajami writing is generally used by people educated in Koranic schools for communication, business, literature (religious texts, poetry, etc.), traditional religious medicine, etc. Writing with Latin characters is used for localization of ICT (Web, dictionaries, Windows and Google tools translated in Wolof, etc.), the translation of legal texts (commercial code and constitution translated in Wolof) and religious ones (Quran and Bible in Wolof), book edition, etc. To facilitate both populations' general access to knowledge, it is useful to set up transliteration tools between these two scripts. This work falls within the framework of the implementation of a project for a collaborative online Wolof dictionary (Nguer E. M., Khoule M, Thiam M. N., Mbaye B. T., Thiare O., Cisse M. T., Mangeot M. 2014), which will involve people using Ajami writing. Our goal will consist, on the one hand, in raising the issues related to the transliteration and the challenges that this will raise, and on the other, in presenting the perspectives.

Submitted 5 May, 2020; originally announced May 2020.

Journal ref: LTC 2015

arXiv:1910.14627 [pdf, other]

An Automatic Design Framework of Swarm Pattern Formation based on Multi-objective Genetic Programming

Authors: Zhun Fan, Zhaojun Wang, Xiaomin Zhu, Bingliang Hu, Anmin Zou, Dongwei Bao

Abstract: Most existing swarm pattern formation methods depend on a predefined gene regulatory network (GRN) structure that requires designers' prior knowledge, which is difficult to adapt to complex and changeable environments. To dynamically adapt to the complex and changeable environments, we propose an automatic design framework of swarm pattern formation based on multi-objective genetic programming. The proposed framework does not need to define the structure of the GRN-based model in advance, and it applies some basic network motifs to automatically structure the GRN-based model. In addition, multi-objective genetic programming (MOGP) is combined with NSGA-II, namely MOGP-NSGA-II, to balance the complexity and accuracy of the GRN-based model. In the evolutionary process, MOGP-NSGA-II and differential evolution (DE) are applied to optimize the structures and parameters of the GRN-based model in parallel. Simulation results demonstrate that the proposed framework can effectively evolve some novel GRN-based models, and these GRN-based models not only have a simpler structure and a better performance, but also are robust to the complex and changeable environments.

Submitted 1 November, 2019; v1 submitted 31 October, 2019; originally announced October 2019.

arXiv:1910.12727 [pdf]

Layer Pruning for Accelerating Very Deep Neural Networks

Authors: Weiwei Zhang, Changsheng Chen, Xuechun Wu, Jialin Gao, Di Bao, Jiwei Li, Xi Zhou

Abstract: In this paper, we propose an adaptive pruning method. This method can cut off the channel and layer adaptively. The proportion of the layer and the channel to be cut is learned adaptively. The pruning method proposed in this paper can reduce half of the parameters, and the accuracy will not decrease or even be higher than baseline.

Submitted 28 October, 2019; originally announced October 2019.

Comments: v2

arXiv:1908.09258 [pdf, other]

RandNet: deep learning with compressed measurements of images

Authors: Thomas Chang, Bahareh Tolooshams, Demba Ba

Abstract: Principal component analysis, dictionary learning, and auto-encoders are all unsupervised methods for learning representations from a large amount of training data. In all these methods, the higher the dimensions of the input data, the longer it takes to learn. We introduce a class of neural networks, termed RandNet, for learning representations using compressed random measurements of data of interest, such as images. RandNet extends the convolutional recurrent sparse auto-encoder architecture to dense networks and, more importantly, to the case when the input data are compressed random measurements of the original data. Compressing the input data makes it possible to fit a larger number of batches in memory during training. Moreover, in the case of sparse measurements, training is more efficient computationally. We demonstrate that, in unsupervised settings, RandNet performs dictionary learning using compressed data. In supervised settings, we show that RandNet can classify MNIST images with minimal loss in accuracy, despite being trained with random projections of the images that result in a 50% reduction in size. Overall, our results provide a general principled framework for training neural networks using compressed data.

Submitted 25 August, 2019; originally announced August 2019.

Comments: The first two authors contributed equally to this work

arXiv:1907.09881 [pdf, other]

Convolutional Dictionary Learning in Hierarchical Networks

Authors: Javier Zazo, Bahareh Tolooshams, Demba Ba

Abstract: Filter banks are a popular tool for the analysis of piecewise smooth signals such as natural images. Motivated by the empirically observed properties of scale and detail coefficients of images in the wavelet domain, we propose a hierarchical deep generative model of piecewise smooth signals that is a recursion across scales: the low pass scale coefficients at one layer are obtained by filtering the scale coefficients at the next layer, and adding a high pass detail innovation obtained by filtering a sparse vector. This recursion describes a linear dynamic system that is a non-Gaussian Markov process across scales and is closely related to multilayer-convolutional sparse coding (ML-CSC) generative model for deep networks, except that our model allows for deeper architectures, and combines sparse and non-sparse signal representations. We propose an alternating minimization algorithm for learning the filters in this hierarchical model given observations at layer zero, e.g., natural images. The algorithm alternates between a coefficient-estimation step and a filter update step. The coefficient update step performs sparse (detail) and smooth (scale) coding and, when unfolded, leads to a deep neural network. We use MNIST to demonstrate the representation capabilities of the model, and its derived features (coefficients) for classification.

Submitted 23 July, 2019; originally announced July 2019.

arXiv:1907.09063 [pdf, other]

doi 10.1109/TSP.2020.2986897

Fast Convolutional Dictionary Learning off the Grid

Authors: Andrew H. Song, Francisco J. Flores, Demba Ba

Abstract: Given a continuous-time signal that can be modeled as the superposition of localized, time-shifted events from multiple sources, the goal of Convolutional Dictionary Learning (CDL) is to identify the location of the events--by Convolutional Sparse Coding (CSC)--and learn the template for each source--by Convolutional Dictionary Update (CDU). In practice, because we observe samples of the continuous-time signal on a uniformly-sampled grid in discrete time, classical CSC methods can only produce estimates of the times when the events occur on this grid, which degrades the performance of the CDU. We introduce a CDL framework that significantly reduces the errors arising from performing the estimation in discrete time. Specifically, we construct an expanded dictionary that comprises, not only discrete-time shifts of the templates, but also interpolated variants, obtained by bandlimited interpolation, that account for continuous-time shifts. For CSC, we develop a novel computationally efficient CSC algorithm, termed Convolutional Orthogonal Matching Pursuit with interpolated dictionary (COMP-INTERP). We benchmarked COMP-INTERP to Continuous Basis Pursuit (CBP), the state-of-the-art CSC algorithm for estimating off-the-grid events, and demonstrate, on simulated data, that 1) COMP-INTERP achieves a similar level of accuracy, and 2) is two orders of magnitude faster. For CDU, we derive a novel procedure to update the templates given sparse codes that can occur both on and off the discrete-time grid. We also show that 3) dictionary update with the overcomplete dictionary yields more accurate templates. Finally, we apply the algorithms to the spike sorting problem on electrophysiology recording and show their competitive performance.

Submitted 21 July, 2019; originally announced July 2019.

Journal ref: IEEE Transactions on Signal Processing 2020

arXiv:1907.03211 [pdf, other]

Convolutional dictionary learning based auto-encoders for natural exponential-family distributions

Authors: Bahareh Tolooshams, Andrew H. Song, Simona Temereanca, Demba Ba

Abstract: We introduce a class of auto-encoder neural networks tailored to data from the natural exponential family (e.g., count data). The architectures are inspired by the problem of learning the filters in a convolutional generative model with sparsity constraints, often referred to as convolutional dictionary learning (CDL). Our work is the first to combine ideas from convolutional generative models and deep learning for data that are naturally modeled with a non-Gaussian distribution (e.g., binomial and Poisson). This perspective provides us with a scalable and flexible framework that can be re-purposed for a wide range of tasks and assumptions on the generative model. Specifically, the iterative optimization procedure for solving CDL, an unsupervised task, is mapped to an unfolded and constrained neural network, with iterative adjustments to the inputs to account for the generative distribution. We also show that the framework can easily be extended for discriminative training, appropriate for a supervised task. We demonstrate 1) that fitting the generative model to learn, in an unsupervised fashion, the latent stimulus that underlies neural spiking data leads to better goodness-of-fit compared to other baselines, 2) competitive performance compared to state-of-the-art algorithms for supervised Poisson image denoising, with significantly fewer parameters, and 3) gradient dynamics of shallow binomial auto-encoder.

Submitted 28 June, 2020; v1 submitted 6 July, 2019; originally announced July 2019.

Journal ref: International Conference on Machine Learning (ICML) 2020

arXiv:1904.08827 [pdf, other]

doi 10.1109/TNNLS.2020.3005348

Deep Residual Autoencoders for Expectation Maximization-inspired Dictionary Learning

Authors: Bahareh Tolooshams, Sourav Dey, Demba Ba

Abstract: We introduce a neural-network architecture, termed the constrained recurrent sparse autoencoder (CRsAE), that solves convolutional dictionary learning problems, thus establishing a link between dictionary learning and neural networks. Specifically, we leverage the interpretation of the alternating-minimization algorithm for dictionary learning as an approximate Expectation-Maximization algorithm to develop autoencoders that enable the simultaneous training of the dictionary and regularization parameter (ReLU bias). The forward pass of the encoder approximates the sufficient statistics of the E-step as the solution to a sparse coding problem, using an iterative proximal gradient algorithm called FISTA. The encoder can be interpreted either as a recurrent neural network or as a deep residual network, with two-sided ReLU non-linearities in both cases. The M-step is implemented via a two-stage back-propagation. The first stage relies on a linear decoder applied to the encoder and a norm-squared loss. It parallels the dictionary update step in dictionary learning. The second stage updates the regularization parameter by applying a loss function to the encoder that includes a prior on the parameter motivated by Bayesian statistics. We demonstrate in an image-denoising task that CRsAE learns Gabor-like filters, and that the EM-inspired approach for learning biases is superior to the conventional approach. In an application to recordings of electrical activity from the brain, we demonstrate that CRsAE learns realistic spike templates and speeds up the process of identifying spike times by 900x compared to algorithms based on convex optimization.

Submitted 18 October, 2020; v1 submitted 18 April, 2019; originally announced April 2019.

Journal ref: in IEEE Transactions on Neural Networks and Learning Systems, pp. 1-15, 2020

arXiv:1810.09920 [pdf, other]

Clustering Time Series with Nonlinear Dynamics: A Bayesian Non-Parametric and Particle-Based Approach

Authors: Alexander Lin, Yingzhuo Zhang, Jeremy Heng, Stephen A. Allsop, Kay M. Tye, Pierre E. Jacob, Demba Ba

Abstract: We propose a general statistical framework for clustering multiple time series that exhibit nonlinear dynamics into an a-priori-unknown number of sub-groups. Our motivation comes from neuroscience, where an important problem is to identify, within a large assembly of neurons, subsets that respond similarly to a stimulus or contingency. Upon modeling the multiple time series as the output of a Dirichlet process mixture of nonlinear state-space models, we derive a Metropolis-within-Gibbs algorithm for full Bayesian inference that alternates between sampling cluster assignments and sampling parameter values that form the basis of the clustering. The Metropolis step employs recent innovations in particle-based methods. We apply the framework to clustering time series acquired from the prefrontal cortex of mice in an experiment designed to characterize the neural underpinnings of fear.

Submitted 4 March, 2019; v1 submitted 23 October, 2018; originally announced October 2018.

Journal ref: International Conference on Artificial Intelligence and Statistics (AISTATS 2019)

arXiv:1810.02906 [pdf, other]

Network Distance Based on Laplacian Flows on Graphs

Authors: Dianbin Bao, Kisung You, Lizhen Lin

Abstract: Distance plays a fundamental role in measuring similarity between objects. Various visualization techniques and learning tasks in statistics and machine learning such as shape matching, classification, dimension reduction and clustering often rely on some distance or similarity measure. It is of tremendous importance to have a distance that can incorporate the underlying structure of the object. In this paper, we focus on proposing such a distance between network objects. Our key insight is to define a distance based on the long term diffusion behavior of the whole network. We first introduce a dynamic system on graphs called Laplacian flow. Based on this Laplacian flow, a new version of diffusion distance between networks is proposed. We will demonstrate the utility of the distance and its advantage over various existing distances through explicit examples. The distance is also applied to subsequent learning tasks such as clustering network objects.

Submitted 5 October, 2018; originally announced October 2018.

arXiv:1807.04734 [pdf, other]

Scalable Convolutional Dictionary Learning with Constrained Recurrent Sparse Auto-encoders

Authors: Bahareh Tolooshams, Sourav Dey, Demba Ba

Abstract: Given a convolutional dictionary underlying a set of observed signals, can a carefully designed auto-encoder recover the dictionary in the presence of noise? We introduce an auto-encoder architecture, termed constrained recurrent sparse auto-encoder (CRsAE), that answers this question in the affirmative. Given an input signal and an approximate dictionary, the encoder finds a sparse approximation using FISTA. The decoder reconstructs the signal by applying the dictionary to the output of the encoder. The encoder and decoder in CRsAE parallel the sparse-coding and dictionary update steps in optimization-based alternating-minimization schemes for dictionary learning. As such, the parameters of the encoder and decoder are not independent, a constraint which we enforce for the first time. We derive the back-propagation algorithm for CRsAE. CRsAE is a framework for blind source separation that, only knowing the number of sources (dictionary elements), and assuming sparsely-many can overlap, is able to separate them. We demonstrate its utility in the context of spike sorting, a source separation problem in computational neuroscience. We demonstrate the ability of CRsAE to recover the underlying dictionary and characterize its sensitivity as a function of SNR.

Submitted 12 July, 2018; originally announced July 2018.

arXiv:1807.01958 [pdf, other]

Deeply-Sparse Signal rePresentations ($\text{D}\text{S}^2\text{P}$)

Authors: Demba Ba

Abstract: A recent line of work shows that a deep neural network with ReLU nonlinearities arises from a finite sequence of cascaded sparse coding models, the outputs of which, except for the last element in the cascade, are sparse and unobservable. That is, intermediate outputs deep in the cascade are sparse, hence the title of this manuscript. We show here, using techniques from the dictionary learning literature, that if the measurement matrices in the cascaded sparse coding model (a) satisfy RIP and (b) all have sparse columns except for the last, they can be recovered with high probability. We propose two algorithms for this purpose: one that recovers the matrices in a forward sequence, and another that recovers them in a backward sequence. The method of choice in deep learning to solve this problem is by training an auto-encoder. Our algorithms provide a sound alternative, with theoretical guarantees, as well as upper bounds on sample complexity. The theory shows that the learning complexity of the forward algorithm depends on the number of hidden units at the deepest layer and the number of active neurons at that layer (sparsity). In addition, the theory relates the number of hidden units in successive layers, thus giving a practical prescription for designing deep ReLU neural networks. Because it puts fewer restrictions on the architecture, the backward algorithm requires more data. We demonstrate the deep dictionary learning algorithm via simulations. Finally, we use a coupon-collection argument to conjecture a lower bound on sample complexity that gives some insight as to why deep networks require more data to train than shallow ones.

Submitted 24 April, 2020; v1 submitted 5 July, 2018; originally announced July 2018.

arXiv:1805.07300 [pdf, other]

Multitaper Spectral Estimation HDP-HMMs for EEG Sleep Inference

Authors: Leon Chlon, Andrew Song, Sandya Subramanian, Hugo Soulat, John Tauber, Demba Ba, Michael Prerau

Abstract: Electroencephalographic (EEG) monitoring of neural activity is widely used for sleep disorder diagnostics and research. The standard of care is to manually classify 30-second epochs of EEG time-domain traces into 5 discrete sleep stages. Unfortunately, this scoring process is subjective and time-consuming, and the defined stages do not capture the heterogeneous landscape of healthy and clinical neural dynamics. This motivates the search for a data-driven and principled way to identify the number and composition of salient, reoccurring brain states present during sleep. To this end, we propose a Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), combined with wide-sense stationary (WSS) time series spectral estimation to construct a generative model for personalized subject sleep states. In addition, we employ multitaper spectral estimation to further reduce the large variance of the spectral estimates inherent to finite-length EEG measurements. By applying our method to both simulated and human sleep data, we arrive at three main results: 1) a Bayesian nonparametric automated algorithm that recovers general temporal dynamics of sleep, 2) identification of subject-specific "microstates" within canonical sleep stages, and 3) discovery of stage-dependent sub-oscillations with shared spectral signatures across subjects.

Submitted 18 May, 2018; originally announced May 2018.

arXiv:1710.01821 [pdf, other]

Classification of Local Field Potentials using Gaussian Sequence Model

Authors: Taposh Banerjee, John Choi, Bijan Pesaran, Demba Ba, Vahid Tarokh

Abstract: A problem of classification of local field potentials (LFPs), recorded from the prefrontal cortex of a macaque monkey, is considered. An adult macaque monkey is trained to perform a memory-based saccade. The objective is to decode the eye movement goals from the LFP collected during a memory period. The LFP classification problem is modeled as that of classification of smooth functions embedded in Gaussian noise. It is then argued that using minimax function estimators as features would lead to consistent LFP classifiers. The theory of Gaussian sequence models allows us to represent minimax estimators as finite dimensional objects. The LFP classifier resulting from this mathematical endeavor is a spectrum based technique, where Fourier series coefficients of the LFP data, followed by appropriate shrinkage and thresholding, are used as features in a linear discriminant classifier. The classifier is then applied to the LFP data to achieve high decoding accuracy. The function classification approach taken in the paper also provides a systematic justification for using Fourier series, with shrinkage and thresholding, as features for the problem, as opposed to using the power spectrum. It also suggests that phase information is crucial to the decision making.

Submitted 27 November, 2017; v1 submitted 4 October, 2017; originally announced October 2017.

arXiv:1709.09723 [pdf, other]

Estimating a Separably-Markov Random Field (SMuRF) from Binary Observations

Authors: Yingzhuo Zhang, Noa Malem-Shinitski, Stephen A Allsop, Kay Tye, Demba Ba

Abstract: A fundamental problem in neuroscience is to characterize the dynamics of spiking from the neurons in a circuit that is involved in learning about a stimulus or a contingency. A key limitation of current methods to analyze neural spiking data is the need to collapse neural activity over time or trials, which may cause the loss of information pertinent to understanding the function of a neuron or circuit. We introduce a new method that can determine not only the trial-to-trial dynamics that accompany the learning of a contingency by a neuron, but also the latency of this learning with respect to the onset of a conditioned stimulus. The backbone of the method is a separable two-dimensional (2D) random field (RF) model of neural spike rasters, in which the joint conditional intensity function of a neuron over time and trials depends on two latent Markovian state sequences that evolve separately but in parallel. Classical tools to estimate state-space models cannot be applied readily to our 2D separable RF model. We develop efficient statistical and computational tools to estimate the parameters of the separable 2D RF model. We apply these to data collected from neurons in the pre-frontal cortex (PFC) in an experiment designed to characterize the neural underpinnings of the associative learning of fear in mice. Overall, the separable 2D RF model provides a detailed, interpretable, characterization of the dynamics of neural spiking that accompany the learning of a contingency.

Submitted 27 September, 2017; originally announced September 2017.

arXiv:1709.04631 [pdf, other]

Empirical Evaluation of Mutation-based Test Prioritization Techniques

Authors: Donghwan Shin, Shin Yoo, Mike Papadakis, Doo-Hwan Bae

Abstract: We propose a new test case prioritization technique that combines both mutation-based and diversity-based approaches. Our diversity-aware mutation-based technique relies on the notion of mutant distinguishment, which aims to distinguish one mutant's behavior from another, rather than from the original program. We empirically investigate the relative cost and effectiveness of the mutation-based prioritization techniques (i.e., using both the traditional mutant kill and the proposed mutant distinguishment) with 352 real faults and 553,477 developer-written test cases. The empirical evaluation considers both the traditional and the diversity-aware mutation criteria in various settings: single-objective greedy, hybrid, and multi-objective optimization. The results show that there is no single dominant technique across all the studied faults. To this end, we show when and why each of the mutation-based prioritization criteria performs poorly, using a graphical model called Mutant Distinguishment Graph (MDG) that demonstrates the distribution of the fault detecting test cases with respect to mutant kills and distinguishment.

Submitted 23 January, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

arXiv:1601.06466 [pdf, other]

A Theoretical Framework for Understanding Mutation-Based Testing Methods

Authors: Donghwan Shin, Doo-Hwan Bae

Abstract: In the field of mutation analysis, mutation is the systematic generation of mutated programs (i.e., mutants) from an original program. The concept of mutation has been widely applied to various testing problems, including test set selection, fault localization, and program repair. However, surprisingly little focus has been given to the theoretical foundation of mutation-based testing methods, making it difficult to understand, organize, and describe various mutation-based testing methods. This paper aims to consider a theoretical framework for understanding mutation-based testing methods. While there is a solid testing framework for general testing, this is incongruent with mutation-based testing methods, because it focuses on the correctness of a program for a test, while the essence of mutation-based testing concerns the differences between programs (including mutants) for a test. In this paper, we begin the construction of our framework by defining a novel testing factor, called a test differentiator, to transform the paradigm of testing from the notion of correctness to the notion of difference. We formally define behavioral differences of programs for a set of tests as a mathematical vector, called a d-vector. We explore the multi-dimensional space represented by d-vectors, and provide a graphical model for describing the space. Based on our framework and formalization, we interpret existing mutation-based fault localization methods and mutant set minimization as applications, and identify novel implications for future work.

Submitted 24 January, 2016; originally announced January 2016.

Comments: To appear in ICST 2016

ACM Class: D.2.5, F.1.0

arXiv:1106.0365 [pdf, ps, other]

Lower Bounds for Sparse Recovery

Authors: Khanh Do Ba, Piotr Indyk, Eric Price, David P. Woodruff

Abstract: We consider the following k-sparse recovery problem: design an m x n matrix A, such that for any signal x, given Ax we can efficiently recover x' satisfying ||x-x'||_1 <= C min_{k-sparse x"} ||x-x"||_1. It is known that there exist matrices A with this property that have only O(k log (n/k)) rows. In this paper we show that this bound is tight. Our bound holds even for the more general /randomized/ version of the problem, where A is a random variable and the recovery algorithm is required to work for any fixed x with constant probability (over A).

Submitted 2 June, 2011; v1 submitted 2 June, 2011; originally announced June 2011.

Comments: 11 pages. Appeared at SODA 2010

arXiv:0904.0292 [pdf, ps, other]

Sublinear Time Algorithms for Earth Mover's Distance

Authors: Khanh Do Ba, Huy L Nguyen, Huy N Nguyen, Ronitt Rubinfeld

Abstract: We study the problem of estimating the Earth Mover's Distance (EMD) between probability distributions when given access only to samples. We give closeness testers and additive-error estimators over domains in $[0, Δ]^d$, with sample complexities independent of domain size - permitting the testability even of continuous distributions over infinite domains. Instead, our algorithms depend on other parameters, such as the diameter of the domain space, which may be significantly smaller. We also prove lower bounds showing the dependencies on these parameters to be essentially optimal. Additionally, we consider whether natural classes of distributions exist for which there are algorithms with better dependence on the dimension, and show that for highly clusterable data, this is indeed the case. Lastly, we consider a variant of the EMD, defined over tree metrics instead of the usual L1 metric, and give optimal algorithms.

Submitted 1 April, 2009; originally announced April 2009.

Comments: 12 pages

Showing 1–45 of 45 results for author: Bae, D