How Clustering Affects the Convergence of Decentralized Optimization over Networks: A Monte-Carlo-based Approach

Authors: Mohammadreza Doostmohammadian, Shahaboddin Kharazmi, Hamid R. Rabiee

Abstract: Decentralized algorithms have gained substantial interest owing to advancements in cloud computing, Internet of Things (IoT), intelligent transportation networks, and parallel processing over sensor networks. The convergence of such algorithms is directly related to specific properties of the underlying network topology. Specifically, the clustering coefficient is known to affect, for example, the controllability/observability and the epidemic growth over networks. In this work, we study the effects of the clustering coefficient on the convergence rate of networked optimization approaches. In this regard, we model the structure of large-scale distributed systems by random scale-free (SF) and clustered scale-free (CSF) networks and compare the convergence rate by tuning the network clustering coefficient. This is done by keeping other relevant network properties (such as power-law degree distribution, number of links, and average degree) unchanged. Monte-Carlo-based simulations are used to compare the convergence rate over many trials of SF graph topologies. Furthermore, to study the convergence rate over real case studies, we compare the clustering coefficient of some real-world networks with the eigenspectrum of the underlying network (as a measure of convergence rate). The results interestingly show a higher convergence rate over low-clustered networks. This is significant as one can improve the learning rate of many existing decentralized machine-learning scenarios by tuning the network clustering.
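
Illustration (not taken from the paper): the clustering coefficient named in the abstract above is a standard graph quantity. The sketch below computes the average local clustering coefficient for a small undirected graph. The adjacency-dictionary representation, the function names, and the toy graph are assumptions made for this example; the paper's SF/CSF generators and its eigenspectrum analysis are not reproduced here.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        # Common convention: nodes with fewer than two neighbors count as 0.
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2.
toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
avg = average_clustering(toy_graph)
```

For this toy graph the local coefficients are 1, 1, 1/3, and 0, so the average is 7/12.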

Submitted 1 July, 2024; originally announced July 2024.

Comments: SNAM Journal

arXiv:2406.05279 [pdf, other]

SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings

Authors: MohammadAli SadraeiJavaeri, Ehsaneddin Asgari, Alice Carolyn McHardy, Hamid Reza Rabiee

Abstract: Soft prompt tuning techniques have recently gained traction as an effective strategy for the parameter-efficient tuning of pretrained language models, particularly minimizing the required adjustment of model parameters. Despite their growing use, achieving optimal tuning with soft prompts, especially for smaller datasets, remains a substantial challenge. This study makes two contributions in this domain: (i) we introduce SuperPos-Prompt, a new reparameterization technique employing the superposition of multiple pretrained vocabulary embeddings to improve the learning of soft prompts. Our experiments across several GLUE and SuperGLUE benchmarks consistently highlight SuperPos-Prompt's superiority over Residual Prompt tuning, exhibiting an average score increase of $+6.4$ in T5-Small and $+5.0$ in T5-Base along with a faster convergence. Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods. (ii) Additionally, we demonstrate enhanced performance and rapid convergence by omitting dropouts from the frozen network, yielding consistent improvements across various scenarios and tuning methods.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.00249 [pdf, other]

Privacy Challenges in Meta-Learning: An Investigation on Model-Agnostic Meta-Learning

Authors: Mina Rafiei, Mohammadmahdi Maheri, Hamid R. Rabiee

Abstract: Meta-learning involves multiple learners, each dedicated to specific tasks, collaborating in a data-constrained setting. In current meta-learning methods, task learners locally learn models from sensitive data, termed support sets. These task learners subsequently share model-related information, such as gradients or loss values, which is computed using another part of the data termed query set, with a meta-learner. The meta-learner employs this information to update its meta-knowledge. Despite the absence of explicit data sharing, privacy concerns persist. This paper examines potential data leakage in a prominent meta-learning algorithm, specifically Model-Agnostic Meta-Learning (MAML). In MAML, gradients are shared between the meta-learner and task learners. The primary objective is to scrutinize the gradient and the information it encompasses about the task dataset. Subsequently, we endeavor to propose membership inference attacks targeting the task dataset containing support and query sets. Finally, we explore various noise injection methods designed to safeguard the privacy of task data and thwart potential attacks. Experimental results demonstrate the effectiveness of these attacks on MAML and the efficacy of proper noise injection methods in countering them.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.08031

HGTDR: Advancing Drug Repurposing with Heterogeneous Graph Transformers

Authors: Ali Gharizadeh, Karim Abbasi, Amin Ghareyazi, Mohammad R. K. Mofrad, Hamid R. Rabiee

Abstract: Motivation: Drug repurposing is a viable solution for reducing the time and cost associated with drug development. However, thus far, the proposed drug repurposing approaches still need to meet expectations. Therefore, it is crucial to offer a systematic approach for drug repurposing to achieve cost savings and enhance human lives. In recent years, using biological network-based methods for drug repurposing has generated promising results. Nevertheless, these methods have limitations. Primarily, the scope of these methods is generally limited concerning the size and variety of data they can effectively handle. Another issue arises from the treatment of heterogeneous data, which needs to be addressed or converted into homogeneous data, leading to a loss of information. A significant drawback is that most of these approaches lack end-to-end functionality, necessitating manual implementation and expert knowledge in certain stages. Results: We propose a new solution, HGTDR (Heterogeneous Graph Transformer for Drug Repurposing), to address the challenges associated with drug repurposing. HGTDR is a three-step approach for knowledge graph-based drug repurposing: 1) constructing a heterogeneous knowledge graph, 2) utilizing a heterogeneous graph transformer network, and 3) computing relationship scores using a fully connected network. By leveraging HGTDR, users gain the ability to manipulate input graphs, extract information from diverse entities, and obtain their desired output.
In the evaluation step, we demonstrate that HGTDR performs comparably to previous methods. Furthermore, we review medical studies to validate our method's top ten drug repurposing suggestions, which have exhibited promising results. We also demonstrated HGTDR's capability to predict other types of relations through numerical and experimental validation, such as drug-protein and disease-protein inter-relations.

Submitted 18 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

Comments: The paper has been archived without having permission from all authors. Please withdraw

arXiv:2405.07452

PLA-SGCN: Protein-Ligand Binding Affinity Prediction by Integrating Similar Pairs and Semi-supervised Graph Convolutional Network

Authors: Karim Abbasi, Parvin Razzaghi, Amin Ghareyazi, Hamid R. Rabiee

Abstract: The goal of protein-ligand binding affinity (PLA) prediction is to predict whether or not a ligand can bind to a protein sequence. Recently, in PLA prediction, deep learning has received much attention. Two steps are involved in deep learning-based approaches: feature extraction and task prediction. Many deep learning-based approaches concentrate on introducing new feature extraction networks or integrating auxiliary knowledge like protein-protein interaction networks or gene ontology knowledge. Then, a task prediction network is designed simply using some fully connected layers. This paper aims to integrate retrieved similar hard protein-ligand pairs in PLA prediction (i.e., task prediction step) using a semi-supervised graph convolutional network (GCN). Hard protein-ligand pairs are retrieved for each input query sample based on the manifold smoothness constraint. Then, a graph is learned automatically in which each node is a protein-ligand pair, and each edge represents the similarity between pairs. In other words, an end-to-end framework is proposed that simultaneously retrieves hard similar samples, learns the protein-ligand descriptor, learns the graph topology of the input sample with retrieved similar hard samples (i.e., learns the adjacency matrix), and learns a semi-supervised GCN to predict the binding affinity (as task predictor). The training step adjusts the parameter values, and in the inference step, the learned model is fine-tuned for each input sample.
To evaluate the proposed approach, it is applied to four well-known datasets: PDBbind, Davis, KIBA, and BindingDB. The results show that the proposed method performs significantly better than the comparable approaches.

Submitted 18 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

Comments: The paper has been archived without permission from all authors. Please withdraw

arXiv:2403.03018 [pdf, other]

CRISPR: Ensemble Model

Authors: Mohammad Rostami, Amin Ghariyazi, Hamed Dashti, Mohammad Hossein Rohban, Hamid R. Rabiee

Abstract: Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a gene editing technology that has revolutionized the fields of biology and medicine. However, one of the challenges of using CRISPR is predicting the on-target efficacy and off-target sensitivity of single-guide RNAs (sgRNAs). This is because most existing methods are trained on separate datasets with different genes and cells, which limits their generalizability. In this paper, we propose a novel ensemble learning method for sgRNA design that is accurate and generalizable. Our method combines the predictions of multiple machine learning models to produce a single, more robust prediction. This approach allows us to learn from a wider range of data, which improves the generalizability of our model. We evaluated our method on a benchmark dataset of sgRNA designs and found that it outperformed existing methods in terms of both accuracy and generalizability. Our results suggest that our method can be used to design sgRNAs with high sensitivity and specificity, even for new genes or cells. This could have important implications for the clinical use of CRISPR, as it would allow researchers to design more effective and safer treatments for a variety of diseases.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2311.15227 [pdf]

Epidemic modeling and flattening the infection curve in social networks

Authors: Mohammadreza Doostmohammadian, Soraya Doustmohamadian, Najmeh Doostmohammadian, Azam Doustmohammadian, Houman Zarrabi, Hamid R. Rabiee

Abstract: The main goal of this paper is to model epidemics and flatten the infection curve in social networks. Flattening the infection curve implies slowing down the spread of the disease and reducing the infection rate via social distancing, isolation (quarantine), and vaccination. Non-pharmaceutical methods are a much simpler and more efficient way to control the spread of an epidemic and the infection rate. By specifying a target group with high centrality for isolation and quarantine, one can reach a much flatter infection curve (related to Corona, for example) without adding extra costs to health services. The aim of this research is, first, to model the epidemic and, then, to give strategies and structural algorithms for targeted vaccination or targeted non-pharmaceutical methods for reducing the peak of the viral disease and flattening the infection curve. These methods are more efficient for non-pharmaceutical interventions, as finding the target quarantine group flattens the infection curve much more easily. For this purpose, a small number of particular nodes with high centrality are isolated and the infection curve is analyzed. Our research shows meaningful results for flattening the infection curve only by isolating a small number of targeted nodes in the social network. The proposed methods are independent of the type of the disease and are effective for any viral disease, e.g., Covid-19.

Submitted 26 November, 2023; originally announced November 2023.

Comments: in Persian language. Journal of Modelling in Engineering 2023

arXiv:2310.18225 [pdf, other]

Distributed Delay-Tolerant Strategies for Equality-Constraint Sum-Preserving Resource Allocation

Authors: Mohammadreza Doostmohammadian, Alireza Aghasi, Maria Vrakopoulou, Hamid R. Rabiee, Usman A. Khan, Themistoklis Charalambous

Abstract: This paper proposes two nonlinear dynamics to solve the constrained distributed optimization problem for resource allocation over a multi-agent network. In this setup, the coupling constraint refers to the resource-demand balance, which is preserved at all times. The proposed solutions can address various model nonlinearities, for example, due to quantization and/or saturation. Further, they allow reaching faster convergence or robustifying the solution against impulsive noise or uncertainties. We prove convergence over weakly connected networks using convex analysis and Lyapunov theory. Our findings show that convergence can be reached for general sign-preserving odd nonlinearity. We further propose delay-tolerant mechanisms to handle general bounded heterogeneous time-varying delays over the communication network of agents while preserving all-time feasibility. This work finds application in CPU scheduling and coverage control, among others. This paper advances the state-of-the-art by addressing (i) possible nonlinearity on the agents/links, meanwhile handling (ii) resource-demand feasibility at all times, (iii) uniform-connectivity instead of all-time connectivity, and (iv) possible heterogeneous and time-varying delays. To the best of our knowledge, no existing work addresses contributions (i)-(iv) altogether. Simulations and comparative analysis are provided to corroborate our contributions.

Submitted 27 October, 2023; originally announced October 2023.

Journal ref: SCL 2023

arXiv:2310.12594 [pdf, other]

Infection Curve Flattening via Targeted Interventions and Self-Isolation

Authors: Mohammadreza Doostmohammadian, Houman Zarrabi, Azam Doustmohammadian, Hamid R. Rabiee

Abstract: Understanding the impact of network clustering and small-world properties on epidemic spread can be crucial in developing effective strategies for managing and controlling infectious diseases. Particularly in this work, we study the impact of these network features on targeted intervention (e.g., self-isolation and quarantine). The targeted individuals for self-isolation are based on centrality measures and node influence metrics. Compared to our previous works on scale-free networks, small-world networks are considered in this paper. Small-world networks resemble real-world social and human networks. In this type of network, most nodes are not directly connected but can be reached through a few intermediaries (known as the small-worldness property). Real social networks, such as friendship networks, also exhibit this small-worldness property, where most people are connected through a relatively small number of intermediaries. We particularly study the epidemic curve flattening by centrality-based interventions/isolation over small-world networks. Our results show that high clustering while having low small-worldness (higher shortest path characteristics) implies flatter infection curves. In reality, a flatter infection curve implies that the number of new cases of a disease is spread out over a longer period of time, rather than a sharp and sudden increase in cases (a peak in epidemic). In turn, this reduces the strain on healthcare resources and helps to relieve the healthcare services.

Submitted 19 October, 2023; originally announced October 2023.

Journal ref: SNAM 2023

arXiv:2310.04855 [pdf, other]

Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data

Authors: S. M. F. Sani, Seyed Abbas Hosseini, Hamid R. Rabiee

Abstract: Often, recommendation systems employ continuous training, leading to a self-feedback loop bias in which the system becomes biased toward its previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-arm bandit problem and emphasize exploring items that the model has a limited understanding of. We introduce a new offline sequential training schema that simulates real-world continuous training scenarios in recommendation systems, offering a more appropriate framework for studying self-feedback bias. We demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema.

Submitted 7 October, 2023; originally announced October 2023.

arXiv:2310.01696 [pdf, other]

DANI: Fast Diffusion Aware Network Inference with Preserving Topological Structure Property

Authors: Maryam Ramezani, Aryan Ahadinia, Erfan Farhadi, Hamid R. Rabiee

Abstract: The fast growth of social networks and their data access limitations in recent years have led to increasing difficulty in obtaining the complete topology of these networks. However, diffusion information over these networks is available, and many algorithms have been proposed to infer the underlying networks using this information. The previously proposed algorithms only focus on inferring more links and ignore preserving the critical topological characteristics of the underlying social networks. In this paper, we propose a novel method called DANI to infer the underlying network while preserving its structural properties. It is based on the Markov transition matrix derived from time series cascades, as well as the node-node similarity that can be observed in the cascade behavior from a structural point of view. In addition, the presented method has linear time complexity (increases linearly with the number of nodes, number of cascades, and square of the average length of cascades), and its distributed version in the MapReduce framework is also scalable. We applied the proposed approach to both real and synthetic networks. The experimental results showed that DANI has higher accuracy and lower run time than well-known network inference methods while maintaining structural properties, including modular structure, degree distribution, connected components, density, and clustering coefficients.

Submitted 2 October, 2023; originally announced October 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:1706.00941

arXiv:2307.13766 [pdf, other]

ClusterSeq: Enhancing Sequential Recommender Systems with Clustering based Meta-Learning

Authors: Mohammadmahdi Maheri, Reza Abdollahzadeh, Bardia Mohammadi, Mina Rafiei, Jafar Habibi, Hamid R. Rabiee

Abstract: In practical scenarios, the effectiveness of sequential recommendation systems is hindered by the user cold-start problem, which arises due to limited interactions for accurately determining user preferences. Previous studies have attempted to address this issue by combining meta-learning with user and item-side information. However, these approaches face inherent challenges in modeling user preference dynamics, particularly for "minor users" who exhibit distinct preferences compared to more common or "major users." To overcome these limitations, we present a novel approach called ClusterSeq, a Meta-Learning Clustering-Based Sequential Recommender System. ClusterSeq leverages dynamic information in the user sequence to enhance item prediction accuracy, even in the absence of side information. This model preserves the preferences of minor users without being overshadowed by major users, and it capitalizes on the collective knowledge of users within the same cluster. Extensive experiments conducted on various benchmark datasets validate the effectiveness of ClusterSeq. Empirical results consistently demonstrate that ClusterSeq outperforms several state-of-the-art meta-learning recommenders. Notably, compared to existing meta-learning methods, our proposed approach achieves a substantial improvement of 16-39% in Mean Reciprocal Rank (MRR).
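
Illustration (not taken from the paper): Mean Reciprocal Rank, the metric quoted in the abstract above, averages the reciprocal of the rank at which the true item appears in each recommendation list. The sketch below follows that standard definition. The function name, the convention of scoring a missing item as 0, and the toy rankings are assumptions for this example and say nothing about ClusterSeq's evaluation protocol.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of each target in its ranked list (0 if absent)."""
    total = 0.0
    for ranking, target in zip(ranked_lists, targets):
        if target in ranking:
            # list.index is 0-based, ranks are 1-based.
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(targets)


# Hypothetical toy data: the target "a" is ranked 1st, 2nd, and not at all.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "d"]]
targets = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, targets)  # (1 + 1/2 + 0) / 3 = 0.5
```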

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2303.09173 [pdf, other]

Network-based Control of Epidemic via Flattening the Infection Curve: High-Clustered vs. Low-Clustered Social Networks

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee

Abstract: Recent studies in network science and control have shown a meaningful relationship between the epidemic processes (e.g., COVID-19 spread) and some network properties. This paper studies how such network properties, namely clustering coefficient and centrality measures (or node influence metrics), affect the spread of viruses and the growth of epidemics over scale-free networks. The results can be used to target individuals (the nodes in the network) to flatten the infection curve. This so-called flattening of the infection curve is to reduce the health service costs and burden to the authorities/governments. Our Monte-Carlo simulation results show that it is, in general, easier to flatten the infection curve over clustered networks, i.e., with the same connectivity and the same number of isolated individuals they result in more flattened curves. Moreover, distance-based centrality measures, which target the nodes based on their average network distance to other nodes (and not the node degrees), are better choices for targeting individuals for isolation/vaccination.

Submitted 16 March, 2023; originally announced March 2023.

Comments: Published in Social network analysis and mining

arXiv:2212.03176 [pdf, other]

Domain Adaptation and Generalization on Functional Medical Images: A Systematic Survey

Authors: Gita Sarafraz, Armin Behnamnia, Mehran Hosseinzadeh, Ali Balapour, Amin Meghrazi, Hamid R. Rabiee

Abstract: Machine learning algorithms have revolutionized different fields, including natural language processing, computer vision, signal processing, and medical data processing. Despite the excellent capabilities of machine learning algorithms in various tasks and areas, the performance of these models mainly deteriorates when there is a shift in the test and training data distributions. This gap occurs due to the violation of the fundamental assumption that the training and test data are independent and identically distributed (i.i.d.). In real-world scenarios where collecting data from all possible domains for training is costly and even impossible, the i.i.d. assumption can hardly be satisfied. The problem is even more severe in the case of medical images and signals because it requires either expensive equipment or a meticulous experimentation setup to collect data, even for a single domain. Additionally, the decrease in performance may have severe consequences in the analysis of medical records. As a result of such problems, the ability to generalize and adapt under distribution shifts (domain generalization (DG) and domain adaptation (DA)) is essential for the analysis of medical data. This paper provides the first systematic review of DG and DA on functional brain signals to fill the gap of the absence of a comprehensive study in this area. We provide detailed explanations and categorizations of datasets, approaches, and architectures used in DG and DA on functional brain images. We further address the attention-worthy future tracks in this field.

Submitted 4 December, 2022; originally announced December 2022.

Comments: 41 pages, 8 figures

arXiv:2209.09681 [pdf, other]

doi 10.1371/journal.pone.0277887

SCGG: A Deep Structure-Conditioned Graph Generative Model

Authors: Faezeh Faez, Negin Hashemi Dijujin, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Abstract: Deep learning-based graph generation approaches have remarkable capacities for graph data modeling, allowing them to solve a wide range of real-world problems. Making these methods able to consider different conditions during the generation procedure even increases their effectiveness by empowering them to generate new graph samples that meet the desired criteria. This paper presents a conditional deep graph generation method called SCGG that considers a particular type of structural conditions. Specifically, our proposed SCGG model takes an initial subgraph and autoregressively generates new nodes and their corresponding edges on top of the given conditioning substructure. The architecture of SCGG consists of a graph representation learning network and an autoregressive generative model, which is trained end-to-end. Using this model, we can address graph completion, a rampant and inherently difficult problem of recovering missing nodes and their associated edges of partially observed graphs. Experimental results on both synthetic and real-world datasets demonstrate the superiority of our method compared with state-of-the-art baselines.

Submitted 20 September, 2022; originally announced September 2022.

arXiv:2209.07148 [pdf, ps, other]

Semi-supervised Batch Learning From Logged Data

Authors: Gholamali Aminian, Armin Behnamnia, Roberto Vega, Laura Toni, Chengchun Shi, Hamid R. Rabiee, Omar Rivasplata, Miguel R. D. Rodrigues

Abstract: Off-policy learning methods are intended to learn a policy from logged data, which includes context, action, and feedback (cost or reward) for each sample point. In this work, we build on the counterfactual risk minimization framework, which also assumes access to propensity scores. We propose learning methods for problems where feedback is missing for some samples, so there are samples with feedback and samples missing-feedback in the logged data. We refer to this type of learning as semi-supervised batch learning from logged data, which arises in a wide range of application domains. We derive a novel upper bound for the true risk under the inverse propensity score estimator to address this kind of learning problem. Using this bound, we propose a regularized semi-supervised batch learning method with logged data where the regularization term is feedback-independent and, as a result, can be evaluated using the logged missing-feedback data. Consequently, even though feedback is only present for some samples, a learning policy can be learned by leveraging the missing-feedback samples. The results of experiments derived from benchmark datasets indicate that these algorithms achieve policies with better performance in comparison with logging policies.

Submitted 18 February, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: 46 pages

arXiv:2202.09914 [pdf, other]

SOInter: A Novel Deep Energy Based Interpretation Method for Explaining Structured Output Models

Authors: S. Fatemeh Seyyedsalehi, Mahdieh Soleymani, Hamid R. Rabiee

Abstract: We propose a novel interpretation technique to explain the behavior of structured output models, which learn mappings from an input vector to a set of output variables simultaneously. Because of the complex relationship between the computational path of output variables in structured models, a feature can affect the value of output through other ones. We focus on one of the outputs as the target and try to find the most important features utilized by the structured model to decide on the target in each locality of the input space. In this paper, we assume an arbitrary structured output model is available as a black box and argue how considering the correlations between output variables can improve the explanation performance. The goal is to train a function as an interpreter for the target output variable over the input space. We introduce an energy-based training process for the interpreter function, which effectively considers the structural information incorporated into the model to be explained. The effectiveness of the proposed method is confirmed using a variety of simulated and real data sets.

Submitted 20 February, 2022; originally announced February 2022.

arXiv:2201.11808 [pdf, other]

LAP: An Attention-Based Module for Concept Based Self-Interpretation and Knowledge Injection in Convolutional Neural Networks

Authors: Rassa Ghavami Modegh, Ahmad Salimi, Alireza Dizaji, Hamid R. Rabiee

Abstract: Despite the state-of-the-art performance of deep convolutional neural networks, they are susceptible to bias and malfunction in unseen situations. Moreover, the complex computation behind their reasoning is not human-understandable to develop trust. External explainer methods have tried to interpret network decisions in a human-understandable way, but they are accused of fallacies due to their assumptions and simplifications. On the other side, the inherent self-interpretability of models, while being more robust to the mentioned fallacies, cannot be applied to the already trained models. In this work, we propose a new attention-based pooling layer, called Local Attention Pooling (LAP), that accomplishes self-interpretability and the possibility for knowledge injection without performance loss. The module is easily pluggable into any convolutional neural network, even the already trained ones. We have defined a weakly supervised training scheme to learn the distinguishing features in decision-making without depending on experts' annotations. We verified our claims by evaluating several LAP-extended models on two datasets, including ImageNet. The proposed framework offers more valid human-understandable and faithful-to-the-model interpretations than the commonly used white-box explainer methods.

Submitted 24 October, 2023; v1 submitted 27 January, 2022; originally announced January 2022.

MSC Class: 68T07; 68T99 (Primary) 68T45 (Secondary)

arXiv:2112.01131 [pdf, other]

FNR: A Similarity and Transformer-Based Approach to Detect Multi-Modal Fake News in Social Media

Authors: Faeze Ghorbanpour, Maryam Ramezani, Mohammad A. Fazli, Hamid R. Rabiee

Abstract: The availability and interactive nature of social media have made them the primary source of news around the globe. The popularity of social media tempts criminals to pursue their immoral intentions by producing and disseminating fake news using seductive text and misleading images. Therefore, verifying social media news and spotting fakes is crucial. This work aims to analyze multi-modal features from texts and images in social media for detecting fake news. We propose a Fake News Revealer (FNR) method that utilizes transform learning to extract contextual and semantic features and contrastive loss to determine the similarity between image and text. We applied FNR on two real social media datasets. The results show the proposed method achieves higher accuracies in detecting fake news compared to the previous works.

Submitted 2 December, 2021; originally announced December 2021.

Comments: 10 pages, 11 figures, 4 tables and 20 references

arXiv:2111.03297 [pdf, other]

doi 10.1109/TETC.2021.3102041

RC-RNN: Reconfigurable Cache Architecture for Storage Systems Using Recurrent Neural Networks

Authors: Shahriar Ebrahimi, Reza Salkhordeh, Seyed Ali Osia, Ali Taheri, Hamid Reza Rabiee, Hossein Asadi

Abstract: Solid-State Drives (SSDs) have significant performance advantages over traditional Hard Disk Drives (HDDs) such as lower latency and higher throughput. Significantly higher price per capacity and limited lifetime, however, prevent designers from completely replacing HDDs with SSDs in enterprise storage systems. SSD-based caching has recently been suggested for storage systems to benefit from higher performance of SSDs while minimizing the overall cost. While conventional caching algorithms such as Least Recently Used (LRU) provide high hit ratio in processors, due to the highly random behavior of Input/Output (I/O) workloads, they hardly provide the required performance level for storage systems. In addition to poor performance, inefficient algorithms also shorten SSD lifetime with unnecessary cache replacements. Such shortcomings motivate us to benefit from more complex non-linear algorithms to achieve higher cache performance and extend SSD lifetime. In this paper, we propose RC-RNN, the first reconfigurable SSD-based cache architecture for storage systems that utilizes machine learning to identify performance-critical data pages for I/O caching. The proposed architecture uses Recurrent Neural Networks (RNN) to characterize ongoing workloads and optimize itself towards higher cache performance while improving SSD lifetime. RC-RNN attempts to learn characteristics of the running workload to predict its behavior and then uses the collected information to identify performance-critical data pages to fetch into the cache.
Experimental results show that RC-RNN characterizes workloads with an accuracy up to 94.6% for SNIA I/O workloads. RC-RNN can perform similarly to the optimal cache algorithm with an accuracy of 95% on average, and outperforms previous SSD caching architectures by providing up to 7x higher hit ratio and decreasing cache replacements by up to 2x.

Submitted 5 November, 2021; originally announced November 2021.

Comments: Date of Publication: 09 August 2021

Journal ref: IEEE Transactions on Emerging Topics in Computing (2021)

arXiv:2110.03800 [pdf, other]

doi 10.1145/3487553.3524721

CCGG: A Deep Autoregressive Model for Class-Conditional Graph Generation

Authors: Yassaman Ommi, Matin Yousefabadi, Faezeh Faez, Amirmojtaba Sabour, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Abstract: Graph data structures are fundamental for studying connected entities. With an increase in the number of applications where data is represented as graphs, the problem of graph generation has recently become a hot topic. However, despite its significance, conditional graph generation that creates graphs with desired features is relatively less explored in previous studies. This paper addresses the problem of class-conditional graph generation that uses class labels as generation constraints by introducing the Class Conditioned Graph Generator (CCGG). We built CCGG by injecting the class information as an additional input into a graph generator model and including a classification loss in its total loss along with a gradient passing trick. Our experiments show that CCGG outperforms existing conditional graph generation methods on various datasets. It also manages to maintain the quality of the generated graphs in terms of distribution-based evaluation metrics.

Submitted 25 April, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

arXiv:2109.09329 [pdf, other]

Distributed Detection and Mitigation of Biasing Attacks over Multi-Agent Networks

Authors: Mohammadreza Doostmohammadian, Houman Zarrabi, Hamid R. Rabiee, Usman A. Khan, Themistoklis Charalambous

Abstract: This paper proposes a distributed attack detection and mitigation technique based on distributed estimation over a multi-agent network, where the agents take partial system measurements susceptible to (possible) biasing attacks. In particular, we assume that the system is not locally observable via the measurements in the direct neighborhood of any agent. First, for performance analysis in the attack-free case, we show that the proposed distributed estimation is unbiased with bounded mean-square deviation in steady-state. Then, we propose a residual-based strategy to locally detect possible attacks at agents. In contrast to the deterministic thresholds in the literature assuming an upper bound on the noise support, we define the thresholds on the residuals in a probabilistic sense. After detecting and isolating the attacked agent, a system-digraph-based mitigation strategy is proposed to replace the attacked measurement with a new observationally-equivalent one to recover potential observability loss. We adopt a graph-theoretic method to classify the agents based on their measurements, to distinguish between the agents recovering the system rank-deficiency and the ones recovering output-connectivity of the system digraph. The attack detection/mitigation strategy is specifically described for each type, which is of polynomial-order complexity for large-scale applications. Illustrative simulations support our theoretical results.

Submitted 20 September, 2021; originally announced September 2021.

Comments: Accepted TNSE

arXiv:2105.10641 [pdf, other]

Analysis of Contractions in System Graphs: Application to State Estimation

Authors: Mohammadreza Doostmohammadian, Themistoklis Charalambous, Miadreza Shafie-khah, Hamid R. Rabiee, Usman A. Khan

Abstract: Observability and estimation are closely tied to the system structure, which can be visualized as a system graph--a graph that captures the inter-dependencies within the state variables. For example, in social system graphs such inter-dependencies represent the social interactions of different individuals. It was recently shown that contractions, a key concept from graph theory, in the system graph are critical to system observability, as (at least) one state measurement in every contraction is necessary for observability. Thus, the size and number of contractions are critical in recovering from loss of observability. In this paper, the correlation between the average-size/number of contractions and the global clustering coefficient (GCC) of the system graph is studied. Our empirical results show that estimating systems with high GCC requires fewer measurements, and in case of measurement failure, there are fewer possible options to find a substitute measurement that recovers the system's observability. This is significant as by tuning the GCC, we can improve the observability properties of large-scale engineered networks, such as social networks and smart grid.

Submitted 22 May, 2021; originally announced May 2021.

arXiv:2104.07613 [pdf, other]

SINA-BERT: A pre-trained Language Model for Analysis of Medical Texts in Persian

Authors: Nasrin Taghizadeh, Ehsan Doostmohammadi, Elham Seifossadat, Hamid R. Rabiee, Maedeh S. Tahaei

Abstract: We have released SINA-BERT, a language model pre-trained on BERT (Devlin et al., 2018) to address the lack of a high-quality Persian language model in the medical domain. SINA-BERT utilizes pre-training on a large-scale corpus of medical contents including formal and informal texts collected from a variety of online resources in order to improve the performance on health-care related tasks. We employ SINA-BERT to complete the following representative tasks: categorization of medical questions, medical sentiment analysis, and medical question retrieval. For each task, we have developed Persian annotated data sets for training and evaluation and learnt a representation for the data of each task, especially complex and long medical questions. With the same architecture being used across tasks, SINA-BERT outperforms BERT-based models that were previously made available in the Persian language.

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2104.03597 [pdf, other]

doi 10.1007/978-3-030-87240-3_68

GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference

Authors: Mahsa Ghorbani, Mojtaba Bahrami, Anees Kazi, Mahdieh Soleymani Baghshah, Hamid R. Rabiee, Nassir Navab

Abstract: The increased amount of multi-modal medical data has opened the opportunities to simultaneously process various modalities such as imaging and non-imaging data to gain a comprehensive insight into the disease prediction domain. Recent studies using Graph Convolutional Networks (GCNs) provide novel semi-supervised approaches for integrating heterogeneous modalities while investigating the patients' associations for disease prediction. However, when the meta-data used for graph construction is not available at inference time (e.g., coming from a distinct population), the conventional methods exhibit poor performance. To address this issue, we propose a novel semi-supervised approach named GKD based on knowledge distillation. We train a teacher component that employs the label-propagation algorithm besides a deep neural network to benefit from the graph and non-graph modalities only in the training phase. The teacher component embeds all the available information into the soft pseudo-labels. The soft pseudo-labels are then used to train a deep student network for disease prediction of unseen test data for which the graph modality is unavailable. We perform our experiments on two public datasets for diagnosing Autism spectrum disorder, and Alzheimer's disease, along with a thorough analysis on synthetic multi-modal datasets. According to these experiments, GKD outperforms the previous graph-based deep learning methods in terms of accuracy, AUC, and Macro F1.

Submitted 8 April, 2021; originally announced April 2021.

arXiv:2103.10056 [pdf, other]

Dementia Severity Classification under Small Sample Size and Weak Supervision in Thick Slice MRI

Authors: Reza Shirkavand, Sana Ayromlou, Soroush Farghadani, Maedeh-sadat Tahaei, Fattane Pourakpour, Bahareh Siahlou, Zeynab Khodakarami, Mohammad H. Rohban, Mansoor Fatehi, Hamid R. Rabiee

Abstract: Early detection of dementia through specific biomarkers in MR images plays a critical role in developing support strategies proactively. Fazekas scale facilitates an accurate quantitative assessment of the severity of white matter lesions and hence the disease. Imaging Biomarkers of dementia are multiple and comprehensive documentation of them is time-consuming. Therefore, any effort to automatically extract these biomarkers will be of clinical value while reducing inter-rater discrepancies. To tackle this problem, we propose to classify the disease severity based on the Fazekas scale through the visual biomarkers, namely the Periventricular White Matter (PVWM) and the Deep White Matter (DWM) changes, in the real-world setting of thick-slice MRI. Small training sample size and weak supervision in form of assigning severity labels to the whole MRI stack are among the main challenges. To combat the mentioned issues, we have developed a deep learning pipeline that employs self-supervised representation learning, multiple instance learning, and appropriate pre-processing steps. We use pretext tasks such as non-linear transformation, local shuffling, in- and out-painting for self-supervised learning of useful features in this domain. Furthermore, an attention model is used to determine the relevance of each MRI slice for predicting the Fazekas scale in an unsupervised manner.
We show the significant superiority of our method in distinguishing different classes of dementia compared to state-of-the-art methods in our mentioned setting, which improves the macro averaged F1-score of state-of-the-art from 61% to 76% in PVWM, and from 58% to 69.2% in DWM.

Submitted 18 March, 2021; originally announced March 2021.

Comments: 12 pages, 5 figures

arXiv:2103.00221 [pdf, other]

doi 10.1016/j.media.2021.102272

RA-GCN: Graph Convolutional Network for Disease Prediction Problems with Imbalanced Data

Authors: Mahsa Ghorbani, Anees Kazi, Mahdieh Soleymani Baghshah, Hamid R. Rabiee, Nassir Navab

Abstract: Disease prediction is a well-known classification problem in medical applications. GCNs provide a powerful tool for analyzing the patients' features relative to each other. This can be achieved by modeling the problem as a graph node classification task, where each node is a patient. Due to the nature of such medical datasets, class imbalance is a prevalent issue in the field of disease prediction, where the distribution of classes is skewed. When the class imbalance is present in the data, the existing graph-based classifiers tend to be biased towards the major class(es) and neglect the samples in the minor class(es). On the other hand, the correct diagnosis of the rare positive cases among all the patients is vital in a healthcare system. In conventional methods, such imbalance is tackled by assigning appropriate weights to classes in the loss function which is still dependent on the relative values of weights, sensitive to outliers, and in some cases biased towards the minor class(es). In this paper, we propose a Re-weighted Adversarial Graph Convolutional Network (RA-GCN) to prevent the graph-based classifier from emphasizing the samples of any particular class. This is accomplished by associating a graph-based neural network to each class, which is responsible for weighting the class samples and changing the importance of each sample for the classifier. Therefore, the classifier adjusts itself and determines the boundary between classes with more attention to the important samples.
The parameters of the classifier and weighting networks are trained by an adversarial approach. We show experiments on synthetic and three publicly available medical datasets. RA-GCN demonstrates the superiority compared to recent methods in identifying the patient's status on all three datasets. The detailed analysis is provided as quantitative and qualitative experiments on synthetic datasets.

Submitted 7 November, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

arXiv:2012.15544 [pdf, other]

Deep Graph Generators: A Survey

Authors: Faezeh Faez, Yassaman Ommi, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Abstract: Deep generative models have achieved great success in areas such as image, speech, and natural language processing in the past few years. Thanks to the advances in graph-based deep learning, and in particular graph representation learning, deep graph generation methods have recently emerged with new applications ranging from discovering novel molecular structures to modeling social networks. This paper conducts a comprehensive survey on deep learning-based graph generation approaches and classifies them into five broad categories, namely, autoregressive, autoencoder-based, RL-based, adversarial, and flow-based graph generators, providing the readers a detailed description of the methods in each class. We also present publicly available source codes, commonly used datasets, and the most widely utilized evaluation metrics. Finally, we highlight the existing challenges and discuss future research directions.

Submitted 31 December, 2020; originally announced December 2020.

arXiv:2012.06198 [pdf, other]

On the Observability and Controllability of Large-Scale IoT Networks: Reducing Number of Unmatched Nodes via Link Addition

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee

Abstract: In this paper, we study large-scale networks in terms of observability and controllability. In particular, we compare the number of unmatched nodes in two main types of Scale-Free (SF) networks: the Barabási-Albert (BA) model and the Holme-Kim (HK) model. Comparing the two models based on theory and simulation, we discuss the possible relation between clustering coefficient and the number of unmatched nodes. In this direction, we propose a new algorithm to reduce the number of unmatched nodes via link addition. The results are significant as one can reduce the number of unmatched nodes and therefore number of embedded sensors/actuators in, for example, an IoT network. This may significantly reduce the cost of controlling devices or monitoring cost in large-scale systems.

Submitted 11 December, 2020; originally announced December 2020.

arXiv:2011.11736 [pdf, other]

Accurate and Rapid Diagnosis of COVID-19 Pneumonia with Batch Effect Removal of Chest CT-Scans and Interpretable Artificial Intelligence

Authors: Rassa Ghavami Modegh, Mehrab Hamidi, Saeed Masoudian, Amir Mohseni, Hamzeh Lotfalinezhad, Mohammad Ali Kazemi, Behnaz Moradi, Mahyar Ghafoori, Omid Motamedi, Omid Pournik, Kiara Rezaei-Kalantari, Amirreza Manteghinezhad, Shaghayegh Haghjooy Javanmard, Fateme Abdoli Nezhad, Ahmad Enhesari, Mohammad Saeed Kheyrkhah, Razieh Eghtesadi, Javid Azadbakht, Akbar Aliasgharzadeh, Mohammad Reza Sharif, Ali Khaleghi, Abbas Foroutan, Hossein Ghanaati, Hamed Dashti, Hamid R. Rabiee

Abstract: COVID-19 is a virus with high transmission rate that demands rapid identification of the infected patients to reduce the spread of the disease. The current gold-standard test, Reverse-Transcription Polymerase Chain Reaction (RT-PCR), has a high rate of false negatives. Diagnosing from CT-scan images as a more accurate alternative has the challenge of distinguishing COVID-19 from other pneumonia diseases. Artificial intelligence can help radiologists and physicians to accelerate the process of diagnosis, increase its accuracy, and measure the severity of the disease. We designed a new interpretable deep neural network to distinguish healthy people, patients with COVID-19, and patients with other pneumonia diseases from axial lung CT-scan images. Our model also detects the infected areas and calculates the percentage of the infected lung volume. We first preprocessed the images to eliminate the batch effects of different devices, and then adopted a weakly supervised method to train the model without having any tags for the infected parts. We trained and evaluated the model on a large dataset of 3359 samples from 6 different medical centers. The model reached sensitivities of 97.75% and 98.15%, and specificities of 87% and 81.03% in separating healthy people from the diseased and COVID-19 from other diseases, respectively. It also demonstrated similar performance for 1435 samples from 6 different medical centers which proves its generalizability.
The performance of the model on a large diverse dataset, its generalizability, and interpretability makes it suitable to be used as a reliable diagnostic system.

Submitted 8 January, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

Comments: 27 pages, 4 figures. Some minor changes have been applied to the text, some formulae are added to make the descriptions clearer, and two names are corrected (the full versions of the names are included)

arXiv:2011.11108 [pdf, other]

Multiresolution Knowledge Distillation for Anomaly Detection

Authors: Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad Hossein Rohban, Hamid R. Rabiee

Abstract: Unsupervised representation learning has proved to be a critical component of anomaly detection/localization in images. The challenges to learn such a representation are two-fold. Firstly, the sample size is not often large enough to learn a rich generalizable representation through conventional techniques. Secondly, while only normal samples are available at training, the learned features should be discriminative of normal and anomalous samples. Here, we propose to use the "distillation" of features at various layers of an expert network, pre-trained on ImageNet, into a simpler cloner network to tackle both issues. We detect and localize anomalies using the discrepancy between the expert and cloner networks' intermediate activation values given the input data. We show that considering multiple intermediate hints in distillation leads to better exploiting the expert's knowledge and more distinctive discrepancy compared to solely utilizing the last layer activation values. Notably, previous methods either fail in precise anomaly localization or need expensive region-based training. In contrast, with no need for any special or intensive training procedure, we incorporate interpretability algorithms in our novel framework for the localization of anomalous regions. Despite the striking contrast between some test datasets and ImageNet, we achieve competitive or significantly superior results compared to the SOTA methods on MNIST, F-MNIST, CIFAR-10, MVTecAD, Retinal-OCT, and two Medical datasets on both anomaly detection and localization.

Submitted 22 November, 2020; originally announced November 2020.

arXiv:2010.01400 [pdf, other]

doi 10.1145/3599237

Joint Inference of Diffusion and Structure in Partially Observed Social Networks Using Coupled Matrix Factorization

Authors: Maryam Ramezani, Aryan Ahadinia, Amirmohammad Ziaei, Hamid R. Rabiee

Abstract: Access to complete data in large-scale networks is often infeasible. Therefore, the problem of missing data is a crucial and unavoidable issue in the analysis and modeling of real-world social networks. However, most of the research on different aspects of social networks does not consider this limitation. One effective way to solve this problem is to recover the missing data as a pre-processing step. In this paper, a model is learned from partially observed data to infer unobserved diffusion and structure networks. To jointly discover omitted diffusion activities and hidden network structures, we develop a probabilistic generative model called "DiffStru." The interrelations among links of nodes and cascade processes are utilized in the proposed method via learning coupled with low-dimensional latent factors. Besides inferring unseen data, latent factors such as community detection may also aid in network classification problems. We tested different missing data scenarios on simulated independent cascades over LFR networks and real datasets, including Twitter and Memtracker. Experiments on these synthetic and real-world datasets show that the proposed method successfully detects invisible social behaviors, predicts links, and identifies latent features.

Submitted 22 March, 2023; v1 submitted 3 October, 2020; originally announced October 2020.

arXiv:2008.12959 [pdf, other]

Puzzle-AE: Novelty Detection in Images through Solving Puzzles

Authors: Mohammadreza Salehi, Ainaz Eftekhar, Niousha Sadjadi, Mohammad Hossein Rohban, Hamid R. Rabiee

Abstract: Autoencoder, as an essential part of many anomaly detection methods, is lacking flexibility on normal data in complex datasets. U-Net is proved to be effective for this purpose but overfits on the training data if trained by just using reconstruction error similar to other AE-based frameworks. Puzzle-solving, as a pretext task of self-supervised learning (SSL) methods, has earlier proved its ability in learning semantically meaningful features. We show that training U-Nets based on this task is an effective remedy that prevents overfitting and facilitates learning beyond pixel-level features. Shortcut solutions, however, are a big challenge in SSL tasks, including jigsaw puzzles. We propose adversarial robust training as an effective automatic shortcut removal. We achieve competitive or superior results compared to the State of the Art (SOTA) anomaly detection methods on various toy and real-world datasets. Unlike many competitors, the proposed framework is stable, fast, data-efficient, and does not require unprincipled early stopping.

Submitted 10 February, 2022; v1 submitted 29 August, 2020; originally announced August 2020.

Comments: The paper is under consideration at Computer Vision and Image Understanding

arXiv:2003.05669 [pdf, other]

ARAE: Adversarially Robust Training of Autoencoders Improves Novelty Detection

Authors: Mohammadreza Salehi, Atrin Arya, Barbod Pajoum, Mohammad Otoofi, Amirreza Shaeiri, Mohammad Hossein Rohban, Hamid R. Rabiee

Abstract: Autoencoders (AE) have recently been widely employed to approach the novelty detection problem. Trained only on the normal data, the AE is expected to reconstruct the normal data effectively while failing to regenerate the anomalous data, which could be utilized for novelty detection. However, in this paper, it is demonstrated that this does not always hold. AE often generalizes so perfectly that it can also reconstruct the anomalous data well. To address this problem, we propose a novel AE that can learn more semantically meaningful features. Specifically, we exploit the fact that adversarial robustness promotes learning of meaningful features. Therefore, we force the AE to learn such features by penalizing networks with a bottleneck layer that is unstable against adversarial perturbations. We show that despite using a much simpler architecture in comparison to the prior methods, the proposed AE outperforms or is competitive with the state of the art on three benchmark datasets.

Submitted 24 October, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

arXiv:1909.06868 [pdf, other]

ChOracle: A Unified Statistical Framework for Churn Prediction

Authors: Ali Khodadadi, Seyed Abbas Hosseini, Ehsan Pajouheshgar, Farnam Mansouri, Hamid R. Rabiee

Abstract: User churn is an important issue in online services that threatens the health and profitability of services. Most of the previous works on churn prediction convert the problem into a binary classification task where the users are labeled as churned and non-churned. More recently, some works have tried to convert the user churn prediction problem into the prediction of user return time. In this approach, which is more realistic in real-world online services, at each time-step the model predicts the user return time instead of predicting a churn label. However, the previous works in this category suffer from lack of generality and require high computational complexity. In this paper, we introduce ChOracle, an oracle that predicts the user churn by modeling the user return times to service by utilizing a combination of Temporal Point Processes and Recurrent Neural Networks. Moreover, we incorporate latent variables into the proposed recurrent neural network to model the latent user loyalty to the system. We also develop an efficient approximate variational algorithm for learning parameters of the proposed RNN by using back propagation through time. Finally, we demonstrate the superior performance of ChOracle on a wide variety of real-world datasets.

Submitted 15 September, 2019; originally announced September 2019.

Comments: 12 pages

arXiv:1906.03423 [pdf, other]

doi 10.1145/3341161.3342957

News Labeling as Early as Possible: Real or Fake?

Authors: Maryam Ramezani, Mina Rafiei, Soroush Omranpour, Hamid R. Rabiee

Abstract: Distinguishing between real and fake news propagation through online social networks is an important issue in many applications. The time gap between the news release time and detection of its label is a significant step towards broadcasting the real information and avoiding the fake. Therefore, one of the challenging tasks in this area is to identify fake and real news in early stages of propagation. However, there is a trade-off between minimizing the time gap and maximizing accuracy. Despite recent efforts in detection of fake news, there has been no significant work that explicitly incorporates early detection in its model. In this paper, we focus on accurate early labeling of news, and propose a model by considering earliness both in modeling and prediction. The proposed method utilizes recurrent neural networks with a novel loss function, and a new stopping rule. Given the context of news, we first embed it with a class-specific text representation. Then, we utilize the available public profile of users, and speed of news diffusion, for early labeling of the news. Experiments on real datasets demonstrate the effectiveness of our model both in terms of early labeling and accuracy, compared to the state-of-the-art baselines and models.

Submitted 8 June, 2019; originally announced June 2019.

arXiv:1903.12371 [pdf, ps, other]

doi 10.1109/JSYST.2019.2900027

Cyber-Social Systems: Modeling, Inference, and Optimal Design

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee, Usman A. Khan

Abstract: This paper models the cyber-social system as a cyber-network of agents monitoring states of individuals in a social network. The state of each individual is represented by a social node and the interactions among individuals are represented by a social link. In the cyber-network each node represents an agent and the links represent information sharing among agents. Agents make an observation of social states and perform distributed inference. In this direction, the contribution of this work is threefold: (i) A novel distributed inference protocol is proposed that makes no assumption on the rank of the underlying social system. This is significant as most protocols in the literature only work on full-rank systems. (ii) A novel agent classification is developed, where it is shown that connectivity requirement on the cyber-network differs for each type. This is particularly important in finding the minimal number of observations and minimal connectivity of the cyber-network as the next contribution. (iii) The cost-optimal design of cyber-network constraint with distributed observability is addressed. This problem is subdivided into sensing cost optimization and networking cost optimization where both are claimed to be NP-hard. We solve both problems for certain types of social networks and find polynomial-order solutions.

Submitted 29 March, 2019; originally announced March 2019.

Comments: 12 pages, 7 figures

Journal ref: IEEE systems journal, 2019

arXiv:1902.00329 [pdf, other]

Privacy Against Brute-Force Inference Attacks

Authors: Seyed Ali Osia, Borzoo Rassouli, Hamed Haddadi, Hamid R. Rabiee, Deniz Gündüz

Abstract: Privacy-preserving data release is about disclosing information about useful data while retaining the privacy of sensitive data. Assuming that the sensitive data is threatened by a brute-force adversary, we define Guessing Leakage as a measure of privacy, based on the concept of guessing. After investigating the properties of this measure, we derive the optimal utility-privacy trade-off via a linear program with any $f$-information adopted as the utility measure, and show that the optimal utility is a concave and piece-wise linear function of the privacy-leakage budget.

Submitted 1 February, 2019; originally announced February 2019.

arXiv:1811.08812 [pdf, other]

Adversarial Classifier for Imbalanced Problems

Authors: Ehsan Montahaei, Mahsa Ghorbani, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Abstract: Adversarial approach has been widely used for data generation in the last few years. However, this approach has not been extensively utilized for classifier training. In this paper, we propose an adversarial framework for classifier training that can also handle imbalanced data. Indeed, a network is trained via an adversarial approach to give weights to samples of the majority class such that the obtained classification problem becomes more challenging for the discriminator and thus boosts its classification capability. In addition to the general imbalanced classification problems, the proposed method can also be used for problems such as graph representation learning in which it is desired to discriminate similar nodes from dissimilar nodes. Experimental results on imbalanced data classification and on the tasks like graph link prediction show the superiority of the proposed method compared to the state-of-the-art methods.

Submitted 21 November, 2018; originally announced November 2018.

arXiv:1811.08800 [pdf, other]

doi 10.1145/3341161.3342942

MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks

Authors: Mahsa Ghorbani, Mahdieh Soleymani Baghshah, Hamid R. Rabiee

Abstract: Graph embedding is an important approach for graph analysis tasks such as node classification and link prediction. The goal of graph embedding is to find a low dimensional representation of graph nodes that preserves the graph information. Recent methods like Graph Convolutional Network (GCN) try to consider node attributes (if available) besides node relations and learn node embeddings for unsupervised and semi-supervised tasks on graphs. On the other hand, multi-layer graph analysis has received attention recently. However, the existing methods for multi-layer graph embedding cannot incorporate all available information (like node attributes). Moreover, most of them consider either type of nodes or type of edges, and they do not treat within and between layer edges differently. In this paper, we propose a method called MGCN that utilizes the GCN for multi-layer graphs. MGCN embeds nodes of multi-layer graphs using both within and between layers relations and nodes attributes. We evaluate our method on the semi-supervised node classification task. Experimental results demonstrate the superiority of the proposed method to other multi-layer and single-layer competitors and also show the positive effect of using cross-layer edges.

Submitted 24 August, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

arXiv:1810.07845 [pdf, other]

On Statistical Learning of Simplices: Unmixing Problem Revisited

Authors: Amir Najafi, Saeed Ilchi, Amir H. Saberi, Seyed Abolfazl Motahari, Babak H. Khalaj, Hamid R. Rabiee

Abstract: We study the sample complexity of learning a high-dimensional simplex from a set of points uniformly sampled from its interior. Learning of simplices is a long studied problem in computer science and has applications in computational biology and remote sensing, mostly under the name of 'spectral unmixing'. We theoretically show that a sufficient sample complexity for reliable learning of a $K$-dimensional simplex up to a total-variation error of $\epsilon$ is $O\left(\frac{K^2}{\epsilon}\log\frac{K}{\epsilon}\right)$, which yields a substantial improvement over existing bounds. Based on our new theoretical framework, we also propose a heuristic approach for the inference of simplices. Experimental results on synthetic and real-world datasets demonstrate a comparable performance for our method on noiseless samples, while we outperform the state-of-the-art in noisy cases.

Submitted 12 August, 2020; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: 32 pages

arXiv:1804.01799 [pdf, ps, other]

doi 10.1109/LSP.2018.2824761

Structural cost-optimal design of sensor networks for distributed estimation

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee, Usman A. Khan

Abstract: In this letter we discuss cost optimization of sensor networks monitoring structurally full-rank systems under distributed observability constraint. Using structured systems theory, the problem is relaxed into two subproblems: (i) sensing cost optimization and (ii) networking cost optimization. Both problems are reformulated as combinatorial optimization problems. The sensing cost optimization is shown to have a polynomial order solution. The networking cost optimization is shown to be NP-hard in general, but has a polynomial order solution under specific conditions. A 2-approximation polynomial order relaxation is provided for general networking cost optimization, which is applicable in large-scale system monitoring.

Submitted 5 April, 2018; originally announced April 2018.

Journal ref: IEEE Signal Processing Letters 2018

arXiv:1802.07244 [pdf, other]

Steering Social Activity: A Stochastic Optimal Control Point Of View

Authors: Ali Zarezade, Abir De, Utkarsh Upadhyay, Hamid R. Rabiee, Manuel Gomez-Rodriguez

Abstract: User engagement in online social networking depends critically on the level of social activity in the corresponding platform--the number of online actions, such as posts, shares or replies, taken by their users. Can we design data-driven algorithms to increase social activity? At a user level, such algorithms may increase activity by helping users decide when to take an action to be more likely to be noticed by their peers. At a network level, they may increase activity by incentivizing a few influential users to take more actions, which in turn will trigger additional actions by other users. In this paper, we model social activity using the framework of marked temporal point processes, derive an alternate representation of these processes using stochastic differential equations (SDEs) with jumps and, exploiting this alternate representation, develop two efficient online algorithms with provable guarantees to steer social activity both at a user and at a network level. In doing so, we establish a previously unexplored connection between optimal control of jump SDEs and doubly stochastic marked temporal point processes, which is of independent interest. Finally, we experiment both with synthetic and real data gathered from Twitter and show that our algorithms consistently steer social activity more effectively than the state of the art.

Submitted 19 February, 2018; originally announced February 2018.

Comments: To appear in JMLR 2018. arXiv admin note: substantial text overlap with arXiv:1610.05773, arXiv:1703.02059

arXiv:1802.03151 [pdf, other]

Deep Private-Feature Extraction

Authors: Seyed Ali Osia, Ali Taheri, Ali Shahin Shamsabadi, Kleomenis Katevas, Hamed Haddadi, Hamid R. Rabiee

Abstract: We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.

Submitted 28 February, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

arXiv:1710.02101 [pdf, ps, other]

Reliable Clustering of Bernoulli Mixture Models

Authors: Amir Najafi, Abolfazl Motahari, Hamid R. Rabiee

Abstract: A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs.

Submitted 16 June, 2019; v1 submitted 5 October, 2017; originally announced October 2017.

Comments: 22 pages

arXiv:1710.01727 [pdf, ps, other]

Privacy-Preserving Deep Inference for Rich User Data on The Cloud

Authors: Seyed Ali Osia, Ali Shahin Shamsabadi, Ali Taheri, Kleomenis Katevas, Hamid R. Rabiee, Nicholas D. Lane, Hamed Haddadi

Abstract: Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep models for cooperative, privacy-preserving analytics. We do this by breaking down the popular deep architectures and fine-tuning them in a particular way. We then evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset for mobile applications. Our evaluations show that by using certain kinds of fine-tuning and embedding techniques, and at a small processing cost, we can greatly reduce the level of information available to unintended tasks applied to the data feature on the cloud, and hence achieve the desired tradeoff between privacy and performance.

Submitted 11 October, 2017; v1 submitted 4 October, 2017; originally announced October 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1703.02952

arXiv:1710.00818 [pdf, other]

doi 10.1145/3333028

Continuous-Time Relationship Prediction in Dynamic Heterogeneous Information Networks

Authors: Sina Sajadmanesh, Sogol Bazargani, Jiawei Zhang, Hamid R. Rabiee

Abstract: Online social networks, World Wide Web, media and technological networks, and other types of so-called information networks are ubiquitous nowadays. These information networks are inherently heterogeneous and dynamic. They are heterogeneous as they consist of multi-typed objects and relations, and they are dynamic as they are constantly evolving over time. One of the challenging issues in such heterogeneous and dynamic environments is to forecast those relationships in the network that will appear in the future. In this paper, we try to solve the problem of continuous-time relationship prediction in dynamic and heterogeneous information networks. This implies predicting the time it takes for a relationship to appear in the future, given its features that have been extracted by considering both heterogeneity and temporal dynamics of the underlying network. To this end, we first introduce a feature extraction framework that combines the power of meta-path-based modeling and recurrent neural networks to effectively extract features suitable for relationship prediction regarding heterogeneity and dynamicity of the networks. Next, we propose a supervised non-parametric approach, called Non-Parametric Generalized Linear Model (NP-GLM), which infers the hidden underlying probability distribution of the relationship building time given its features. We then present a learning algorithm to train NP-GLM and an inference method to answer time-related queries. Extensive experiments conducted on synthetic data and three real-world datasets, namely Delicious, MovieLens, and DBLP, demonstrate the effectiveness of NP-GLM in solving continuous-time relationship prediction problem vis-a-vis competitive baselines.

Submitted 19 May, 2019; v1 submitted 30 September, 2017; originally announced October 2017.

Comments: To appear in ACM Transactions on Knowledge Discovery from Data

Report number: 44

Journal ref: ACM Transactions on Knowledge Discovery from Data, July 2019

arXiv:1709.03855 [pdf, other]

doi 10.1109/LSP.2017.2749265

Distributed Estimation Recovery under Sensor Failure

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee, Houman Zarrabi, Usman A. Khan

Abstract: Single time-scale distributed estimation of dynamic systems via a network of sensors/estimators is addressed in this letter. In single time-scale distributed estimation, the two fusion steps, consensus and measurement exchange, are implemented only once, in contrast to, e.g., a large number of consensus iterations at every step of the system dynamics. We particularly discuss the problem of failure in the sensor/estimator network and how to recover for distributed estimation by adding new sensor measurements from equivalent states. We separately discuss the recovery for two types of sensors, namely α and β sensors. We propose polynomial order algorithms to find equivalent state nodes in graph representation of system to recover for distributed observability. The polynomial order solution is particularly significant for large-scale systems.

Submitted 12 September, 2017; originally announced September 2017.

Comments: IEEE signal processing letters

arXiv:1709.03846 [pdf, other]

Observational Equivalence in System Estimation: Contractions in Complex Networks

Authors: Mohammadreza Doostmohammadian, Hamid R. Rabiee, Houman Zarrabi, Usman Khan

Abstract: Observability of complex systems/networks is the focus of this paper, which is shown to be closely related to the concept of contraction. Indeed, for observable network tracking it is necessary/sufficient to have one node in each contraction measured. Therefore, nodes in a contraction are equivalent to recover for loss of observability, implying that contraction size is a key factor for observability recovery. Here, using a polynomial order contraction detection algorithm, we analyze the distribution of contractions, studying its relation with key network properties. Our results show that contraction size is related to network clustering coefficient and degree heterogeneity. Particularly, in networks with power-law degree distribution, if the clustering coefficient is high there are fewer contractions with smaller size on average. The implication is that estimation/tracking of such systems requires fewer measurements, while their observational recovery is more restrictive in case of sensor failure. Further, in Small-World networks higher degree heterogeneity implies that there are more contractions with smaller size on average. Therefore, the estimation of representing system requires more measurements, and also the recovery of measurement failure is more limited. These results imply that one can tune the properties of synthetic networks to alleviate their estimation/observability recovery.

Submitted 12 September, 2017; originally announced September 2017.

Comments: IEEE Transactions on Network Science and Engineering

arXiv:1706.06783 [pdf, ps, other]

NPGLM: A Non-Parametric Method for Temporal Link Prediction

Authors: Sina Sajadmanesh, Jiawei Zhang, Hamid R. Rabiee

Abstract: In this paper, we try to solve the problem of temporal link prediction in information networks. This implies predicting the time it takes for a link to appear in the future, given its features that have been extracted at the current network snapshot. To this end, we introduce a probabilistic non-parametric approach, called "Non-Parametric Generalized Linear Model" (NP-GLM), which infers the hidden underlying probability distribution of the link advent time given its features. We then present a learning algorithm for NP-GLM and an inference method to answer time-related queries. Extensive experiments conducted on both synthetic data and real-world Sina Weibo social network demonstrate the effectiveness of NP-GLM in solving temporal link prediction problem vis-a-vis competitive baselines.

Submitted 21 June, 2017; originally announced June 2017.

Comments: 7 pages, 5 figures, 3 tables

Showing 1–50 of 71 results for author: Rabiee, H R