\NewEnviron

doweprint[1]

Edge Classification on Graphs:
New Directions in Topological Imbalance

Xueqi Cheng Vanderbilt University xueqi.cheng@vanderbilt.edu , Yu Wang¹¹footnotemark: 1 Vanderbilt University yu.wang.1@vanderbilt.edu , Yunchao (Lance) Liu Vanderbilt University yunchao.liu@vanderbilt.edu , Yuying Zhao Vanderbilt University yuying.zhao@vanderbilt.edu , Charu C. Aggarwal IBM T. J. Watson Research Center charu@us.ibm.com and Tyler Derr Vanderbilt University tyler.derr@vanderbilt.edu

(June 2024)

Abstract.

Recent years have witnessed the remarkable success of applying Graph machine learning (GML) to node/graph classification and link prediction. However, edge classification task that enjoys numerous real-world applications such as social network analysis and cybersecurity, has not seen significant advancement with the progress of GML. To address this gap, our study pioneers a comprehensive approach to edge classification. We identify a novel ‘Topological Imbalance Issue’, which arises from the skewed distribution of edges across different classes, affecting the local subgraph of each edge and harming the performance of edge classifications. Inspired by the recent studies in node classification that the performance discrepancy exists with varying local structural patterns, we aim to investigate if the performance discrepancy in topological imbalanced edge classification tasks can also be mitigated by characterizing the local class distribution variance. To overcome this challenge, we introduce Topological Entropy (TE), a novel topological-based metric that measures the topological imbalance for each edge. Our empirical studies confirm that TE effectively measures local class distribution variance, and indicate that prioritizing edges with high TE values can help address the issue of topological imbalance. Inspired by this observation, we develop two strategies - Topological Reweighting and TE Wedge-based Mixup - to adaptively focus training on (synthetic) edges based on their TEs. While topological reweighting directly manipulates training edge weights according to TE, our wedge-based mixup interpolates synthetic edges between high TE wedges. To further enhance performance, we integrate these strategies into a novel topological imbalance strategy for edge classification: TopoEdge. Extensive experiments on real-world datasets demonstrate the efficacy of our proposed strategies. Additionally, our curated datasets and designed experimental settings establish a new benchmark for future edge classification research, particularly in addressing imbalance issues. Our code and data are available at https://github.com/XueqiC/TopoEdge.

Edge classification, topological imbalance, graph neural network

^†^†doi: XXXXXXX.XXXXXXX^†^†ccs: Computing methodologies Supervised learning

1. Introduction

Graph Machine Learning (GML) has recently achieved unprecedented success in numerous real-world tasks over graph-structured data, including node/graph classification (Yu et al., 2020; Wu et al., 2021; Errica et al., 2019) and link prediction (Zhang and Chen, 2018; Ai et al., 2022; Wang et al., 2023b). However, the edge classification task, which is crucial for many practical applications, such as categorizing protein interactions or social media connections (Kwak et al., 2010; Pandey et al., 2019; Jha et al., 2022), remains extensively under-explored within the GML community.

Previous approaches for edge classification in GML aim to classify edges based on the graph structure and node/edge features. Evolving from the pioneering heuristic-based edge classification method (Aggarwal et al., 2016) which assigns edge labels based on the labels of their similar counterparts, recent work has focused on learning powerful edge embeddings to better leverage the structural and attribute information in graphs. One direction is the shallow embedding techniques (Bielak et al., 2022; Wang et al., 2020b, 2023a) where the random walk and deep auto-encoding methods have been adopted to capture graph structural information. Another direction lies in applying Graph Neural Network (GNN) (Kipf and Welling, 2016; Veličković et al., 2017; Tang et al., 2019) to learn edge embeddings by either firstly learning node-level embeddings aggregated from their local subgraphs or directly generating edge embeddings (Kim et al., 2019; Gong and Cheng, 2019) Despite their effectiveness, they all neglect the imbalance in edge classification, which could significantly compromise the classification performance and restrict their practical utility. For example, failure to recognize fraudulent behaviors in e-commerce networks, which are much less commonly seen, may result in enormous monetary losses.

Even though numerous strategies have been developed for addressing the imbalance problem, most of them are designed for general machine learning tasks (Guo et al., 2008; Chawla et al., 2002) or specifically for node/graph classification (Wang et al., 2022; Zhao et al., 2021; Wang et al., 2021a) in GML with no tailored modification towards edge classification. As seen in Figure 1, applying one of the most well-established imbalance mitigation strategies, quantity reweighting, surprisingly results in inferior performance on three datasets across two GNN backbones to the original one. This motivates us to go beyond considering the edge imbalance issue merely from the quantitative perspective and more focus on the topological perspective. Actually, in node classification, previous works (Zhu et al., 2021; Mao et al., 2023) have discovered a strong correlation between the local topology pattern of a node and its GNNs’ classification performance, which leads to the performance discrepancy among nodes with varying local topology. In parallel, research in imbalanced node/graph classification (Chen et al., 2021; Song et al., 2022; Wang et al., 2022) also discovered a novel imbalance at the topology level and found its close correlation to the node/graph classification performance. Arising from the surprising findings in Figure 1 and further based on experiences from prior works (Zhu et al., 2021; Mao et al., 2023; Chen et al., 2021; Song et al., 2022; Wang et al., 2022), we hypothesize that the topology imbalance issue also exists in edge classification tasks and edges with varying local topology patterns would also exhibit varying classification performance.

To verify the above hypothesis, we first propose a metric, Topological Entropy (TE), to quantify the topological imbalance for edges based on the label distribution of incident edges in their local subgraphs. After empirically verifying the correlation between the training difficulty of edges and their TE values (Figure 2-3), we design a topologically reweighting strategy to adaptively weigh the training of edges based on their TE. In addition, to address the limited supervision provided by few minority edges and further enhance the generalizability of our proposed method, following the mixup strategy in (Zhai et al., 2022), we develop a novel topological wedge-based mixup strategy that generates synthetic edges within wedges that are selected based on high TE edges. Finally, we adaptively integrate the two strategies into a novel topological imbalance strategy for edge classification, TopoEdge, to further boost the performance. To demonstrate the effectiveness of our proposed strategies, we first construct a comprehensive imbalance edge classification evaluation benchmark consisting of diverse real-world edge classification datasets that exhibit skewed label distributions. The results reveal TopoEdge is able to significantly improve edge classification performance across various GNN backbones in various settings.

Our main contributions are summarized as follows:

•

Uncovering a Novel Research Paradigm in Edge Classification: Through a meticulous exploration into the domain of edge classification, we identified a notable and previously unexplored challenge: topological imbalance. To measure this imbalance with precision, we introduced the innovative Topological Entropy (TE) method, which quantitatively assesses the extent of topological imbalance present within edges’ local subgraphs.
•

Algorithmic Design to Address Topological Imbalance: Given the challenges posed by the topological imbalance in edge classification, we pioneered a comprehensive algorithmic solution. This first involves a topological reweighting technique, aiming to enhance the training of high TE edges. Complementing this, we devised a unique TE wedge-based mixup approach, which integrates the generation of synthetic training examples derived from high-entropy edges, thereby reinforcing the training efficacy and alleviating the identified imbalance. Ultimately, we combine the above two strategies into a novel Topological imbalance strategy, TopoEdge, for edge classification.
•

SOTA Performance on Real-world Datasets: To validate the practical implications of our method, we assembled an edge classification benchmark suite with datasets coming from diverse real-world scenarios/applications. Our experiments showcase the ability of the proposed TE-based solutions to help mitigate the imbalance issues in edge classification, and the proposed TopoEdge nearly always surpasses existing methods.

2. Notations and Preliminary

Let $G=(\mathcal{V},\mathcal{E},\mathbf{X},\mathbf{S})$ be an undirected attributed graph, where $\mathcal{V}=\{v_{i}\}_{i=1}^{n}$ is the set of $n$ nodes (i.e., $n=|\mathcal{V}|$ ) and $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ is the set of $m$ observed training edges (i.e., $m=|\mathcal{E}|$ ) with $e_{ij}$ denoting the edge between the node $v_{i}$ and $v_{j}$ . $\mathbf{X}\in\mathbb{R}^{n\times d}$ represents the node feature matrix with dimension $d$ and $\mathbf{S}\in\mathbb{R}^{m\times d^{\prime}}$ represents the edge feature matrix with dimension $d^{\prime}$ . The observed $m$ training edges compose the adjacency matrix $\mathbf{A}\in\{0,1\}^{n\times n}$ with $\mathbf{A}_{ij}=1$ if an observed edge exists between node $v_{i}$ and $v_{j}$ and $\mathbf{A}_{ij}=0$ otherwise, which serves for the computational graph for message-passing. The diagonal matrix of node degree is notated as $\mathbf{D}\in\mathbb{Z}^{n\times n}$ with the degree of node $v_{i}$ being $d_{i}=\mathbf{D}_{ii}=\sum_{j=1}^{n}\mathbf{A}_{ij}$ . The neighborhood node set of node $v_{i}$ is denoted as $\mathcal{N}_{i}$ . In edge classification, given the edge class label matrix $\mathbf{Y}\in\{0,1\}^{n\times n\times C}$ with $C$ being the total number of edge classes and denote $\{\mathcal{E}_{k}\}_{k=1}^{C}$ as the set of labeled edges in class $k$ , we aim to train an edge classifier based on edges in the training set $\mathcal{E}^{\text{Tr}}$ to correctly classify edges in the validation/test set $\mathcal{E}^{\text{Val/Test}}$ across all the classes.

3. Topological Imbalance and Topological Entropy

In this section, we first empirically demonstrate that considering the edge imbalance issue merely from the quantitative perspective may result in inferior performance in edge classification. This observation motivates the investigate of edge-level imbalance from the topology perspective. Inspired by prior works that have shown varying local subgraph patterns of nodes cause varying classification performance(Mao et al., 2023; Zhu et al., 2021; Chen et al., 2021; Song et al., 2022), we propose a new type of imbalance in edge classification, topological imbalance, which is defined as the imbalanced local subgraph patterns around each edge that can be quantified by local class distribution variance. Hence we develop a metric, Topological Entropy (TE), to quantify the local class distribution variance, and further verify its relationship with performance discrepancy. Based on the observations, we develop Topological Reweight to emphasize edges with higher TE, and empirical experiments show the effectiveness of our proposed method.

3.1. Motivation

To mitigate the imbalance issue in edge classification, the most direct approach is to implement strategies to mitigate quantity imbalance issues. Hence, we first integrate one of the well-established quantity imbalance techniques, quantity reweight ( $R_{q}$ ) (Huang et al., 2016), with a well-known Graph Neural Network (GNN) architecture, the Graph Convolutional Network (GCN) (Kipf and Welling, 2016), and then conduct edge classification across three datasets. More specifically, for edge $e_{ij}\in\mathcal{E}^{Tr}$ that belongs to class $k$ (i.e., $\mathbf{Y}_{ijk}=1$ ), the quantity reweight $w_{ij}^{q}$ that is applied to it during the training phase to emphasize the minority class is defined as follows:

(1)

w_{ij}^{q}=\frac{|\mathcal{E}^{\text{Tr}}|}{|\{\mathcal{E}_{k}\}_{k=1}^{C}|}

where $|\mathcal{E}^{\text{Tr}}|$ is the number of training edges, and $|\{\mathcal{E}_{k}\}_{k=1}^{C}|$ is the number of labeled edges in class $k$ .

As shown in Figure 1(a) and 1(b), the overall Macro-F1 decreases after equipping quantity reweight strategy to backbones, indicating that this strategy alone which merely addresses the global class distribution is insufficient to achieve consistent performance improvement across all classes. This observation compels a thorough examination of the performance disparity among edges through a local lens. To undertake this detailed investigation, as edges are participated by their end nodes, we propose categorizing edges according to the local class distribution (LCD) of their end nodes.

The categorization process begins with the identification of the edge’s end nodes and the computation of the ratio of majority edges within their immediate neighborhoods. Utilizing a predefined threshold, the process categorized each node as mostly majority (M), mostly minority (m), or uncertain (U) based on the corresponding majority ratio. Subsequently, the edge’s category is a combination of its end nodes’ categories, giving a concise representation of the edge’s local class context in the graph. For example, for edge $e_{ij}$ , if $v_{i}$ ’s category is M, $v_{j}$ category is m, then edge $e_{ij}$ ’s category is set as Mm. This nuanced approach effectively captures the edge’s local topological context and allows us to analyze the graph’s structure in detail, revealing insights into the underlying connectivity patterns. To simplify, the algorithm is designed only for binary classification.

Utilizing the categorization framework delineated earlier, we conduct an in-depth analysis to assess the classification performance across different edge categories, examining both the original GCN model and its extension with $R_{q}$ . As illustrated in Figure 2, the incorporation of $R_{q}$ predominantly benefits the minority class, albeit at the expense of diminishing the performance within the majority class for most categories.

More specifically, for the majority class, the most insignificant performance decrease is observed in MM, where the backbone model’s performance is nearly perfect, thus applying $R_{q}$ for edges in this category will negligibly impact overall efficacy. However, in categories characterized by substantial variation in local class distribution (MU, Mm, UU, Um) — with the exception of UU, which comprises a minor quantity of majority edges (counts $\approx 10$ ) — we observe a pronounced decline in performance. Intriguingly, for category mm, the application of $R_{q}$ is conducive to performance enhancement. For the minority class, the most significant improvement is gained when the end nodes’ categories are both $m$ , and then the improvement is decreased across categories MU, Mm, UU, Um, where the end nodes’ local distribution varies a lot. In mm, the $R_{q}$ can no longer improve its performance since the model has reached its latent capacity and performs best.

Based on the above, to strategically design a method on top of $R_{q}$ to increase the overall edge classification performance, we need to further emphasize the edges with large local class distribution variance. This leads to the next section where we design a novel metric to quantify the level of edges’ local class distribution variance.

3.2. Topological Entropy (TE)

Inspired by the recent work in imbalanced node classification (Chen et al., 2021; Song et al., 2022), the local distribution variance can be characterized using the label diffusion method. As label diffusion is based on the receptive field in the computational graph, it is a better approximation of the GNNs message passing scheme, hence the local distribution variance characterized by label diffusion will provide more insight than characterized purely based on label distribution in end nodes’ one-hop neighborhood.

For the metric to quantify the local distribution variance, first we obtain label encodings of each node $v_{i}$ from its incident edges $\{e_{ij}|v_{j}\in\mathcal{N}_{i}\}$ by:

(2)

\mathbf{h}_{i}=\mathbb{E}_{v_{j}\sim\mathcal{N}_{i}}{\mathbf{Y}_{ij}},

where $\mathbf{h}_{i}\in\mathbb{R}^{C}$ denotes the probability distribution of the node $v_{i}$ ’s association to the edge classes according to their incident edges. To consider higher-order neighborhood impact, we obtain the $L^{\text{th}}$ -layer label encodings $\mathbf{H}^{L}$ by message-passing:

(3)

\mathbf{H}^{L}=\widetilde{\mathbf{A}}\mathbf{H}^{L-1},

where $\widetilde{\mathbf{A}}=\mathbf{D}^{-1}\mathbf{A}$ is the normalized adjacency matrix and specifically, $\mathbf{H}^{0}=\mathbf{H}$ , coming from Eq. (2). Since we have the label distribution of $L$ -hop subgraphs of the ending nodes $v_{i},v_{j}$ of each edge $e_{ij}$ , we then can directly average $\mathbf{H}^{L}_{i},\mathbf{H}^{L}_{j}$ to obtain the label distribution $\mathbf{G}_{ij}^{L}\in\mathbb{R}^{C}$ of the local subgraph around that edge $e_{ij}$ .

(4)

\mathbf{G}_{ij}^{L}=\frac{1}{2}(\mathbf{h}^{L}_{i}+\mathbf{h}^{L}_{j}),

where $\mathbf{G}_{ijk}^{L}$ indicates the proportion of the $L$ -hop local subgraph around the edge $e_{ij}$ belonging to the class $k$ . In this way, if the end nodes of an edge have identical and certain class distributions (i.e., MM and mm), the distribution will be highly skewed, and the uncertainty will be small. For the edge with large class distribution variance (i.e., edges in MU, Mm, UU, and Um), the distribution will be more uniform, and the uncertainty will be large.

Hence, after we have a well-established metric $\mathbf{G}_{ij}^{L}$ to quantify the local topology pattern around each edge, we can further quantify its entropy to measure its class uncertainty, defined as:

Definition 0.

Topological Entropy (TE): The uncertainty of the class distribution of the $L^{\text{th}}$ -layer local subgraph around each edge $e_{ij}$ is calculated as:

(5)

\text{TE}_{ij}^{L}=-\sum_{k=1}^{C}\textbf{G}_{ijk}^{L}\log(\textbf{G}_{ijk}^{L})

In this way, the edges with large class distribution variance would have a higher Topological Entropy (TE) value.

To demonstrate the accuracy of Topological Entropy (TE) in reflecting local class distribution variance, we undertook an empirical study on the Epinions dataset. The experiment aimed to confirm that edges within categories marked by low local distribution variance—specifically MM and mm from Figure 2—would exhibit lower TE values. As shown in Figure 3, majority edges are easier to train correctly than minority edges, which aligns with the patterns in Figure 2. Figure 3 also reveals that majority edges with lower TE values (i.e., TE $\leq 0.28$ ) are indeed easy to train correctly, affirming the relatively high validation F1 scores for majority edges in MM and mm as observed in Figure 2. Conversely, minority edges with low TE struggled in training, mirroring their lower validation F1 scores in Figure 2. Additionally, the distribution trends of majority and minority edges across varying TE values further corresponded with the patterns observed (the distribution of majority edges skewed towards lower TE values, and the distribution of minority edges is rather uniform) in Figure 2, confirming the effectiveness of TE in capturing the anticipated variance. These findings affirm the methodological soundness of TE, laying a solid foundation for subsequent methodological development.

4. Topological Reweight

Building upon Section 3.1, we expect to enhance the training weights of labeled edges that have larger TE values, thereby enabling those edges that are more representative of their corresponding classes and hence lead to better performance for unlabeled edges.

For any edge $e_{ij}\in\mathcal{E}^{Tr}$ , we design a TE reweight method to assign edge weight $w^{t}_{ij}$ as:

(6)

w^{t,L}_{ij}=exp(\frac{\text{TE}^{L}_{ij}}{t}),

where $t$ is the temperature controlling the sharpness of the weight distribution and as $t\rightarrow 0$ , the weight distribution becomes more uniform and hence it approaches the unweighted version. Then the training loss $L$ for edge classification equipped with TE reweight is:

(7)

\mathcal{L}^{\text{TE}}=\frac{1}{|\mathcal{E}^{Tr}|}\sum_{e_{ij}\in\mathcal{E}% ^{Tr}}{w_{ij}^{t,L}\mathcal{L}^{\text{CE}}(\mathbf{E}_{ij},\mathbf{Y}_{ij})}

where $\mathbf{E}_{ij}$ is the embedding obtained from any GML model (e.g., a GNN) followed by a classifier and having $\mathcal{L}^{\text{CE}}$ denote the cross entropy loss. This is then the loss function of TE reweight ( $R_{t}$ ).

Based on the insight in Section 3.1, we adaptively integrate TE reweight with the previously introduced quantity reweight method into a novel Topological Reweight ( $R_{tq}$ ) strategy to include both the global class distribution and local class distribution variance into consideration. The final weight can be computed as:

(8)

w_{ij}=\theta w_{ij}^{t,L}+(1-\theta)w_{ij}^{q}

where $\theta$ is a hyperparameter that controls the trade-off in the contribution between topological reweight and quantitative reweight. Correspondingly the training loss becomes:

(9)

\mathcal{L}^{\text{T}}=\frac{1}{|\mathcal{E}^{Tr}|}\sum_{e_{ij}\in\mathcal{E}^% {Tr}}w_{ij}\mathcal{L}^{\text{CE}}(\mathbf{E}_{ij},\mathbf{Y}_{ij})

To validate the effectiveness of the proposed methods, we empirically compare the edge classification performance of quantitative reweight ( $R_{q}$ ), TE reweight ( $R_{t}$ ), and their combined version Topological reweight ( $R_{tq}$ ). First, we visualize the classification performance for majority and minority edges across different categories in Epinions dataset. As shown in Figure 4(a) and Figure 4(b), $R_{tq}$ can indeed improve the performance for edges in MU, Mm, UU, and Um, which aligns with our intuition to improve performance on edges with high local distribution variance.

To empirically measure the classification performance of different methods across all the classes, we follow other imbalance classification work (Chen et al., 2021; Liu et al., 2023; Sun et al., 2022; Ma et al., 2023) and adopt balanced accuracy (average accuracy across different classes) and Macro-F1 (average F1 score across different classes) as the performance metric. As shown in Table 1, both $R_{t}$ and $R_{tq}$ consistently yield enhancements in terms of balanced accuracy and Macro-F1 scores. This outcome empirically substantiates the efficacy of the proposed reweight method against $R_{q}$ where only performance improvement in minority classes can be observed. The fact that applying topology reweight that addresses the local structure disparity performs better than quantity reweight suggests that local class distribution is more important for performance improvement, which is reasonable because in GNN message passing nodes and edges only receive messages from the local subgraph. Also, combining quantity reweight with topology reweight adaptively can take advantage of the information in both global class distribution and local class distribution and mitigate both quantity and topological imbalance issues, hence it depicts the optimal performance.

5. Mixup based on TE

In Section 4, we present the Topological Reweight ( $R_{tq}$ ) mechanism to emphasize the edges according to their TEs. However, merely incorporating topological reweight may not address the limited supervision provided by few minority edges, and thus there is no guaranteed performance improvement (Zhai et al., 2022; Byrd and Lipton, 2019; Sagawa et al., 2019; Gulrajani and Lopez-Paz, 2020; Koh et al., 2021). Moreover, reweighting methods can lead to overfitting (Sagawa et al., 2019), with test performance potentially degrading over numerous epochs without intervention. One method that may address these issues and further enhance the generalizability of our proposed method is mixup, as it generates synthetic samples to change the original training distribution and provides sufficient supervision. Currently, there are some mixup methods in GML tasks (Wang et al., 2021c; Kim et al., 2023; Han et al., 2022; Ma et al., 2024; Li et al., 2022), but none of them are designed to mitigate topological imbalance issues in edge classification. Therefore, to improve the generalizability and effectiveness of our proposed method in topological imbalance edge classification tasks, we designed a TE wedge-based mixup method, where we selected the edges with high entropy and its high entropy neighborhood edges to form a “high TE wedge”, then we generate new sample node by mixup between end nodes of wedges. The overview of the TE wedge-based mixup is shown in Figure 5.

Table 1. Balanced accuracy (B_acc) and Macro-F1 (Macro_F1) across three datasets using the GCN backbone model showing performance of

R_{q}

R_{t}

, and

R_{tq}

Dataset	Metric	GCN	GCN $+{R_{q}}$	GCN $+{R_{t}}$	GCN $+{R_{tq}}$
Bitcoin-alpha	B_acc	0.797 $\pm$ 0.005	0.857 $\pm$ 0.020	0.818 $\pm$ 0.004	0.834 $\pm$ 0.008
Bitcoin-alpha	Macro_F1	0.840 $\pm$ 0.005	0.790 $\pm$ 0.005	0.846 $\pm$ 0.007	0.852 $\pm$ 0.003
Epinions	B_acc	0.731 $\pm$ 0.008	0.768 $\pm$ 0.007	0.756 $\pm$ 0.008	0.769 $\pm$ 0.001
Epinions	Macro_F1	0.745 $\pm$ 0.002	0.720 $\pm$ 0.001	0.756 $\pm$ 0.000	0.753 $\pm$ 0.001
Reddit	B_acc	0.715 $\pm$ 0.003	0.779 $\pm$ 0.012	0.727 $\pm$ 0.035	0.762 $\pm$ 0.006
Reddit	Macro_F1	0.733 $\pm$ 0.000	0.707 $\pm$ 0.002	0.737 $\pm$ 0.007	0.740 $\pm$ 0.000

5.1. Motivation

To circumvent the potential pitfalls of the reweighting method that we stated above, and to accentuate edges with elevated entropy, our strategy mandates refinement beyond mere Topological reweight. A corpus of discussions, encapsulated in (Wiles et al., 2021; Sagawa et al., 2021; Zhai et al., 2022), advocate for the incorporation of additional training samples via data augmentation as a viable enhancement to topological reweighting. Wang et al.(Wang et al., 2021b) further contend that modifications to the loss function can potentially amplify the efficacy of our proposed approach. Given these considerations, we posit:

Can we develop strategies that combine these elements and incorporate our understanding of topological imbalance in edge classification?

Our exploration led us to the random mixup technique. At its core, random mixup formulate synthetic samples via linear interpolation between two randomly selected distinct samples. The central tenet postulates that linear interpolations between feature vectors should logically precipitate linear interpolations of their respective targets (Zhang et al., 2017). However, direct adoptions of this technique between any two random samples present a problem for edge classification since random edge interpolations lack semantic coherence (Zhang et al., 2017), and one way to solve that is to perform mixup in wedges. However, randomly selecting wedges to perform mixup cannot address the observations made in the previous section, as we want to emphasize the high TE edges while preserving semantic coherence. Therefore, we extend our wedge-based mixup to develop a novel Topological wedge-based mixup and formally present it in the next section.

5.2. TE Wedge-based Mixup

In this section, we delineate the underlying intuition and technical intricacies of the wedge-based mixup we propose. We commence by elucidating the rationale and driving forces behind the adoption of the topological wedge-based mixup in edge classification. Subsequently, the mechanics of this approach are elaborated upon. Our methodology hinges on two primary modules. The High TE Wedge Selection Module is employed to curate high TE wedges. Within these selected wedges, a Wedge-mixup Module facilitates node mixup, engendering synthetic nodes and edges and integrating these edges with topological weight, culminating in the final loss function. This evolved loss function supplants the original variant presented in Eq. (9), paving the way for the training of the edge classifier. Below we present the details of each module.

5.2.1. High TE Wedge Selection Module:

We first define the concepts and notions about wedge:

Definition 0.

A wedge $\mathcal{W}_{i,c,j}$ is a set comprising two distinct undirected edges $e_{ci}$ and $e_{cj}$ intersecting at a singular shared node $v_{c}$ , which is referred to as the “centric node”, the node $i$ and node $j$ are called “end nodes:”

(10)

\small\mathcal{W}_{i,c,j}=\{e_{ci},e_{cj}|e_{ci},e_{cj}\in\mathcal{E},v_{i},v_% {c},v_{j}\in\mathcal{V},i\neq j\}

To address the observation we made in the previous section and emphasize edges with high TE, we begin by determining the TE values for edges within the training set. Then we select $K$ number of training edges based on their TE values. Specifically, edges with the highest TE are most likely to be selected. This methodology gives rise to a subset of selected edges, denoted as $\mathcal{E}_{1}\subseteq\mathcal{E}_{Tr}$ , and its associated edge label set $\mathbf{Y}_{1}$ . For each edge $e_{ci}$ within $\mathcal{E}_{1}$ , we discern an edge $e_{cj}$ that is incident with it. In cases where multiple incident edges are present, preference is accorded to those with elevated TE values. Consequently, we identify the incident edge set $\mathcal{E}_{2}\subseteq\mathcal{E}_{Tr}$ , its pertinent edge labels $\mathbf{Y}_{2}$ , and high TE wedge set $\mathcal{W}$ .

5.2.2. Wedge-Mixup Module:

With the aid of the GNN encoder, node embeddings, denoted as $Z$ , are procured. Then for every wedge $\mathcal{W}_{i,c,j}$ , we can generate a new synthetic node $v_{s}$ between node $v_{i}$ and $v_{j}$ with embedding as:

(11)

\small\mathbf{Z}_{v_{s}}=\lambda\mathbf{Z}_{v_{i}}+(1-\lambda)\mathbf{Z}_{v_{j}}

where $\lambda\sim Beta(\alpha,\alpha)$ , and $\alpha$ is a hyperparameter that controls the Beta distribution. In this paper, we set $\alpha$ to be $4.0$ . Then we can get the new synthetic edge $e_{cs}$ ’s edge embedding as $\mathbf{E}^{{}^{\prime}}_{e_{cs}}=\mathbf{Z}_{v_{c}}\|\mathbf{Z}_{v_{s}}$ and the synthetic edge feature embedding can be generated as $\mathbf{S}_{e_{cs}}=\lambda\mathbf{S}_{e_{ci}}+(1-\lambda)\mathbf{S}_{e_{cj}}$ . Then the final embedding for the synthetic edge is defined as $\mathbf{E}_{e_{cs}}=\mathbf{E}^{{}^{\prime}}_{e_{cs}}\|\mathbf{S}_{e_{cs}}$ . Let the synthetic edge set to be $\mathcal{E}_{3}$ . At this point, we can send $\mathbf{E}_{\mathcal{E}_{3}}$ into MLP classifier $f$ to get the outcome $f({\mathbf{E}_{\mathcal{E}_{3}}})$ that can be used for the following-up module. However, directly using this in practice will bring trouble as the golden label can not be linearly interpolated. Since the objective of the TE wedge-based mixup loss is to minimize the differences between synthetic edge prediction outcome and synthetic edge label, and the synthetic edges are generated through linear interpolation between edges in $\mathcal{E}_{1}$ and $\mathcal{E}_{2}$ . For implementation, as suggested in (Zhang et al., 2017), the loss can be computed by linearly combining the loss of edges in $\mathbf{Y}_{1}$ and $\mathbf{Y}_{2}$ . Thus, the loss function for our TE wedge-based mixup is:

(12)

\mathcal{L}^{\text{X}}=\lambda\mathcal{L}^{\text{CE}}(f(\mathbf{E}_{\mathcal{E% }_{3}}),\mathbf{Y}_{1})+(1-\lambda)\mathcal{L}^{\text{CE}}(f(\mathbf{E}_{% \mathcal{E}_{3}},\mathbf{Y}_{2})

where $\mathcal{L}^{\text{CE}}$ is the cross entropy loss. Since incorporating a linear combination of Eq. (12) with original loss into the final loss function deviates from the traditional ERM paradigm, which is congruent with our earlier discourse and is anticipated to bolster the efficiency of the methodology, we can get the final loss function as:

(13)

\mathcal{L}=\mathcal{L}^{\text{Original}}+h\mathcal{L}^{\text{X}}

where $h$ balances the TE wedge-based mixup and original losses.

Table 2. Classification performance of backbones before and after applying random edge-based mixup (

X_{e}

), TE edge-based mixup (

X_{te}

), random wedge-based mixup (

X_{w}

), and TE wedge-based mixup (

X_{tw}

) strategies.

Dataset	Bitcoin-alpha		Epinions		Reddit
Metric	B_acc	Macro_F1	B_acc	Macro_F1	B_acc	Macro_F1
GCN	0.797 $\pm$ 0.005	0.840 $\pm$ 0.005	0.731 $\pm$ 0.008	0.745 $\pm$ 0.002	0.715 $\pm$ 0.003	0.733 $\pm$ 0.000
GCN+ $X_{e}$	0.813 $\pm$ 0.004	0.851 $\pm$ 0.001	0.753 $\pm$ 0.008	0.744 $\pm$ 0.001	0.719 $\pm$ 0.006	0.732 $\pm$ 0.001
GCN+ $X_{te}$	0.818 $\pm$ 0.005	0.851 $\pm$ 0.006	0.744 $\pm$ 0.004	0.748 $\pm$ 0.002	0.720 $\pm$ 0.016	0.732 $\pm$ 0.004
GCN+ $X_{w}$	0.811 $\pm$ 0.005	0.850 $\pm$ 0.002	0.751 $\pm$ 0.004	0.742 $\pm$ 0.002	0.710 $\pm$ 0.011	0.730 $\pm$ 0.005
GCN+ $X_{tw}$	0.828 $\pm$ 0.010	0.852 $\pm$ 0.008	0.754 $\pm$ 0.007	0.749 $\pm$ 0.002	0.728 $\pm$ 0.016	0.737 $\pm$ 0.002

5.3. Empirical Analysis

In this section, we provide an empirical analysis of the effectiveness of TE wedge-based mixups ( $X_{tw}$ ), which do not incorporate a topological reweight term in the final loss function, enabling a fair comparison. We contrast this method with random edge-based mixups ( $X_{e}$ ), TE edge-based mixups ( $X_{te}$ ) where edges with the highest TE scores are preferentially selected for mixups, and standard wedge-based mixups ( $X_{w}$ ).

The results presented in Table 2 underscore the superior performance of our TE wedge-based mixup strategy, which consistently outperforms alternative methods in both balanced accuracy and Macro-F1. This confirms the hypothesis that prioritizing high TE edges can effectively address the issue of topological imbalance.

Moreover, the observation that TE edge-based mixups surpass random edge-based mixups, yet fall short of TE wedge-based mixups, reinforces the value of our approach. While emphasizing high TE edges alone does yield performance improvements, the integration of these edges with a wedge-based strategy — which also tackles semantic coherence — proves to be more advantageous. This affirms the validity of our method and aligns with our initial objectives, demonstrating the effectiveness of combining high TE edge prioritization with random wedge-based mixup techniques.

6. Overall TopoEdge Framework

To further boost the performance, we can adaptively integrate the proposed topological reweight with TE wedge-based mixup as:

(14)

\mathcal{L}^{\text{tX}}=\lambda w_{\mathcal{E}_{3}}\mathcal{L}^{\text{CE}}(f(% \mathbf{E}_{\mathcal{E}_{3}}),\mathbf{Y}_{1})+(1-\lambda)w_{\mathcal{E}_{3}}% \mathcal{L}^{\text{CE}}(f(\mathbf{E}_{\mathcal{E}_{3}},\mathbf{Y}_{2})

Note that the weight for the synthetic edges will be the linear integration between the weights of TE wedges’ end edges. Then similarly we can linearly combine Eq. (14) with Eq. (9) to get the final loss function as:

(15)

\mathcal{L}=\mathcal{L}^{\text{T}}+h\mathcal{L}^{\text{tX}}

Where $h$ is a hyperparameter that scales the topological reweight loss and topological wedge-based mixup loss. This is our final topological imbalance strategy for edge classification: TopoEdge. The pseudo-code of TopoEdge is shown in Algorithm 1, and the overall framework of TopoEdge is shown in Figure 6.

Algorithm 1 Overall TopoEdge Framework

1:Input Edge embedding

\mathbf{E}

, training edges

\mathcal{E}^{Tr}

, training edge labels

\mathbf{Y}

,
classifier

f

, hyperparameters

K,\alpha,t,\theta,h

2:Get the quantity weight from Eq. (1)

3:Get the TE weights for edges in

\mathcal{E}^{Tr}

from Eq. (5) and (6)

4:Get the topological weight from Eq. (8)

5:Get the topological training loss

\mathcal{L}^{\text{T}}

from Eq. (9)

\lambda\sim Beta(\alpha,\alpha)

7:High TE Wedge Selection Module:

•

Select $K$ edges based on TE value to get $\mathcal{E}_{1}$ , $\mathbf{Y}_{1}$
•

Get incident edge set $\mathcal{E}_{2}$ , $\mathbf{Y}_{2}$ , based on TE and $\mathcal{E}_{1}$ ,
to form high TE wedge set $\mathcal{W}$

8:Wedge-Mixup Module:

•

Get synthetic loss $\mathcal{L}^{\text{tX}}$ from Eq. (14)

9:Wedge-loss Module:

•

Get final loss $\mathcal{L}$ from Eq. (15)

10:return

L

7. Experiments

In this section, we provide a comprehensive study of the effectiveness of our methods on real-world edge classification tasks.

7.1. Datasets

In this section, we present the experimental datasets employed in this study. As highlighted earlier, it is evident that issues related to topology imbalance can be observed in edge classification across various applications. Thus, we collect and formulate six benchmark imbalanced real-world graph datasets with relevant features. Below are brief descriptions of each dataset, and the basic statistics of the datasets are provided in Table 3.

•

Bitcoin-alpha (Derr et al., 2017) is a cryptocurrency transaction trust network, where a node represents the user and a directed edge represents ratings (how much one user trusts/distrusts another user). The edge features are text embeddings of user comments. Note that ratings ranging -10 (total distrust) to +10 (total trust) can be discretized into different edge types. In binary classification, trust (ratings 1 to 10) is the majority class ( $93.8\%$ ), while distrust (ratings -10 to -1) is the minority ( $6.3\%$ ). For multi-class classification, weak trust ( $90.3\%$ ), strong trust ( $3.5\%$ ), weak distrust ( $3.3\%$ ), and strong distrust ( $3.0\%$ ) are considered.

•

HSPPI is a collection of homo sapiens protein-protein interaction networks based on String database (Szklarczyk et al., 2023). Nodes represent proteins, and node features are protein sequences (acquired from EBI protein API (Nightingale et al., 2017)) that reflect the protein amino acids sequence information. The two edge types are indirect functional interactions ( $76.3\%$ ) and direct physical interactions ( $23.7\%$ ).

Table 3. Basic statistics of the imbalanced benchmark datasets. The 4 largest frequency ratios in each dataset are represented as

L_{0}

through

L_{3}

Datasets	# Nodes	# Edges	$L_{0}$	$L_{1}$	$L_{2}$	$L_{3}$	# Labels
Bitcoin-alpha	3,784	24,207	90.3%	3.5%	3.3%	3.0%	2 or 4
HSPPI	17,895	1,008,006	76.1%	23.9%	-	-	2
NID	109,802	431,372	62.0%	14.7%	14.3%	3.0%	9
Epinions	81,350	763,538	85.4%	14.6%	-	-	2
Reddit	67,180	901,056	90.5%	9.5%	-	-	2
MAG	40,000	141,725	58.6%	14.2%	13.7%	7.3%	10

•

NID is formulated from NetFlow-based IoT network dataset NF-ToN-IoT (Sarhan et al., 2021), where node represents user’s IP address, and edge represents interactions between users. There are $9$ classes in total, and $62.0\%$ of the interactions are injection, $14.7\%$ are DDos, $14.3\%$ are benign (normal unmalicious flows), and $3.2\%$ are scanning. Edge features are network flow statistics (flow duration time, flow bytes, packets, etc) between users.
•

Epinions (Guha et al., 2004) is a product review website where nodes represent each user and directed edges represent the trust/ distrust relationship from one user to another. Each user has reviews for different products. During the data filtering processing, we first remove the users who never give reviews, and then we concatenate the text embeddings of the source node with the target node to form the edge feature. We eventually have $85.4\%$ trust and $14.6\%$ distrust edges.
•

Reddit (Kumar et al., 2018) is a hyperlink network with communities as nodes and posts between them as edges. The edge labels describe the sentiments of the posts in source communities towards the posts in target communities, where $90.5\%$ are neutral or positive, and $9.5\%$ sentiments are negative. The edge features are text property vectors between the communities.
•

MAG (Sinha et al., 2015) is a citation network where nodes and edges represent the scholars and papers coauthored by scholars, respectively. Edge labels correspond to the field of study for the paper. Edge features are bag-of-words of paper abstracts. In this paper, we adopt the setting in (Wang et al., 2023a), and the majority class ratio is $58.6\%$ .

Table 4. Binary classification results on four real-world datasets indicate that our proposed method can achieve steady improvement in terms of both balanced accuracy and Macro-F1 compared with backbone GNNs, and applying TopoEdge can achieve comparable performance as SOTA baselines.

	Bitcoin-alpha		HSPPI		Epinions		Reddit
	B_acc	Macro_F1	B_acc	Macro_F1	B_acc	Macro_F1	B_acc	Macro_F1
ExtWF	0.500 $\pm$ 0.000	0.484 $\pm$ 0.000	0.497 $\pm$ 0.000	0.432 $\pm$ 0.000	0.500 $\pm$ 0.000	0.476 $\pm$ 0.000	0.552 $\pm$ 0.000	0.570 $\pm$ 0.000
TER+AER (Geometric)	0.871 $\pm$ 0.013	0.870 $\pm$ 0.007	0.598 $\pm$ 0.043	0.514 $\pm$ 0.063	0.764 $\pm$ 0.003	0.783 $\pm$ 0.002	0.789 $\pm$ 0.002	0.778 $\pm$ 0.002
TER+AER (Possion)	0.868 $\pm$ 0.015	0.869 $\pm$ 0.004	0.603 $\pm$ 0.037	0.520 $\pm$ 0.051	0.768 $\pm$ 0.003	0.783 $\pm$ 0.002	0.786 $\pm$ 0.006	0.773 $\pm$ 0.002
GCN w/o TopoEdge	0.797 $\pm$ 0.005	0.840 $\pm$ 0.005	0.655 $\pm$ 0.003	0.667 $\pm$ 0.001	0.731 $\pm$ 0.008	0.745 $\pm$ 0.002	0.715 $\pm$ 0.003	0.733 $\pm$ 0.000
GCN w/ TopoEdge	0.845 $\pm$ 0.004	0.858 $\pm$ 0.004	0.688 $\pm$ 0.009	0.695 $\pm$ 0.004	0.759 $\pm$ 0.002	0.756 $\pm$ 0.003	0.763 $\pm$ 0.001	0.745 $\pm$ 0.003
GAT w/o TopoEdge	0.819 $\pm$ 0.010	0.840 $\pm$ 0.003	0.612 $\pm$ 0.030	0.623 $\pm$ 0.035	0.742 $\pm$ 0.007	0.751 $\pm$ 0.005	0.702 $\pm$ 0.018	0.716 $\pm$ 0.005
GAT w/ TopoEdge	0.861 $\pm$ 0.014	0.858 $\pm$ 0.002	0.641 $\pm$ 0.021	0.646 $\pm$ 0.022	0.744 $\pm$ 0.007	0.753 $\pm$ 0.005	0.758 $\pm$ 0.008	0.739 $\pm$ 0.002
ChebNet w/o TopoEdge	0.803 $\pm$ 0.002	0.830 $\pm$ 0.002	0.530 $\pm$ 0.010	0.507 $\pm$ 0.026	0.751 $\pm$ 0.009	0.771 $\pm$ 0.002	0.707 $\pm$ 0.010	0.728 $\pm$ 0.006
ChebNet w/ TopoEdge	0.865 $\pm$ 0.006	0.865 $\pm$ 0.006	0.695 $\pm$ 0.002	0.693 $\pm$ 0.005	0.782 $\pm$ 0.009	0.783 $\pm$ 0.001	0.758 $\pm$ 0.008	0.746 $\pm$ 0.003

Table 5. Multi-class classification results suggest that applying our proposed TopoEdge can achieve steady improvement (for both balanced accuracy and Macro-F1) with various GNN backbones over SOTA baselines.

	Bitcoin-alpha		NID		MAG
	B_acc	Macro_F1	B_acc	Macro_F1	B_acc	Macro_F1
ExtWF	0.250 $\pm$ 0.000	0.237 $\pm$ 0.000	0.238 $\pm$ 0.000	0.246 $\pm$ 0.000	0.197 $\pm$ 0.000	0.175 $\pm$ 0.000
TER+AER (Geometric)	0.348 $\pm$ 0.060	0.329 $\pm$ 0.057	0.270 $\pm$ 0.071	0.167 $\pm$ 0.063	0.541 $\pm$ 0.031	0.647 $\pm$ 0.050
TER+AER (Possion)	0.381 $\pm$ 0.100	0.352 $\pm$ 0.100	0.316 $\pm$ 0.006	0.242 $\pm$ 0.009	0.534 $\pm$ 0.026	0.627 $\pm$ 0.039
GCN w/o TopoEdge	0.521 $\pm$ 0.009	0.559 $\pm$ 0.013	0.490 $\pm$ 0.005	0.501 $\pm$ 0.008	0.570 $\pm$ 0.003	0.606 $\pm$ 0.007
GCN w/ TopoEdge	0.558 $\pm$ 0.005	0.559 $\pm$ 0.005	0.516 $\pm$ 0.025	0.503 $\pm$ 0.004	0.619 $\pm$ 0.009	0.623 $\pm$ 0.009
GAT w/o TopoEdge	0.517 $\pm$ 0.012	0.558 $\pm$ 0.011	0.479 $\pm$ 0.010	0.462 $\pm$ 0.014	0.601 $\pm$ 0.010	0.640 $\pm$ 0.008
GAT w/ TopoEdge	0.556 $\pm$ 0.001	0.583 $\pm$ 0.001	0.528 $\pm$ 0.030	0.490 $\pm$ 0.010	0.678 $\pm$ 0.007	0.652 $\pm$ 0.004
ChebNet w/o TopoEdge	0.514 $\pm$ 0.004	0.547 $\pm$ 0.007	0.532 $\pm$ 0.019	0.510 $\pm$ 0.007	0.624 $\pm$ 0.008	0.658 $\pm$ 0.004
ChebNet w/ TopoEdge	0.542 $\pm$ 0.009	0.567 $\pm$ 0.010	0.556 $\pm$ 0.014	0.534 $\pm$ 0.006	0.690 $\pm$ 0.006	0.698 $\pm$ 0.001

7.2. Experiment Setting

Our training, validation, and testing sets, were set to 1:2:2 for all datasets. ExtWF (Aggarwal et al., 2016) and TER+AER (Wang et al., 2023a) are selected as baselines. ExtWF is a classic heuristic method that performs edge classification based on the dominant labels in similar edge sets, and TER+AER is a newly proposed edge classification method that is based on both high-order proximities in local subgraphs and edge attributes. GCN (Kipf and Welling, 2016), GAT (Veličković et al., 2017), and ChebNet (Tang et al., 2019) are selected as the backbones. We follow the setting in (Huang et al., 2021), where we first use these methods to get node representations. For edge $e_{ij}$ , we concatenate these two learned end node representations $z_{i}$ and $z_{j}$ to compose an edge representation $z_{ij}$ . After that, we train an MLP classifier on the training set and use it to predict the edge label in the testing set. As previous empirical studies do, balanced accuracy and Macro-F1 are selected as the evaluation metrics, which gives insight into the model’s performance over different classes. The temperature hyperparameter $t$ is finely tuned within the range $(0,5]$ . The parameter $K$ is adjusted from $0$ up to the size of the training data, while both $\theta$ and $h$ are tuned within the interval $(0,1)$ .

7.3. Classification Performance Evaluation

The classification results are shown in Table 4 and Table 5. Overall the proposed TopoEdge depicts significant improvement against backbones. Also, applying our strategy on backbones that are not explicitly designed for learning edge representation can achieve a similar level of performance with the most advanced baseline (TER+AER) in binary classification tasks and significantly beat it in multi-class classification. This suggests the effectiveness of our proposed strategies rather than empathizing edge with high distribution variance can mitigate the topological imbalance issues. The balanced accuracy and Macro-F1 score in binary classification is higher than that in multi-class classification, which is because the edges are influenced by more classes during message passing and the aggregated message is of relatively low quality. This again shows the need to address topological imbalance in edge classification. ExtWF performs poorly across datasets because it is based on the dominant edge class in similar edge sets, edges will always tend to be predicted as the majority of edges in a highly imbalanced dataset. TER+AER can depict comparable performance in binary classification as it involves generating edge embedding based on higher-order proximity in local subgraphs, which shows that leveraging local structural patterns can indeed boost performance. However, as this method does not consider the influence of quantity and topological imbalance, hence when uncertainty in the local subgraph arises by switching to multi-class datasets, its performance degrades a lot. This also suggests that compared with focusing on local proximity, emphasizing the variance will help increase the edge classification performance more on the imbalanced dataset.

7.4. Ablation Study

In this section, we detail the ablation studies conducted to evaluate the impact of different components of the TopoEdge framework on edge classification tasks. The results, as illustrated in Figure 7, employing the full TopoEdge framework significantly enhances classification performance compared to using these components separately or the baseline models alone in both tasks. More specifically, the integration of both components in TopoEdge yields the most substantial improvements, demonstrating a synergistic effect that optimizes classification outcomes. Also, both Topological Reweight and TE Wedge-Based Mixup independently improve performance over baseline models, confirming their efficacy. These findings underscore the effectiveness of each component and highlight the benefits of their integration for imbalanced edge classification.

7.5. Influence of Labeled Data Ratio

Generally speaking, the labeling ratios in graph datasets are low, so in this section, we give a further analysis of the classification performance of our proposed method under various labeled training ratios. As shown in Figure 8, our proposed strategies can steadily beat the baseline even when low-labeled training data is available, and TopoEdge (+ $RX$ ) depicts the best performance under an extremely low labeled ratio. This indicates that capturing local distribution variance works even when scare data is available. as scare data may still depict large variance and we can take advantage of that to get better edge representation.

8. Related Work

In this section, we present a summary of the related work in edge classification and imbalanced graph machine learning, which are the two closest directions to the presented work.

Edge Classification: The problem of edge classification in networks was formulated by Aggarwal et al. (Aggarwal et al., 2016) in 2016, where the authors address the edge classification tasks by proposing a structural similarity model using the weighted Jaccard coefficient as the underlying structural proximity metric. In recent years, most work in the field of edge classification has focused on generating embeddings for edges. One direction uses shallow embedding techniques. For example, AttrE2vec (Bielak et al., 2022) aggregates features through random walk, Edge2vec (Wang et al., 2020b) uses deep auto-encoders, and TER+AER (Wang et al., 2023a) generate edge embeddings by encoding local topology structure information while augmenting edge attributes through a feature aggregation scheme. Another direction lies in GNNs, where the most commonly seen approach is to use the end nodes’ embeddings and form edge embeddings via operations such as averaging, Hadamard product, concatenation, or deep neural network (Kipf and Welling, 2016; Veličković et al., 2017; Tang et al., 2019). Converting the original graphs into line graphs and then using GNNs to generate edge representation has also been studied (Wang et al., 2020a). What’s more, even though methods like EGNN (Kim et al., 2019) and EGAT (Gong and Cheng, 2019) are used for node classification, they can generate embeddings for edges directly, without first converting edges into nodes. However, none of these above methods focused on mitigating the topological imbalance in edge classification.

Imbalanced Graph Machine Learning: Imbalanced graph machine learning is an emerging research area addressing imbalance issues in graph data. In recent years, several methods have been proposed (Chen et al., 2021; Liu et al., 2021; Park et al., 2021; Zhao et al., 2021; Liu et al., 2023; Wang et al., 2022; Sun et al., 2022) in this field. One of the early works GraphSMOTE (Zhao et al., 2021) adopts SMOTE (Chawla et al., 2002) oversampling in the node embedding space to synthesize minority nodes and complements the topology with a learnable edge predictor. In the field of topology imbalance, ReNode (Chen et al., 2021) first addresses the topology imbalance issue, i.e., the unequal structure role of labeled nodes in the topology, by down-weighting close-to-boundary labeled nodes. PASTEL (Sun et al., 2022) addresses the under-reaching and over-squashing issue in node classification through structural learning. TOBA (Liu et al., 2023) approaches the source of the class-imbalance bias from a topology-centric perspective. However, none of the existing research focused on an edge-centric perspective. To the best of our knowledge, there are currently no extensive studies related to topology imbalance issues in edge classification. In this paper, we approach the topology imbalance issue in edge classification using a reweighting and resampling method, and consequently future boost the current benchmark performance on edge classification.

9. Conclusion

In this paper, we addressed the previously overlooked issue of topological imbalance in edge classification within graph machine learning (GML). We found that this imbalance significantly hinders model performance across various graphs. To measure these imbalances, we introduced a new metric, Topological Entropy (TE), which quantifies class distribution variance in local subgraphs. Building on this, we developed a topological reweighting approach to adjust the training weights of labeled edges based on their topological properties. Additionally, we introduced a topological wedge-based mixup strategy that generates synthetic training edges through interpolation among high-TE edges. These strategies are integrated into our novel TopoEdge approach for edge classification. Our comprehensive empirical analyses confirm the effectiveness of TopoEdge in various classification tasks. Acknowledging the importance of topological imbalance, we anticipate further dedicated research in this direction. Given the widespread nature of topological imbalances, exploring resolutions across more applications is a promising direction for future research.

References

(1)
Aggarwal et al. (2016) Charu Aggarwal, Gewen He, and Peixiang Zhao. 2016. Edge classification in networks. In 2016 IEEE 32nd ICDE. IEEE.
Ai et al. (2022) Baole Ai, Zhou Qin, Wenting Shen, and Yong Li. 2022. Structure enhanced graph neural networks for link prediction. arXiv preprint arXiv:2201.05293 (2022).
Bielak et al. (2022) Piotr Bielak, Tomasz Kajdanowicz, and Nitesh V Chawla. 2022. Attre2vec: Unsupervised attributed edge representation learning. Information Sciences 592 (2022).
Byrd and Lipton (2019) Jonathon Byrd and Zachary Lipton. 2019. What is the effect of importance weighting in deep learning?. In ICML. PMLR.
Chawla et al. (2002) Nitesh V Chawla, Kevin W Bowyer, Lawrence O Hall, and W Philip Kegelmeyer. 2002. SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research 16 (2002).
Chen et al. (2021) Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, and Xu Sun. 2021. Topology-imbalance learning for semi-supervised node classification. NeurIPS 34 (2021).
Derr et al. (2017) Tyler Derr, Chenxing Wang, Suhang Wang, and Jiliang Tang. 2017. Signed node relevance measurements. arXiv preprint arXiv:1710.07236 (2017).
Errica et al. (2019) Federico Errica, Marco Podda, Davide Bacciu, and Alessio Micheli. 2019. A fair comparison of graph neural networks for graph classification. arXiv preprint arXiv:1912.09893 (2019).
Gong and Cheng (2019) Liyu Gong and Qiang Cheng. 2019. Exploiting edge features for graph neural networks. In CVPR.
Guha et al. (2004) Ramanthan Guha, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. 2004. Propagation of trust and distrust. In Proceedings of the 13th international conference on World Wide Web. 403–412.
Gulrajani and Lopez-Paz (2020) Ishaan Gulrajani and David Lopez-Paz. 2020. In search of lost domain generalization. arXiv preprint arXiv:2007.01434 (2020).
Guo et al. (2008) Xinjian Guo, Yilong Yin, Cailing Dong, Gongping Yang, and Guangtong Zhou. 2008. On the class imbalance problem. In 2008 Fourth international conference on natural computation, Vol. 4. IEEE.
Han et al. (2022) Xiaotian Han, Zhimeng Jiang, Ninghao Liu, and Xia Hu. 2022. G-mixup: Graph data augmentation for graph classification. In ICML. PMLR.
Huang et al. (2016) Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. 2016. Learning deep representation for imbalanced classification. In CVPR.
Huang et al. (2021) Junjie Huang, Huawei Shen, Liang Hou, and Xueqi Cheng. 2021. SDGNN: Learning Node Representation for Signed Directed Networks. In AAAI, Vol. 35.
Jha et al. (2022) Kanchan Jha, Sriparna Saha, and Hiteshi Singh. 2022. Prediction of protein–protein interaction using graph neural networks. Scientific Reports 12, 1 (2022).
Kim et al. (2019) Jongmin Kim, Taesup Kim, Sungwoong Kim, and Chang D Yoo. 2019. Edge-labeling graph neural network for few-shot learning. In CVPR.
Kim et al. (2023) Junghurn Kim, Sukwon Yun, and Chanyoung Park. 2023. S-Mixup: Structural Mixup for Graph Neural Networks. In CIKM 2023.
Kipf and Welling (2016) Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
Koh et al. (2021) Pang Wei Koh, Shiori Sagawa, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, et al. 2021. Wilds: A benchmark of in-the-wild distribution shifts. In ICML. PMLR.
Kumar et al. (2018) Srijan Kumar, William L Hamilton, Jure Leskovec, and Dan Jurafsky. 2018. Community interaction and conflict on the web. In WWW 2018.
Kwak et al. (2010) Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media?. In WWW 2010.
Li et al. (2022) Jiaxing Li, Ke Zhang, Xinyan Pu, and Youyong Kong. 2022. Graph Attention Mixup Transformer for Graph Classification. In NeurIPS. Springer.
Liu et al. (2021) Yang Liu, Xiang Ao, Zidi Qin, Jianfeng Chi, Jinghua Feng, Hao Yang, and Qing He. 2021. Pick and choose: a GNN-based imbalanced learning approach for fraud detection. In WWW 2021.
Liu et al. (2023) Zhining Liu, Zhichen Zeng, Ruizhong Qiu, Hyunsik Yoo, David Zhou, Zhe Xu, Yada Zhu, Kommy Weldemariam, Jingrui He, and Hanghang Tong. 2023. Topological Augmentation for Class-Imbalanced Node Classification. arXiv preprint arXiv:2308.14181 (2023).
Ma et al. (2024) Xinyu Ma, Xu Chu, Yasha Wang, Yang Lin, Junfeng Zhao, Liantao Ma, and Wenwu Zhu. 2024. Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications. NeurIPS 36 (2024).
Ma et al. (2023) Yihong Ma, Yijun Tian, Nuno Moniz, and Nitesh V Chawla. 2023. Class-Imbalanced Learning on Graphs: A Survey. arXiv preprint arXiv:2304.04300 (2023).
Mao et al. (2023) Haitao Mao, Zhikai Chen, Wei Jin, Haoyu Han, Yao Ma, Tong Zhao, Neil Shah, and Jiliang Tang. 2023. Demystifying Structural Disparity in Graph Neural Networks: Can One Size Fit All? arXiv preprint arXiv:2306.01323 (2023).
Nightingale et al. (2017) Andrew Nightingale, Ricardo Antunes, Emanuele Alpi, Borisas Bursteinas, Leonardo Gonzales, Wudong Liu, Jie Luo, Guoying Qi, Edd Turner, and Maria Martin. 2017. The Proteins API: accessing key integrated protein and genome information. Nucleic acids research 45, W1 (2017).
Pandey et al. (2019) Babita Pandey, Praveen Kumar Bhanodia, Aditya Khamparia, and Devendra Kumar Pandey. 2019. A comprehensive survey of edge prediction in social networks: Techniques, parameters and challenges. Expert Systems with Applications 124 (2019).
Park et al. (2021) Joonhyung Park, Jaeyun Song, and Eunho Yang. 2021. Graphens: Neighbor-aware ego network synthesis for class-imbalanced node classification. In ICLR.
Sagawa et al. (2019) Shiori Sagawa, Pang Wei Koh, Tatsunori B Hashimoto, and Percy Liang. 2019. Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization. arXiv preprint arXiv:1911.08731 (2019).
Sagawa et al. (2021) Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, et al. 2021. Extending the WILDS benchmark for unsupervised adaptation. arXiv preprint arXiv:2112.05090 (2021).
Sarhan et al. (2021) Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann. 2021. Netflow datasets for machine learning-based network intrusion detection systems. In Big Data Technologies and Applications: 10th EAI International Conference, BDTA 2020, and 13th EAI International Conference on Wireless Internet, WiCON 2020, Virtual Event, December 11, 2020, Proceedings 10. Springer.
Sinha et al. (2015) Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June Hsu, and Kuansan Wang. 2015. An overview of microsoft academic service (mas) and applications. In WWW 2015.
Song et al. (2022) Jaeyun Song, Joonhyung Park, and Eunho Yang. 2022. TAM: topology-aware margin loss for class-imbalanced node classification. In ICML. PMLR.
Sun et al. (2022) Qingyun Sun, Jianxin Li, Haonan Yuan, Xingcheng Fu, Hao Peng, Cheng Ji, Qian Li, and Philip S Yu. 2022. Position-aware structure learning for graph topology-imbalance by relieving under-reaching and over-squashing. In CIKM.
Szklarczyk et al. (2023) Damian Szklarczyk, Rebecca Kirsch, Mikaela Koutrouli, et al. 2023. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic acids research 51, D1 (2023).
Tang et al. (2019) Shanshan Tang, Bo Li, and Haijun Yu. 2019. ChebNet: Efficient and stable constructions of deep neural networks with rectified power units using chebyshev approximations. arXiv preprint arXiv:1911.05467 (2019).
Veličković et al. (2017) Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
Wang et al. (2020b) Changping Wang, Chaokun Wang, Zheng Wang, Xiaojun Ye, and Philip S Yu. 2020b. Edge2vec: Edge-based social network embedding. TKDD 14, 4 (2020).
Wang et al. (2023a) Hewen Wang, Renchi Yang, Keke Huang, and Xiaokui Xiao. 2023a. Efficient and Effective Edge-wise Graph Representation Learning. In KDD 2023.
Wang et al. (2021b) Ke Alexander Wang, Niladri S Chatterji, Saminul Haque, and Tatsunori Hashimoto. 2021b. Is importance weighting incompatible with interpolating classifiers? arXiv preprint arXiv:2112.12986 (2021).
Wang et al. (2020a) Pengyang Wang, Jiaping Gui, Zhengzhang Chen, Junghwan Rhee, Haifeng Chen, and Yanjie Fu. 2020a. A generic edge-empowered graph convolutional network via node-edge mutual enhancement. In WWW 2020.
Wang et al. (2021a) Yu Wang, Charu Aggarwal, and Tyler Derr. 2021a. Distance-wise prototypical graph neural network in node imbalance classification. arXiv preprint arXiv:2110.12035 (2021).
Wang et al. (2021c) Yiwei Wang, Wei Wang, Yuxuan Liang, Yujun Cai, and Bryan Hooi. 2021c. Mixup for node and graph classification. In WWW 2021.
Wang et al. (2023b) Yu Wang, Tong Zhao, Yuying Zhao, Yunchao Liu, Xueqi Cheng, Neil Shah, and Tyler Derr. 2023b. A Topological Perspective on Demystifying GNN-Based Link Prediction Performance. In ICLR.
Wang et al. (2022) Yu Wang, Yuying Zhao, Neil Shah, and Tyler Derr. 2022. Imbalanced graph classification via graph-of-graph neural networks. In CIKM 2022.
Wiles et al. (2021) Olivia Wiles, Sven Gowal, Florian Stimberg, Sylvestre Alvise-Rebuffi, Ira Ktena, Krishnamurthy Dvijotham, and Taylan Cemgil. 2021. A fine-grained analysis on distribution shift. arXiv preprint arXiv:2110.11328 (2021).
Wu et al. (2021) Bang Wu, Xiangwen Yang, Shirui Pan, and Xingliang Yuan. 2021. Adapting membership inference attacks to GNN for graph classification: Approaches and implications. In ICDM 2021. IEEE.
Yu et al. (2020) Zeping Yu, Rui Cao, Qiyi Tang, Sen Nie, Junzhou Huang, and Shi Wu. 2020. Order matters: Semantic-aware neural networks for binary code similarity detection. In AAAI, Vol. 34.
Zhai et al. (2022) Runtian Zhai, Chen Dan, Zico Kolter, and Pradeep Ravikumar. 2022. Understanding why generalized reweighting does not improve over erm. arXiv preprint arXiv:2201.12293 (2022).
Zhang et al. (2017) Hongyi Zhang, Moustapha Cisse, Yann N Dauphin, and David Lopez-Paz. 2017. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017).
Zhang and Chen (2018) Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neural networks. NeurIPS 31 (2018).
Zhao et al. (2021) Tianxiang Zhao, Xiang Zhang, and Suhang Wang. 2021. Graphsmote: Imbalanced node classification on graphs with graph neural networks. In WSDM 2021.
Zhu et al. (2021) Jiong Zhu, Ryan A Rossi, Anup Rao, Tung Mai, Nedim Lipka, Nesreen K Ahmed, and Danai Koutra. 2021. Graph neural networks with heterophily. In AAAI.

Edge Classification on Graphs: New Directions in Topological Imbalance

Abstract.

1. Introduction

2. Notations and Preliminary

3. Topological Imbalance and Topological Entropy

3.1. Motivation

3.2. Topological Entropy (TE)

Definition 0.

4. Topological Reweight

5. Mixup based on TE

5.1. Motivation

5.2. TE Wedge-based Mixup

5.2.1. High TE Wedge Selection Module:

Definition 0.

5.2.2. Wedge-Mixup Module:

5.3. Empirical Analysis

6. Overall TopoEdge Framework

7. Experiments

7.1. Datasets

7.2. Experiment Setting

7.3. Classification Performance Evaluation

7.4. Ablation Study

7.5. Influence of Labeled Data Ratio

8. Related Work

9. Conclusion

References

Edge Classification on Graphs:
New Directions in Topological Imbalance