arXiv:2405.19598 [pdf, other]

Evaluating the Effectiveness and Robustness of Visual Similarity-based Phishing Detection Models

Authors: Fujiao Ji, Kiho Lee, Hyungjoon Koo, Wenhao You, Euijin Choo, Hyoungshick Kim, Doowon Kim

Abstract: Phishing attacks pose a significant threat to Internet users, with cybercriminals elaborately replicating the visual appearance of legitimate websites to deceive victims. Visual similarity-based detection systems have emerged as an effective countermeasure, but their effectiveness and robustness in real-world scenarios have been unexplored. In this paper, we comprehensively scrutinize and evaluate state-of-the-art visual similarity-based anti-phishing models using a large-scale dataset of 450K real-world phishing websites. Our analysis reveals that while certain models maintain high accuracy, others exhibit notably lower performance than results on curated datasets, highlighting the importance of real-world evaluation. In addition, we observe the real-world tactic of manipulating visual components that phishing attackers employ to circumvent the detection systems. To assess the resilience of existing models against adversarial attacks and robustness, we apply visible and perturbation-based manipulations to website logos, which adversaries typically target. We then evaluate the models' robustness in handling these adversarial samples. Our findings reveal vulnerabilities in several models, emphasizing the need for more robust visual similarity techniques capable of withstanding sophisticated evasion attempts. We provide actionable insights for enhancing the security of phishing defense systems, encouraging proactive actions.
To the best of our knowledge, this work represents the first large-scale, systematic evaluation of visual similarity-based models for phishing detection in real-world settings, necessitating the development of more effective and robust defenses.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 12 pages

arXiv:2404.17099 [pdf, other]

Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND

Authors: Qiyu Kang, Kai Zhao, Qinxu Ding, Feng Ji, Xuhao Li, Wenfei Liang, Yang Song, Wee Peng Tay

Abstract: We introduce the FRactional-Order graph Neural Dynamical network (FROND), a new continuous graph neural network (GNN) framework. Unlike traditional continuous GNNs that rely on integer-order differential equations, FROND employs the Caputo fractional derivative to leverage the non-local properties of fractional calculus. This approach enables the capture of long-term dependencies in feature updates, moving beyond the Markovian update mechanisms in conventional integer-order models and offering enhanced capabilities in graph representation learning. We offer an interpretation of the node feature updating process in FROND from a non-Markovian random walk perspective when the feature updating is particularly governed by a diffusion process. We demonstrate analytically that oversmoothing can be mitigated in this setting. Experimentally, we validate the FROND framework by comparing the fractional adaptations of various established integer-order continuous GNNs, demonstrating their consistently improved performance and underscoring the framework's potential as an effective extension to enhance traditional continuous GNNs. The code is available at https://github.com/zknus/ICLR2024-FROND.

Submitted 25 April, 2024; originally announced April 2024.

Comments: The Twelfth International Conference on Learning Representations
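
Note on the FROND abstract above: it refers to the Caputo fractional derivative and its non-local character. For reference, the textbook definition for an order between 0 and 1 is sketched below. The symbols $\beta$, $x$, and $\tau$ are notation chosen here for illustration and may differ from the paper's own.

```latex
% Caputo fractional derivative of order beta, 0 < beta < 1 (standard definition)
D_t^{\beta} x(t) \;=\; \frac{1}{\Gamma(1-\beta)} \int_0^t \frac{x'(\tau)}{(t-\tau)^{\beta}} \, d\tau ,
\qquad 0 < \beta < 1 .
```

The integral runs over the whole history $[0, t]$, so the value at time $t$ depends on all earlier states rather than only the current one. This is the memory effect the abstract contrasts with Markovian, integer-order updates. For sufficiently smooth $x$, the expression tends to the ordinary derivative $x'(t)$ as $\beta \to 1$.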

arXiv:2403.11081 [pdf, other]

Enhanced Index Modulation Aided Non-Orthogonal Multiple Access via Constellation Rotation

Authors: Ronglan Huang, Fei Ji, Zeng Hu, Dehuan Wan, Pengcheng Xu, Yun Liu

Abstract: Non-orthogonal multiple access (NOMA) has been widely nominated as an emerging spectral efficiency (SE) multiple access technique for the next generation of wireless communication networks. To meet the growing demands for massive connectivity and high data volumes, a novel index modulation aided NOMA with rotation of the signal constellation of low-power users (IM-NOMA-RC) is developed for downlink transmission. In the proposed IM-NOMA-RC system, the users are classified into a far-user group and a near-user group according to their channel conditions, and the rotation-constellation-based IM operation is performed only on users in the near-user group, which are allocated lower power than the far users, to transmit extra information. In the proposed IM-NOMA-RC, all the subcarriers are activated to transmit information to multiple users to achieve higher SE. With the aid of the multiple-dimension modulation in IM-NOMA-RC, more users can be supported over an orthogonal resource block. Both the maximum likelihood (ML) detector and the successive interference cancellation (SIC) detector are then studied for all users. Numerical simulation results of the proposed IM-NOMA-RC scheme are investigated for the ML detector and the SIC detector for each user, showing that the proposed scheme can outperform conventional NOMA.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2402.00689 [pdf, other]

Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Authors: Ran Elgedawy, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, Scott Ruoti

Abstract: Large Language Models (LLMs) are being increasingly utilized in various applications, with code generation being a notable example. While previous research has shown that LLMs have the capability to generate both secure and insecure code, the literature does not take into account what factors help generate secure and effective code. Therefore, in this paper, we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs (GPT-3.5 and GPT-4 using ChatGPT, and Bard and Gemini from Google) using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks at work. Additionally, we place an emphasis on security awareness, which is represented through the use of two distinct versions of our developer persona. In total, we collected 61 code outputs and analyzed them across several aspects: functionality, security, performance, complexity, and reliability. These insights are crucial for understanding the models' capabilities and limitations, guiding future development and practical applications in the field of automated code generation.

Submitted 1 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 figures

arXiv:2401.04358 [pdf, ps, other]

Message-Passing Receiver for OCDM over Multi-Lag Multi-Doppler Channels

Authors: Yun Liu, Fei Ji, Miaowen Wen, Hua Qing

Abstract: As a new candidate waveform for the next generation wireless communications, orthogonal chirp division multiplexing (OCDM) has attracted growing attention for its ability to achieve full diversity in uncoded transmission, and its robustness to narrow-band interference or impulsive noise. Under high-mobility channels with multiple lags and multiple Doppler shifts (MLMD), the signal suffers doubly selective (DS) fading in the time and frequency domains, and data symbols modulated on orthogonal chirps interfere with each other. To address the problem of symbol detection of OCDM over MLMD channels, under the assumption that path attenuation factors, delays, and Doppler shifts of the channel are available, we first derive the closed-form channel matrix in the Fresnel domain, and then propose a low-complexity method to approximate it as a sparse matrix. Based on the approximated Fresnel-domain channel, we propose a message passing (MP) based detector to estimate the transmit symbols iteratively. Finally, under two MLMD channels (an underspread channel for terrestrial vehicular communication, and an overspread channel for narrow-band underwater acoustic communications), Monte Carlo simulation results and analysis are provided to validate its advantages as a promising detector for OCDM.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 15 pages, 10 figures

ACM Class: B.4.1

arXiv:2310.16401 [pdf, other]

Graph Neural Networks with a Distribution of Parametrized Graphs

Authors: See Hian Lee, Feng Ji, Kelin Xia, Wee Peng Tay

Abstract: Traditionally, graph neural networks have been trained using a single observed graph. However, the observed graph represents only one possible realization. In many applications, the graph may encounter uncertainties, such as having erroneous or missing edges, as well as edge weights that provide little informative value. To address these challenges and capture additional information previously absent in the observed graph, we introduce latent variables to parameterize and generate multiple graphs. We obtain the maximum likelihood estimate of the network parameters in an Expectation-Maximization (EM) framework based on the multiple graphs. Specifically, we iteratively determine the distribution of the graphs using a Markov Chain Monte Carlo (MCMC) method, incorporating the principles of PAC-Bayesian theory. Numerical experiments demonstrate improvements in performance against baseline models on node classification for heterogeneous graphs and graph regression on chemistry datasets.

Submitted 2 February, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

arXiv:2310.07141 [pdf, ps, other]

Time and Frequency Offset Estimation and Intercarrier Interference Cancellation for AFDM Systems

Authors: Yuankun Tang, Anjie Zhang, Miaowen Wen, Yu Huang, Fei Ji, Jinming Wen

Abstract: Affine frequency division multiplexing (AFDM) is an emerging multicarrier waveform that offers a potential solution for achieving reliable communications over time-varying channels. This paper proposes two maximum-likelihood (ML) estimators of symbol time offset and carrier frequency offset for AFDM systems. One is called joint ML estimator, which evaluates the arrival time and carrier frequency offset by comparing the correlations of samples. Moreover, we propose the other so-called stepwise ML estimator to reduce the complexity. Both proposed estimators exploit the redundant information contained within the chirp-periodic prefix inherent in AFDM symbols, thus dispensing with any additional pilots. To further mitigate the intercarrier interference resulting from the residual frequency offset, we design a mirror-mapping-based scheme for AFDM systems. Numerical results verify the effectiveness of the proposed time and carrier frequency offset estimation criteria and the mirror-mapping-based modulation for AFDM systems.

Submitted 28 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: accepted by IEEE Wireless Communications and Networking Conference (WCNC) 2024

arXiv:2309.07169 [pdf, other]

Spectral Convergence of Complexon Shift Operators

Authors: Purui Zhang, Xingchao Jian, Feng Ji, Wee Peng Tay, Bihan Wen

Abstract: Topological Signal Processing (TSP) utilizes simplicial complexes to model structures with higher order than vertices and edges. In this paper, we study the transferability of TSP via a generalized higher-order version of graphon, known as complexon. We recall the notion of a complexon as the limit of a simplicial complex sequence [1]. Inspired by the graphon shift operator and message-passing neural network, we construct a marginal complexon and complexon shift operator (CSO) according to components of all possible dimensions from the complexon. We investigate the CSO's eigenvalues and eigenvectors and relate them to a new family of weighted adjacency matrices. We prove that when a simplicial complex signal sequence converges to a complexon signal, the eigenvalues, eigenspaces, and Fourier transform of the corresponding CSOs converge to those of the limit complexon signal. This conclusion is further verified by two numerical experiments. These results hint at learning transferability on large simplicial complexes or simplicial complex sequences, which generalize the graphon signal processing framework.

Submitted 5 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 9 pages, 2 figures

arXiv:2309.05260 [pdf, other]

Generalized Graphon Process: Convergence of Graph Frequencies in Stretched Cut Distance

Authors: Xingchao Jian, Feng Ji, Wee Peng Tay

Abstract: Graphons have traditionally served as limit objects for dense graph sequences, with the cut distance serving as the metric for convergence. However, sparse graph sequences converge to the trivial graphon under the conventional definition of cut distance, which makes this framework inadequate for many practical applications. In this paper, we utilize the concepts of generalized graphons and stretched cut distance to describe the convergence of sparse graph sequences. Specifically, we consider a random graph process generated from a generalized graphon. This random graph process converges to the generalized graphon in stretched cut distance. We use this random graph process to model the growing sparse graph, and prove the convergence of the adjacency matrices' eigenvalues. We supplement our findings with experimental validation. Our results indicate the possibility of transfer learning between sparse graphs.

Submitted 11 September, 2023; originally announced September 2023.

arXiv:2308.09259 [pdf, other]

FRGNN: Mitigating the Impact of Distribution Shift on Graph Neural Networks via Test-Time Feature Reconstruction

Authors: Rui Ding, Jielong Yang, Feng Ji, Xionghu Zhong, Linbo Xie

Abstract: Due to inappropriate sample selection and limited training data, a distribution shift often exists between the training and test sets. This shift can adversely affect the test performance of Graph Neural Networks (GNNs). Existing approaches mitigate this issue by either enhancing the robustness of GNNs to distribution shift or reducing the shift itself. However, both approaches necessitate retraining the model, which becomes infeasible when the model structure and parameters are inaccessible. To address this challenge, we propose FRGNN, a general framework for GNNs to conduct feature reconstruction. FRGNN constructs a mapping relationship between the output and input of a well-trained GNN to obtain class representative embeddings and then uses these embeddings to reconstruct the features of labeled nodes. These reconstructed features are then incorporated into the message passing mechanism of GNNs to influence the predictions of unlabeled nodes at test time. Notably, the reconstructed node features can be directly utilized for testing the well-trained model, effectively reducing the distribution shift and leading to improved test performance. This is attained without any modifications to the model structure or parameters. We provide theoretical guarantees for the effectiveness of our framework. Furthermore, we conduct comprehensive experiments on various public datasets. The experimental results demonstrate the superior performance of FRGNN in comparison to multiple categories of baseline methods.

Submitted 13 October, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

arXiv:2306.15421 [pdf, ps, other]

Mutual Information Rate of Gaussian and Truncated Gaussian Inputs on Intensity-Driven Signal Transduction Channels

Authors: Xuan Chen, Fei Ji, Miaowen Wen, Yu Huang, Andrew W. Eckford

Abstract: In this letter, we investigate the mutual information rate (MIR) achieved by an independent identically distributed (IID) Gaussian input on the intensity-driven signal transduction channel. Specifically, the asymptotic expression of the continuous-time MIR is given. Next, aiming at low computational complexity, we also deduce an approximate numerical solution for this MIR. Moreover, the corresponding lower and upper bounds that can be used to find the capacity-achieving input distribution parameters are derived in closed form. Finally, simulation results show the accuracy of our analysis.

Submitted 27 June, 2023; originally announced June 2023.

Comments: Accepted for publication in IEEE Communications Letters

arXiv:2306.12042 [pdf, ps, other]

Block-Wise Index Modulation and Receiver Design for High-Mobility OTFS Communications

Authors: Mi Qian, Fei Ji, Yao Ge, Miaowen Wen, Xiang Cheng, H. Vincent Poor

Abstract: As a promising technique for high-mobility wireless communications, orthogonal time frequency space (OTFS) has been proved to enjoy excellent advantages with respect to traditional orthogonal frequency division multiplexing (OFDM). Although multiple studies have considered index modulation (IM) based OTFS (IM-OTFS) schemes to further improve system performance, a challenging and open problem is the development of effective IM schemes and efficient receivers for practical OTFS systems that must operate in the presence of channel delays and Doppler shifts. In this paper, we propose two novel block-wise IM schemes for OTFS systems, named delay-IM with OTFS (DeIM-OTFS) and Doppler-IM with OTFS (DoIM-OTFS), where a block of delay/Doppler resource bins is activated simultaneously. Based on a maximum likelihood (ML) detector, we analyze upper bounds on the average bit error rates for the proposed DeIM-OTFS and DoIM-OTFS schemes, and verify their performance advantages over the existing IM-OTFS systems. We also develop a multi-layer joint symbol and activation pattern detection (MLJSAPD) algorithm and a customized message passing detection (CMPD) algorithm for our proposed DeIM-OTFS and DoIM-OTFS systems with low complexity. Simulation results demonstrate that our proposed MLJSAPD and CMPD algorithms can achieve desired performance with robustness to the imperfect channel state information (CSI).

Submitted 21 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2210.13454

arXiv:2305.07928 [pdf, other]

AMTSS: An Adaptive Multi-Teacher Single-Student Knowledge Distillation Framework For Multilingual Language Inference

Authors: Qianglong Chen, Feng Ji, Feng-Lin Li, Guohai Xu, Ming Yan, Ji Zhang, Yin Zhang

Abstract: Knowledge distillation is of key importance to launching multilingual pre-trained language models for real applications. To support cost-effective language inference in multilingual settings, we propose AMTSS, an adaptive multi-teacher single-student distillation framework, which allows distilling knowledge from multiple teachers to a single student. We first introduce an adaptive learning strategy and teacher importance weight, which enables a student to effectively learn from max-margin teachers and easily adapt to new languages. Moreover, we present a shared student encoder with different projection layers in support of multiple languages, which contributes to largely reducing development and machine cost. Experimental results show that AMTSS gains competitive results on the public XNLI dataset and the realistic industrial dataset AliExpress (AE) in the E-commerce scenario.

Submitted 13 May, 2023; originally announced May 2023.

arXiv:2305.06899 [pdf, other]

Generalized signals on simplicial complexes

Authors: Feng Ji, Xingchao Jian, Wee Peng Tay, Maosheng Yang

Abstract: Topological signal processing (TSP) over simplicial complexes typically assumes observations associated with the simplicial complexes are real scalars. In this paper, we develop TSP theories for the case where observations belong to general abelian groups, including function spaces that are commonly used to represent time-varying signals. Our approach generalizes the Hodge decomposition and allows for signal processing tasks to be performed on these more complex observations. We propose a unified and flexible framework for TSP that expands its applicability to a wider range of signal processing applications. Numerical results demonstrate the effectiveness of this approach and provide a foundation for future research in this area.

Submitted 11 November, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

arXiv:2305.00139 [pdf, other]

Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks

Authors: Feng Ji, See Hian Lee, Hanyang Meng, Kai Zhao, Jielong Yang, Wee Peng Tay

Abstract: In node classification using graph neural networks (GNNs), a typical model generates logits for different class labels at each node. A softmax layer often outputs a label prediction based on the largest logit. We demonstrate that it is possible to infer hidden graph structural information from the dataset using these logits. We introduce the key notion of label non-uniformity, which is derived from the Wasserstein distance between the softmax distribution of the logits and the uniform distribution. We demonstrate that nodes with small label non-uniformity are harder to classify correctly. We theoretically analyze how the label non-uniformity varies across the graph, which provides insights into boosting the model performance: increasing training samples with high non-uniformity or dropping edges to reduce the maximal cut size of the node set of small non-uniformity. These mechanisms can be easily added to a base GNN model. Experimental results demonstrate that our approach improves the performance of many benchmark base models.

Submitted 28 April, 2023; originally announced May 2023.

arXiv:2304.03507 [pdf, other]

Distributional Signals for Node Classification in Graph Neural Networks

Authors: Feng Ji, See Hian Lee, Kai Zhao, Wee Peng Tay, Jielong Yang

Abstract: In graph neural networks (GNNs), both node features and labels are examples of graph signals, a key notion in graph signal processing (GSP). While it is common in GSP to impose signal smoothness constraints in learning and estimation tasks, it is unclear how this can be done for discrete node labels. We bridge this gap by introducing the concept of distributional graph signals. In our framework, we work with the distributions of node labels instead of their values and propose notions of smoothness and non-uniformity of such distributional graph signals. We then propose a general regularization method for GNNs that allows us to encode distributional smoothness and non-uniformity of the model output in semi-supervised node classification tasks. Numerical experiments demonstrate that our method can significantly improve the performance of most base GNN models in different problem settings.

Submitted 7 April, 2023; originally announced April 2023.

arXiv:2303.01724 [pdf, other]

Node-Specific Space Selection via Localized Geometric Hyperbolicity in Graph Neural Networks

Authors: See Hian Lee, Feng Ji, Wee Peng Tay

Abstract: Many graph neural networks have been developed to learn graph representations in either Euclidean or hyperbolic space, with all nodes' representations embedded in a single space. However, a graph can have hyperbolic and Euclidean geometries at different regions of the graph. Thus, it is sub-optimal to indiscriminately embed an entire graph into a single space. In this paper, we explore and analyze two notions of local hyperbolicity, describing the underlying local geometry: geometric (Gromov) and model-based, to determine the preferred space of embedding for each node. The two hyperbolicities' distributions are aligned using the Wasserstein metric such that the calculated geometric hyperbolicity guides the choice of the learned model hyperbolicity. As such, our model, the Joint Space Graph Neural Network (JSGNN), can leverage both Euclidean and hyperbolic spaces during learning by allowing node-specific geometry space selection. We evaluate our model on both node classification and link prediction tasks and observe promising performance compared to baseline models.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2210.14500 [pdf, other]

Impact and analysis of space-time coupling on slotted MAC in UANs

Authors: Yan Wang, Quansheng Guan, Fei Ji, Weiqi Chen

Abstract: The propagation delay is non-negligible in underwater acoustic networks (UANs) since the propagation speed is five orders of magnitude smaller than the speed of light. In this case, space and time factors are strongly coupled to determine the collisions of packet transmissions. To this end, this paper analyzes the impact of space-time coupling on slotted medium access control (MAC). We find that both inter-slot and intra-slot collisions may exist, and the inter-slot collision may span multiple slots. The sending-slot-dependent interference region can be an annulus inside the whole transmission range. It is pointed out that there exist collision-free regions when a guard interval larger than a packet duration is used in the slot setting. In this sense, the long slot brings spatial reuse in a transmission range. However, we further find that the successful transmission probabilities and throughput are the same for the slot lengths of one packet duration and two packet durations. Simulation results show that the maximum successful transmission probability and throughput can be achieved by a guard interval less than a packet duration, which is much shorter than the existing slot setting in typical Slotted-ALOHA. Simulations also show that the spatial impact is greater for vertical transmission than for horizontal transmission due to the longer vertical transmission range in three-dimensional UANs.

Submitted 6 August, 2023; v1 submitted 26 October, 2022; originally announced October 2022.
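
The "five orders of magnitude" comparison in the abstract above can be checked with a short calculation. This is only an illustrative sketch: the abstract gives no numbers, so the nominal sound speed of 1500 m/s in seawater and the 1 km link distance are assumptions, and the function names are hypothetical.

```python
# Assumed nominal values; the abstract itself does not state them.
SOUND_SPEED_WATER_M_S = 1500.0   # typical textbook figure for seawater
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light in vacuum


def propagation_delay_s(distance_m: float, speed_m_s: float) -> float:
    """Return the time in seconds for a signal to travel distance_m."""
    return distance_m / speed_m_s


def speed_ratio() -> float:
    """Return how many times faster light is than sound in water."""
    return LIGHT_SPEED_M_S / SOUND_SPEED_WATER_M_S


if __name__ == "__main__":
    link_m = 1000.0  # hypothetical 1 km link
    print("acoustic delay (s):", propagation_delay_s(link_m, SOUND_SPEED_WATER_M_S))
    print("light-speed delay (s):", propagation_delay_s(link_m, LIGHT_SPEED_M_S))
    print("speed ratio:", speed_ratio())
```

Under these assumptions the acoustic delay over 1 km is a large fraction of a second, which is why the delay can be comparable to a packet duration and why slot timing and node position cannot be treated separately.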

arXiv:2210.13454 [pdf, ps, other]

A Novel Block-Wise Index Modulation Scheme for High-Mobility OTFS Communications

Authors: Mi Qian, Yao Ge, Miaowen Wen, Fei Ji

Abstract: As a promising technique for high-mobility wireless communications, orthogonal time frequency space (OTFS) has been shown to offer clear advantages over traditional orthogonal frequency division multiplexing (OFDM). However, a challenging problem is to design efficient systems to further improve the performance. In this paper, we propose a novel block-wise index modulation (IM) scheme for OTFS systems, named Doppler-IM with OTFS (DoIM-OTFS), where a block of Doppler resource bins is activated simultaneously. For practical implementation, we develop a low-complexity customized message passing (CMP) algorithm for our proposed DoIM-OTFS scheme. Simulation results demonstrate that our proposed DoIM-OTFS system outperforms the traditional OTFS system without IM. The proposed CMP algorithm can achieve the desired performance and robustness to imperfect channel state information (CSI).

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2207.11761 [pdf, other]

SGAT: Simplicial Graph Attention Network

Authors: See Hian Lee, Feng Ji, Wee Peng Tay

Abstract: Heterogeneous graphs have multiple node and edge types and are semantically richer than homogeneous graphs. To learn such complex semantics, many graph neural network approaches for heterogeneous graphs use metapaths to capture multi-hop interactions between nodes. Typically, features from non-target nodes are not incorporated into the learning procedure. However, there can be nonlinear, high-order interactions involving multiple nodes or edges. In this paper, we present Simplicial Graph Attention Network (SGAT), a simplicial complex approach to represent such high-order interactions by placing features from non-target nodes on the simplices. We then use attention mechanisms and upper adjacencies to generate representations. We empirically demonstrate the efficacy of our approach with node classification tasks on heterogeneous graph datasets and further show SGAT's ability in extracting structural information by employing random node features. Numerical experiments indicate that SGAT performs better than other current state-of-the-art heterogeneous graph learning methods.

Submitted 24 July, 2022; originally announced July 2022.

Comments: Accepted in the 31st International Joint Conference on Artificial Intelligence (IJCAI-ECAI), 2022

arXiv:2206.04498 [pdf, other]

Abstract message passing and distributed graph signal processing

Authors: Feng Ji, Yiqi Lu, Wee Peng Tay, Edwin Chong

Abstract: Graph signal processing is a framework to handle graph-structured data. The fundamental concept is the graph shift operator, giving rise to the graph Fourier transform. While the graph Fourier transform is a centralized procedure, distributed graph signal processing algorithms are needed to address challenges such as scalability and privacy. In this paper, we develop a theory of distributed graph signal processing based on the classical notion of message passing. However, we generalize the definition of a message to permit more abstract mathematical objects. The framework provides an alternative point of view that avoids the iterative nature of existing approaches to distributed graph signal processing. Moreover, our framework facilitates investigating theoretical questions such as solubility of distributed problems.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2204.08636 [pdf, ps, other]

Detection Interval for Diffusion Molecular Communication: How Long is Enough?

Authors: Xuan Chen, Miaowen Wen, Fei Ji, Yu Huang, Yuankun Tang, Andrew W. Eckford

Abstract: Molecular communication has a key role to play in future medical applications, including detecting, analyzing, and addressing infectious disease outbreaks. Overcoming inter-symbol interference (ISI) is one of the key challenges in the design of molecular communication systems. In this paper, we propose to optimize the detection interval to minimize the impact of ISI while ensuring the accurate detection of the transmitted information symbol, which is suitable for the absorbing and passive receivers. For tractability, based on the signal-to-interference difference (SID) and signal-to-interference-and-noise amplitude ratio (SINAR), we propose a modified-SINAR (mSINAR) to measure the bit error rate (BER) performance for the molecular communication system with a variable detection interval. Besides, we derive the optimal detection interval in closed form. Using simulation results, we show that the BER performance of our proposed mSINAR scheme is superior to the competing schemes, and achieves similar performance to optimal intervals found by the exhaustive search.

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2202.08303 [pdf, other]

doi 10.1088/1361-6560/ac8044

OpenKBP-Opt: An international and reproducible evaluation of 76 knowledge-based planning pipelines

Authors: Aaron Babier, Rafid Mahmood, Binghao Zhang, Victor G. L. Alves, Ana Maria Barragán-Montero, Joel Beaudry, Carlos E. Cardenas, Yankui Chang, Zijie Chen, Jaehee Chun, Kelly Diaz, Harold David Eraso, Erik Faustmann, Sibaji Gaj, Skylar Gay, Mary Gronberg, Bingqi Guo, Junjun He, Gerd Heilemann, Sanchit Hira, Yuliang Huang, Fuxin Ji, Dashan Jiang, Jean Carlo Jimenez Giraldo, Hoyeon Lee , et al. (34 additional authors not shown)

Abstract: We establish an open framework for developing plan optimization models for knowledge-based planning (KBP) in radiotherapy. Our framework includes reference plans for 100 patients with head-and-neck cancer and high-quality dose predictions from 19 KBP models that were developed by different research groups during the OpenKBP Grand Challenge. The dose predictions were input to four optimization models to form 76 unique KBP pipelines that generated 7600 plans. The predictions and plans were compared to the reference plans via: dose score, which is the average mean absolute voxel-by-voxel difference in dose a model achieved; the deviation in dose-volume histogram (DVH) criterion; and the frequency of clinical planning criteria satisfaction. We also performed a theoretical investigation to justify our dose mimicking models. The range in rank order correlation of the dose score between predictions and their KBP pipelines was 0.50 to 0.62, which indicates that the quality of the predictions is generally positively correlated with the quality of the plans. Additionally, compared to the input predictions, the KBP-generated plans performed significantly better (P<0.05; one-sided Wilcoxon test) on 18 of 23 DVH criteria. Similarly, each optimization model generated plans that satisfied a higher percentage of criteria than the reference plans. Lastly, our theoretical investigation demonstrated that the dose mimicking models generated plans that are also optimal for a conventional planning model.
This was the largest international effort to date for evaluating the combination of KBP prediction and optimization models. In the interest of reproducibility, our data and code are freely available at https://github.com/ababier/open-kbp-opt.

Submitted 16 February, 2022; originally announced February 2022.

Comments: 19 pages, 7 tables, 6 figures
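
The counts and the dose score quoted in the OpenKBP-Opt abstract above can be illustrated with a small sketch. The pipeline arithmetic follows directly from the abstract (19 prediction models, four optimization models, 100 patients). The dose-score function is only one plausible reading of "average mean absolute voxel-by-voxel difference", assuming each dose is a flat list of voxel values; the function names and the toy numbers are hypothetical and are not taken from the authors' repository.

```python
from typing import List, Sequence, Tuple


def pipeline_counts(n_prediction_models: int,
                    n_optimization_models: int,
                    n_patients: int) -> Tuple[int, int]:
    """Return (number of pipelines, number of plans)."""
    pipelines = n_prediction_models * n_optimization_models
    return pipelines, pipelines * n_patients


def mean_absolute_dose_difference(predicted: Sequence[float],
                                  reference: Sequence[float]) -> float:
    """Mean absolute voxel-by-voxel difference for one patient."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("dose arrays must be non-empty and equally sized")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def dose_score(predictions: List[Sequence[float]],
               references: List[Sequence[float]]) -> float:
    """Average the per-patient mean absolute difference over all patients."""
    per_patient = [mean_absolute_dose_difference(p, r)
                   for p, r in zip(predictions, references)]
    return sum(per_patient) / len(per_patient)


if __name__ == "__main__":
    print(pipeline_counts(19, 4, 100))
    # Toy doses for two hypothetical patients (arbitrary units).
    print(dose_score([[1.0, 2.0, 3.0], [0.0, 0.0]],
                     [[1.0, 4.0, 2.0], [1.0, 3.0]]))
```

A lower score means the predicted or planned dose is closer to the reference dose on average.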

arXiv:2202.03596 [pdf, other]

MOST-Net: A Memory Oriented Style Transfer Network for Face Sketch Synthesis

Authors: Fan Ji, Muyi Sun, Xingqun Qi, Qi Li, Zhenan Sun

Abstract: Face sketch synthesis has been widely used in multi-media entertainment and law enforcement. Despite the recent developments in deep neural networks, accurate and realistic face sketch synthesis is still a challenging task due to the diversity and complexity of human faces. Current image-to-image translation-based face sketch synthesis frequently encounters over-fitting problems when it comes to small-scale datasets. To tackle this problem, we present an end-to-end Memory Oriented Style Transfer Network (MOST-Net) for face sketch synthesis which can produce high-fidelity sketches with limited data. Specifically, an external self-supervised dynamic memory module is introduced to capture the domain alignment knowledge in the long term. In this way, our proposed model could obtain the domain-transfer ability by establishing the durable relationship between faces and corresponding sketches on the feature level. Furthermore, we design a novel Memory Refinement Loss (MR Loss) for feature alignment in the memory module, which enhances the accuracy of memory slots in an unsupervised manner. Extensive experiments on the CUFS and the CUFSF datasets show that our MOST-Net achieves state-of-the-art performance, especially in terms of the Structural Similarity Index (SSIM).

Submitted 7 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: 7 pages, 4 figures

arXiv:2202.00846 [pdf, other]

doi 10.1145/3485447.3512097

Adaptive Experimentation with Delayed Binary Feedback

Authors: Zenan Wang, Carlos Carrion, Xiliang Lin, Fuhua Ji, Yongjun Bao, Weipeng Yan

Abstract: Conducting experiments with objectives that take significant delays to materialize (e.g., conversions and add-to-cart events) is challenging. Although the classical "split sample testing" is still valid for the delayed feedback, the experiment will take longer to complete, which also means spending more resources on worse-performing strategies due to their fixed allocation schedules. Alternatively, adaptive approaches such as "multi-armed bandits" are able to effectively reduce the cost of experimentation. But these methods generally cannot handle delayed objectives directly out of the box. This paper presents an adaptive experimentation solution tailored for delayed binary feedback objectives by estimating the real underlying objectives before they materialize and dynamically allocating variants based on the estimates. Experiments show that the proposed method is more efficient for delayed feedback compared to various other approaches and is robust in different settings. In addition, we describe an experimentation product powered by this algorithm. This product is currently deployed in the online experimentation platform of JD.com, a large e-commerce company and a publisher of digital ads.

Submitted 1 February, 2022; originally announced February 2022.

Comments: to be published in Proceedings of the ACM Web Conference 2022 (WWW '22)

arXiv:2108.11695 [pdf, other]

PAENet: A Progressive Attention-Enhanced Network for 3D to 2D Retinal Vessel Segmentation

Authors: Zhuojie Wu, Zijian Wang, Wenxuan Zou, Fan Ji, Hao Dang, Wanting Zhou, Muyi Sun

Abstract: 3D to 2D retinal vessel segmentation is a challenging problem in Optical Coherence Tomography Angiography (OCTA) images. Accurate retinal vessel segmentation is important for the diagnosis and prevention of ophthalmic diseases. However, making full use of the 3D data of OCTA volumes is a vital factor for obtaining satisfactory segmentation results. In this paper, we propose a Progressive Attention-Enhanced Network (PAENet) based on attention mechanisms to extract rich feature representation. Specifically, the framework consists of two main parts, the three-dimensional feature learning path and the two-dimensional segmentation path. In the three-dimensional feature learning path, we design a novel Adaptive Pooling Module (APM) and propose a new Quadruple Attention Module (QAM). The APM captures dependencies along the projection direction of volumes and learns a series of pooling coefficients for feature fusion, which efficiently reduces feature dimension. In addition, the QAM reweights the features by capturing four-group cross-dimension dependencies, which makes maximum use of 4D feature tensors. In the two-dimensional segmentation path, to acquire more detailed information, we propose a Feature Fusion Module (FFM) to inject 3D information into the 2D path. Meanwhile, we adopt the Polarized Self-Attention (PSA) block to model the semantic interdependencies in spatial and channel dimensions respectively.
Extensive experiments on the OCTA-500 dataset show that our proposed algorithm achieves state-of-the-art performance compared with previous methods.

Submitted 16 December, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: Accepted by BIBM 2021

arXiv:2106.07134 [pdf]

Discerning the painter's hand: machine learning on surface topography

Authors: F. Ji, M. S. McMaster, S. Schwab, G. Singh, L. N. Smith, S. Adhikari, M. O'Dwyer, F. Sayed, A. Ingrisano, D. Yoder, E. S. Bolman, I. T. Martin, M. Hinczewski, K. D. Singer

Abstract: Attribution of paintings is a critical problem in art history. This study extends machine learning analysis to surface topography of painted works. A controlled study of positive attribution was designed with paintings produced by a class of art students. The paintings were scanned using a confocal optical profilometer to produce surface data. The surface data were divided into virtual patches and used to train an ensemble of convolutional neural networks (CNNs) for attribution. Over a range of patch sizes from 0.5 to 60 mm, the resulting attribution was found to be 60 to 96% accurate, and, when comparing regions of different color, was nearly twice as accurate as CNNs using color images of the paintings. Remarkably, short length scales, as small as twice a bristle diameter, were the key to reliably distinguishing among artists. These results show promise for real-world attribution, particularly in the case of workshop practice.

Submitted 13 June, 2021; originally announced June 2021.

Comments: main text: 24 pages, 6 figures; SI: 6 pages, 4 figures

arXiv:2105.04201 [pdf, other]

REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training

Authors: Fangkai Jiao, Yangyang Guo, Yilin Niu, Feng Ji, Feng-Lin Li, Liqiang Nie

Abstract: Pre-trained Language Models (PLMs) have achieved great success on Machine Reading Comprehension (MRC) over the past few years. Although the general language representation learned from large-scale corpora does benefit MRC, the poor support in evidence extraction which requires reasoning across multiple sentences hinders PLMs from further advancing MRC. To bridge the gap between general PLMs and MRC, we present REPT, a REtrieval-based Pre-Training approach. In particular, we introduce two self-supervised tasks to strengthen evidence extraction during pre-training, which is further inherited by downstream MRC tasks through the consistent retrieval operation and model architecture. To evaluate our proposed method, we conduct extensive experiments on five MRC datasets that require collecting evidence from and reasoning across multiple sentences. Experimental results demonstrate the effectiveness of our pre-training approach. Moreover, further analysis shows that our approach is able to enhance the capacity of evidence extraction without explicit supervision.

Submitted 17 May, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

Comments: 14 pages, 3 figures, Findings of ACL 2021

arXiv:2105.01993 [pdf, other]

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

Authors: Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto Del Bimbo

Abstract: A number of studies point out that current Visual Question Answering (VQA) models are severely affected by the language prior problem, which refers to blindly making predictions based on the language shortcut. Some efforts have been devoted to overcoming this issue with delicate models. However, there is no research to address it from the angle of the answer feature space learning, despite the fact that existing VQA methods all cast VQA as a classification task. Inspired by this, in this work, we attempt to tackle the language prior problem from the viewpoint of the feature space learning. To this end, an adapted margin cosine loss is designed to discriminate the frequent and the sparse answer feature space under each question type properly. As a result, the limited patterns within the language modality are largely reduced, so fewer language priors are introduced by our method. We apply this loss function to several baseline models and evaluate its effectiveness on two VQA-CP benchmarks. Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.

Submitted 5 May, 2021; originally announced May 2021.
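
For readers unfamiliar with margin-based cosine losses, the following sketch shows the general idea behind the loss named in the AdaVQA abstract above. It is not the authors' adapted loss: the abstract does not say how the margin is adapted per answer, so this uses the standard large-margin cosine loss with one fixed margin as a stand-in. The function names and the default `margin` and `scale` values are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_cosine_loss(feature: Sequence[float],
                       class_weights: Sequence[Sequence[float]],
                       target: int,
                       margin: float = 0.35,
                       scale: float = 30.0) -> float:
    """Fixed-margin cosine loss for one sample (illustrative stand-in).

    The margin is subtracted from the target-class cosine only, so the
    target must win by at least `margin` before the loss becomes small.
    """
    logits = []
    for j, w in enumerate(class_weights):
        c = cosine(feature, w)
        if j == target:
            c -= margin
        logits.append(scale * c)
    # Cross-entropy via a numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0]]  # two hypothetical answer classes
    print(margin_cosine_loss([1.0, 0.0], weights, target=0, margin=0.0, scale=1.0))
    print(margin_cosine_loss([1.0, 0.0], weights, target=0, margin=0.5, scale=1.0))
```

An adaptive variant would replace the single `margin` argument with a per-answer value, for example one that depends on how frequent the answer is under a given question type, which is the direction the abstract describes.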

arXiv:2103.15532 [pdf, other]

doi 10.1109/ICASSP39728.2021.9413417

Learning on heterogeneous graphs using high-order relations

Authors: See Hian Lee, Feng Ji, Wee Peng Tay

Abstract: A heterogeneous graph consists of different vertex and edge types. Learning on heterogeneous graphs typically employs meta-paths to deal with the heterogeneity by reducing the graph to a homogeneous network, guiding random walks, or capturing semantics. These methods are however sensitive to the choice of meta-paths, with suboptimal paths leading to poor performance. In this paper, we propose an approach for learning on heterogeneous graphs without using meta-paths. Specifically, we decompose a heterogeneous graph into different homogeneous relation-type graphs, which are then combined to create higher-order relation-type representations. These representations preserve the heterogeneity of edges and retain their edge directions while capturing the interaction of different vertex types multiple hops apart. This is then complemented with attention mechanisms to distinguish the importance of the relation-type based neighbors and the relation-types themselves. Experiments demonstrate that our model generally outperforms other state-of-the-art baselines in the vertex classification task on three commonly studied heterogeneous graph datasets.

Submitted 3 March, 2023; v1 submitted 29 March, 2021; originally announced March 2021.

arXiv:2101.01369 [pdf, ps, other]

A Splitting-Detection Joint-Decision Receiver for Ultrasonic Intra-Body Communications

Authors: Qianqian Wang, Quansheng Guan, Julian Cheng, Fei Ji

Abstract: Ultrasonic intra-body communication (IBC) is a promising enabling technology for future healthcare applications, due to low attenuation and medical safety of ultrasonic waves for the human body. A splitting receiver, referred to as the splitting-detection separate-decision (SDSD) receiver, is introduced for ultrasonic pulse-based IBCs, and SDSD can significantly improve bit-error rate (BER) performance over the traditional coherent-detection (CD) and energy detection (ED) receivers. To overcome the high complexity and improve the BER performance of SDSD, a splitting-detection joint-decision (SDJD) receiver is proposed. The core idea of SDJD is to split the received signal into two streams that can be separately processed by CD and ED, and then summed up as joint decision variables to achieve diversity combining. The theoretical channel capacity and BER of the SDSD and SDJD are derived for M-ary pulse position modulation (M-PPM) and PPM with spreading codes. The derivation takes into account the channel noise, intra-body channel fading, and channel estimation error. Simulation results verify the theoretical analysis and show that both SDSD and SDJD can achieve higher channel capacity and lower BER than the CD and ED receivers with perfect channel estimation, while SDJD can achieve the lowest BER with imperfect channel estimation.

Submitted 5 January, 2021; originally announced January 2021.

Comments: 29 pages, 10 figures

arXiv:2011.12771 [pdf, other]

Learning to Expand: Reinforced Pseudo-relevance Feedback Selection for Information-seeking Conversations

Authors: Haojie Pan, Cen Chen, Chengyu Wang, Minghui Qiu, Liu Yang, Feng Ji, Jun Huang

Abstract: Information-seeking conversation systems are increasingly popular in real-world applications, especially for e-commerce companies. To retrieve appropriate responses for users, it is necessary to compute the matching degrees between candidate responses and users' queries with historical dialogue utterances. As the contexts are usually much longer than responses, it is thus necessary to expand the responses (usually short) with richer information. Recent studies on pseudo-relevance feedback (PRF) have demonstrated its effectiveness in query expansion for search engines, hence we consider expanding responses using PRF information. However, existing PRF approaches are either based on heuristic rules or require heavy manual labeling, which are not suitable for solving our task. To alleviate this problem, we treat the PRF selection for response expansion as a learning task and propose a reinforced learning method that can be trained in an end-to-end manner without any human annotations. More specifically, we propose a reinforced selector to extract useful PRF terms to enhance response candidates and a BERT-based response ranker to rank the PRF-enhanced responses. The performance of the ranker serves as a reward to guide the selector to extract useful PRF terms, which boosts the overall task performance. Extensive experiments on both standard benchmarks and commercial datasets prove the superiority of our reinforced PRF term selector compared with other potential soft or hard selection methods.
Both case studies and quantitative analysis show that our model is capable of selecting meaningful PRF terms to expand response candidates and also achieving the best results compared with all baselines on a variety of evaluation metrics. We have also deployed our method on online production in an e-commerce company, which shows a significant improvement over the existing online ranking system.

Submitted 2 November, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

arXiv:2011.02705 [pdf, other]

Improving Commonsense Question Answering by Graph-based Iterative Retrieval over Multiple Knowledge Sources

Authors: Qianglong Chen, Feng Ji, Haiqing Chen, Yin Zhang

Abstract: In order to facilitate natural language understanding, the key is to engage commonsense or background knowledge. However, how to engage commonsense effectively in question answering systems is still under exploration in both academia and industry. In this paper, we propose a novel question-answering method by integrating multiple knowledge sources, i.e., ConceptNet, Wikipedia, and the Cambridge Dictionary, to boost the performance. More concretely, we first introduce a novel graph-based iterative knowledge retrieval module, which iteratively retrieves concepts and entities related to the given question and its choices from multiple knowledge sources. Afterward, we use a pre-trained language model to encode the question, retrieved knowledge and choices, and propose an answer choice-aware attention mechanism to fuse all hidden representations of the previous modules. Finally, the linear classifier for specific tasks is used to predict the answer. Experimental results on the CommonsenseQA dataset show that our method significantly outperforms other competitive methods and achieves the new state-of-the-art. In addition, further ablation studies demonstrate the effectiveness of our graph-based iterative knowledge retrieval module and the answer choice-aware attention module in retrieving and synthesizing background knowledge from multiple knowledge sources.

Submitted 5 November, 2020; originally announced November 2020.

Comments: Accepted at COLING 2020

arXiv:2009.11684 [pdf]

doi 10.1145/3340531.3412685

AliMe KG: Domain Knowledge Graph Construction and Application in E-commerce

Authors: Feng-Lin Li, Hehong Chen, Guohai Xu, Tian Qiu, Feng Ji, Ji Zhang, Haiqing Chen

Abstract: Pre-sales customer service is of importance to E-commerce platforms as it contributes to optimizing customers' buying process. To better serve users, we propose AliMe KG, a domain knowledge graph in E-commerce that captures user problems, points of interest (POI), item information and relations thereof. It helps to understand user needs, answer pre-sales questions and generate explanation texts. We applied AliMe KG to several online business scenarios such as shopping guide, question answering over properties and recommendation reason generation, and gained positive results. In the paper, we systematically introduce how we construct a domain knowledge graph from free text, and demonstrate its business value with several applications. Our experience shows that mining structured knowledge from free text in a vertical domain is practicable, and can be of substantial value in industrial settings.

Submitted 24 September, 2020; originally announced September 2020.

arXiv:2005.13119 [pdf, other]

doi 10.1109/TASLP.2021.3110145

Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems

Authors: Zehao Lin, Shaobo Cui, Guodun Li, Xiaoming Kang, Feng Ji, Fenglin Li, Zhongzhou Zhao, Haiqing Chen, Yin Zhang

Abstract: Different people have different habits of describing their intents in conversations. Some people tend to deliberate their intents in several successive utterances, i.e., they use several consistent messages for readability instead of a long sentence to express their question. This creates a predicament faced by the application of dialogue systems, especially in real-world industry scenarios, in which the dialogue system is unsure whether it should answer the query of user immediately or wait for further supplementary input. Motivated by such an interesting predicament, we define a novel Wait-or-Answer task for dialogue systems. We shed light on a new research topic about how the dialogue system can be more intelligent to behave in this Wait-or-Answer quandary. Further, we propose a predictive approach named Predict-then-Decide (PTD) to tackle this Wait-or-Answer task. More specifically, we take advantage of a decision model to help the dialogue system decide whether to wait or answer. The decision of decision model is made with the assistance of two ancillary prediction models: a user prediction and an agent prediction. The user prediction model tries to predict what the user would supplement and uses its prediction to persuade the decision model that the user has some information to add, so the dialogue system should wait. The agent prediction model tries to predict the answer of the dialogue system and convince the decision model that it is a superior choice to answer the query of user immediately since the input of user has come to an end.
We conduct our experiments on two real-life scenarios and three public datasets. Experimental results on five datasets show our proposed PTD approach significantly outperforms the existing models in solving this Wait-or-Answer problem.

Submitted 22 September, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: The latest version has been accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2021.3110145

arXiv:2005.10450 [pdf, other]

MTSS: Learn from Multiple Domain Teachers and Become a Multi-domain Dialogue Expert

Authors: Shuke Peng, Feng Ji, Zehao Lin, Shaobo Cui, Haiqing Chen, Yin Zhang

Abstract: How to build a high-quality multi-domain dialogue system is a challenging work due to its complicated and entangled dialogue state space among each domain, which seriously limits the quality of dialogue policy, and further affects the generated response. In this paper, we propose a novel method to acquire a satisfying policy and subtly circumvent the knotty dialogue state representation problem in the multi-domain setting. Inspired by real school teaching scenarios, our method is composed of multiple domain-specific teachers and a universal student. Each individual teacher only focuses on one specific domain and learns its corresponding domain knowledge and dialogue policy based on a precisely extracted single domain dialogue state representation. Then, these domain-specific teachers impart their domain knowledge and policies to a universal student model and collectively make this student model a multi-domain dialogue expert. Experiment results show that our method reaches competitive results with SOTAs in both multi-domain and single domain setting.

Submitted 20 May, 2020; originally announced May 2020.

Comments: AAAI 2020, Spotlight Paper

arXiv:2002.09616 [pdf, other]

"Wait, I'm Still Talking!" Predicting the Dialogue Interaction Behavior Using Imagine-Then-Arbitrate Model

Authors: Zehao Lin, Shaobo Cui, Guodun Li, Xiaoming Kang, Feng Ji, Fenglin Li, Zhongzhou Zhao, Haiqing Chen, Yin Zhang

Abstract: Producing natural and accurate responses like human beings is the ultimate goal of intelligent dialogue agents. So far, most of the past works concentrate on selecting or generating one pertinent and fluent response according to current query and its context. These models work on a one-to-one environment, making one response to one utterance each round. However, in real human-human conversations, humans often sequentially send several short messages for readability instead of a long message in one turn. Thus messages will not end with an explicit ending signal, which is crucial for agents to decide when to reply. So the first step for an intelligent dialogue agent is not replying but deciding if it should reply at the moment. To address this issue, in this paper, we propose a novel Imagine-then-Arbitrate (ITA) neural dialogue model to help the agent decide whether to wait or to make a response directly. Our method has two imaginator modules and an arbitrator module. The two imaginators will learn the agent's and user's speaking style respectively, generate possible utterances as the input of the arbitrator, combining with dialogue history. And the arbitrator decides whether to wait or to make a response to the user directly. To verify the performance and effectiveness of our method, we prepared two dialogue datasets and compared our approach with several popular models. Experimental results show that our model performs well on addressing ending prediction issue and outperforms baseline models.

Submitted 22 September, 2021; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: This is an outdated version of "Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2021.3110145

arXiv:1911.02914 [pdf, ps, other]

Transformation of Dense and Sparse Text Representations

Authors: Wenpeng Hu, Mengyu Wang, Bing Liu, Feng Ji, Haiqing Chen, Dongyan Zhao, Jinwen Ma, Rui Yan

Abstract: Sparsity is regarded as a desirable property of representations, especially in terms of explanation. However, its usage has been limited due to the gap with dense representations. Most NLP research progress in recent years is based on dense representations. Thus the desirable property of sparsity cannot be leveraged. Inspired by Fourier Transformation, in this paper, we propose a novel Semantic Transformation method to bridge the dense and sparse spaces, which can facilitate the NLP research to shift from dense space to sparse space or to jointly use both spaces. The key idea of the proposed approach is to use a Forward Transformation to transform dense representations to sparse representations. Then some useful operations in the sparse space can be performed over the sparse representations, and the sparse representations can be used directly to perform downstream tasks such as text classification and natural language inference. Then, a Backward Transformation can also be carried out to transform those processed sparse representations to dense representations. Experiments using classification tasks and natural language inference task show that the proposed Semantic Transformation is effective.

Submitted 7 November, 2019; originally announced November 2019.

arXiv:1911.02747 [pdf, other]

Query-bag Matching with Mutual Coverage for Information-seeking Conversations in E-commerce

Authors: Zhenxin Fu, Feng Ji, Wenpeng Hu, Wei Zhou, Dongyan Zhao, Haiqing Chen, Rui Yan

Abstract: Information-seeking conversation system aims at satisfying the information needs of users through conversations. Text matching between a user query and a pre-collected question is an important part of the information-seeking conversation in E-commerce. In the practical scenario, a sort of questions always correspond to the same answer. Naturally, these questions can form a bag. Learning the matching between user query and bag directly may improve the conversation performance, denoted as query-bag matching. Inspired by such opinion, we propose a query-bag matching model which mainly utilizes the mutual coverage between query and bag and measures the degree of the content in the query mentioned by the bag, and vice versa. In addition, the learned bag representation in word level helps find the main points of a bag in a fine grade and promotes the query-bag matching performance. Experiments on two datasets show the effectiveness of our model.

Submitted 6 November, 2019; originally announced November 2019.

Comments: CIKM 2019 Short

arXiv:1909.11287 [pdf, other]

Task-Oriented Conversation Generation Using Heterogeneous Memory Networks

Authors: Zehao Lin, Xinjing Huang, Feng Ji, Haiqing Chen, Ying Zhang

Abstract: How to incorporate external knowledge into a neural dialogue model is critically important for dialogue systems to behave like real humans. To handle this problem, memory networks are usually a great choice and a promising way. However, existing memory networks do not perform well when leveraging heterogeneous information from different sources. In this paper, we propose a novel and versatile external memory networks called Heterogeneous Memory Networks (HMNs), to simultaneously utilize user utterances, dialogue history and background knowledge tuples. In our method, historical sequential dialogues are encoded and stored into the context-aware memory enhanced by gating mechanism while grounding knowledge tuples are encoded and stored into the context-free memory. During decoding, the decoder augmented with HMNs recurrently selects each word in one response utterance from these two memories and a general vocabulary. Experimental results on multiple real-world datasets show that HMNs significantly outperform the state-of-the-art data-driven task-oriented dialogue models in most domains.

Submitted 25 September, 2019; originally announced September 2019.

Comments: Accepted as a long paper at EMNLP-IJCNLP 2019

arXiv:1908.07137 [pdf, other]

Teacher-Student Framework Enhanced Multi-domain Dialogue Generation

Authors: Shuke Peng, Xinjing Huang, Zehao Lin, Feng Ji, Haiqing Chen, Yin Zhang

Abstract: Dialogue systems dealing with multi-domain tasks are highly required. How to record the state remains a key problem in a task-oriented dialogue system. Normally we use human-defined features as dialogue states and apply a state tracker to extract these features. However, the performance of such a system is limited by the error propagation of a state tracker. In this paper, we propose a dialogue generation model that needs no external state trackers and still benefits from human-labeled semantic data. By using a teacher-student framework, several teacher models are first trained in their individual domains and learn dialogue policies from labeled states. Then the learned knowledge and experience are merged and transferred to a universal student model, which takes raw utterance as its input. Experiments show that the dialogue system trained under our framework outperforms the one that uses a belief tracker.

Submitted 26 May, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

Comments: Official Version: arXiv:2005.10450

arXiv:1908.00300 [pdf, other]

Simple and Effective Text Matching with Richer Alignment Features

Authors: Runqi Yang, Jianhai Zhang, Xing Gao, Feng Ji, Haiqing Chen

Abstract: In this paper, we present a fast and strong neural approach for general purpose text matching applications. We explore what is sufficient to build a fast and well-performed text matching model and propose to keep three key features available for inter-sequence alignment: original point-wise features, previous aligned features, and contextual features while simplifying all the remaining components. We conduct experiments on four well-studied benchmark datasets across tasks of natural language inference, paraphrase identification and answer selection. The performance of our model is on par with the state-of-the-art on all datasets with much fewer parameters and the inference speed is at least 6 times faster compared with similarly performed ones.

Submitted 1 August, 2019; originally announced August 2019.

Comments: 11 pages, 7 tables, 3 figures, accepted by ACL 2019

arXiv:1905.01994 [pdf, ps, other]

Review-Driven Answer Generation for Product-Related Questions in E-Commerce

Authors: Shiqian Chen, Chenliang Li, Feng Ji, Wei Zhou, Haiqing Chen

Abstract: The users often have many product-related questions before they make a purchase decision in E-commerce. However, it is often time-consuming to examine each user review to identify the desired information. In this paper, we propose a novel review-driven framework for answer generation for product-related questions in E-commerce, named RAGE. We develop RAGE on the basis of the multi-layer convolutional architecture to facilitate speed-up of answer generation with the parallel computation. For each question, RAGE first extracts the relevant review snippets from the reviews of the corresponding product. Then, we devise a mechanism to identify the relevant information from the noise-prone review snippets and incorporate this information to guide the answer generation. The experiments on two real-world E-Commerce datasets show that the proposed RAGE significantly outperforms the existing alternatives in producing more accurate and informative answers in natural language. Moreover, RAGE takes much less time for both model training and answer generation than the existing RNN based generation models.

Submitted 26 April, 2019; originally announced May 2019.

Report number: https://dl.acm.org/citation.cfm?doid=3289600.3290971

Journal ref: WSDM 2019

arXiv:1902.09173 [pdf, other]

GFCN: A New Graph Convolutional Network Based on Parallel Flows

Authors: Feng Ji, Jielong Yang, Qiang Zhang, Wee Peng Tay

Abstract: In view of the huge success of convolution neural networks (CNN) for image classification and object recognition, there have been attempts to generalize the method to general graph-structured data. One major direction is based on spectral graph theory and graph signal processing. In this paper, we study the problem from a completely different perspective, by introducing parallel flow decomposition of graphs. The essential idea is to decompose a graph into families of non-intersecting one dimensional (1D) paths, after which, we may apply a 1D CNN along each family of paths. We demonstrate that our method, which we call GraphFlow, is able to transfer CNN architectures to general graphs. To show the effectiveness of our approach, we test our method on the classical MNIST dataset, synthetic datasets on network information propagation and a news article classification dataset.

Submitted 6 March, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

arXiv:1812.11561 [pdf, other]

doi 10.1145/3289600.3290978

Learning to Selectively Transfer: Reinforced Transfer Learning for Deep Text Matching

Authors: Chen Qu, Feng Ji, Minghui Qiu, Liu Yang, Zhiyu Min, Haiqing Chen, Jun Huang, W. Bruce Croft

Abstract: Deep text matching approaches have been widely studied for many applications including question answering and information retrieval systems. To deal with a domain that has insufficient labeled data, these approaches can be used in a Transfer Learning (TL) setting to leverage labeled data from a resource-rich source domain. To achieve better performance, source domain data selection is essential in this process to prevent the "negative transfer" problem. However, the emerging deep transfer models do not fit well with most existing data selection methods, because the data selection policy and the transfer learning model are not jointly trained, leading to sub-optimal training efficiency. In this paper, we propose a novel reinforced data selector to select high-quality source domain data to help the TL model. Specifically, the data selector "acts" on the source domain data to find a subset for optimization of the TL model, and the performance of the TL model can provide "rewards" in turn to update the selector. We build the reinforced data selector based on the actor-critic framework and integrate it to a DNN based transfer learning model, resulting in a Reinforced Transfer Learning (RTL) method. We perform a thorough experimental evaluation on two major tasks for text matching, namely, paraphrase identification and natural language inference. Experimental results show the proposed RTL can significantly improve the performance of the TL model.
We further investigate different settings of states, rewards, and policy optimization methods to examine the robustness of our method. Last, we conduct a case study on the selected data and find our method is able to select source domain data whose Wasserstein distance is close to the target domain data. This is reasonable and intuitive as such source domain data can provide more transferability power to the model.

Submitted 30 December, 2018; originally announced December 2018.

Comments: Accepted to WSDM 2019

arXiv:1810.08740 [pdf, other]

Improving Multilingual Semantic Textual Similarity with Shared Sentence Encoder for Low-resource Languages

Authors: Xin Tang, Shanbo Cheng, Loc Do, Zhiyu Min, Feng Ji, Heng Yu, Ji Zhang, Haiqin Chen

Abstract: Measuring the semantic similarity between two sentences (or Semantic Textual Similarity - STS) is fundamental in many NLP applications. Despite the remarkable results in supervised settings with adequate labeling, little attention has been paid to this task in low-resource languages with insufficient labeling. Existing approaches mostly leverage machine translation techniques to translate sentences into rich-resource language. These approaches either beget language biases, or are impractical in industrial applications where spoken language scenario is more often and rigorous efficiency is required. In this work, we propose a multilingual framework to tackle the STS task in a low-resource language e.g. Spanish, Arabic, Indonesian and Thai, by utilizing the rich annotation data in a rich resource language, e.g. English. Our approach is extended from a basic monolingual STS framework to a shared multilingual encoder pretrained with translation task to incorporate rich-resource language data. By exploiting the nature of a shared multilingual encoder, one sentence can have multiple representations for different target translation language, which are used in an ensemble model to improve similarity evaluation. We demonstrate the superiority of our method over other state of the art approaches on SemEval STS task by its significant improvement on non-MT method, as well as an online industrial product where MT method fails to beat baseline while our approach still has consistent improvements.

Submitted 30 October, 2018; v1 submitted 19 October, 2018; originally announced October 2018.

arXiv:1807.01468 [pdf, ps, other]

Spatial Modulation for Molecular Communication

Authors: Yu Huang, Miaowen Wen, Lie-Liang Yang, Chan-Byoung Chae, Fei Ji

Abstract: In this paper, we propose an energy-efficient spatial modulation based molecular communication (SM-MC) scheme, in which a transmitted symbol is composed of two parts, i.e., a space derived symbol and a concentration derived symbol. The space symbol is transmitted by embedding the information into the index of a single activated transmitter nanomachine. The concentration symbol is drawn according to the conventional concentration shift keying (CSK) constellation. Benefiting from a single active transmitter during each symbol transmission period, SM-MC can avoid the inter-link interference problem existing in the current multiple-input multiple-output (MIMO) MC schemes, which hence enables low-complexity symbol detection and performance improvement. Specifically, in our low-complexity scheme, the space symbol is first detected by energy comparison, and then the concentration symbol is detected by the equal gain combining assisted CSK demodulation. In this paper, we analyze the symbol error rate (SER) of the SM-MC and its special case, namely the space shift keying based MC (SSK-MC), where only space symbol is transmitted and no CSK modulation is invoked. Finally, the analytical results are validated by computer simulations, and our studies demonstrate that both the SM-MC and SSK-MC are capable of achieving better SER performance than the conventional MIMO-MC and single-input single-output MC (SISO-MC) when the same symbol rate is assumed.

Submitted 4 July, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

arXiv:1806.05434 [pdf, other]

Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce

Authors: Minghui Qiu, Liu Yang, Feng Ji, Weipeng Zhao, Wei Zhou, Jun Huang, Haiqing Chen, W. Bruce Croft, Wei Lin

Abstract: Building multi-turn information-seeking conversation systems is an important and challenging research topic. Although several advanced neural text matching models have been proposed for this task, they are generally not efficient for industrial applications. Furthermore, they rely on a large amount of labeled data, which may not be available in real-world applications. To alleviate these problems, we study transfer learning for multi-turn information seeking conversations in this paper. We first propose an efficient and effective multi-turn conversation model based on convolutional neural networks. After that, we extend our model to adapt the knowledge learned from a resource-rich domain to enhance the performance. Finally, we deployed our model in an industrial chatbot called AliMe Assist (https://consumerservice.taobao.com/online-help) and observed a significant improvement over the existing online model.

Submitted 14 June, 2018; originally announced June 2018.

Comments: 6

Journal ref: ACL 2018

arXiv:1805.00150 [pdf, other]

Memory-augmented Dialogue Management for Task-oriented Dialogue Systems

Authors: Zheng Zhang, Minlie Huang, Zhongzhou Zhao, Feng Ji, Haiqing Chen, Xiaoyan Zhu

Abstract: Dialogue management (DM) decides the next action of a dialogue system according to the current dialogue state, and thus plays a central role in task-oriented dialogue systems. Since dialogue management requires access to not only local utterances, but also the global semantics of the entire dialogue session, modeling the long-range history information is a critical issue. To this end, we propose a novel Memory-Augmented Dialogue management model (MAD) which employs a memory controller and two additional memory structures, i.e., a slot-value memory and an external memory. The slot-value memory tracks the dialogue state by memorizing and updating the values of semantic slots (for instance, cuisine, price, and location), and the external memory augments the representation of hidden states of traditional recurrent neural networks through storing more context information. To update the dialogue state efficiently, we also propose slot-level attention on user utterances to extract specific semantic information for each slot. Experiments show that our model can obtain state-of-the-art performance and outperforms existing baselines.

Submitted 30 April, 2018; originally announced May 2018.

Comments: 25 pages, 9 figures, Under review of ACM Transactions on Information Systems (TOIS)

MSC Class: 68T50

arXiv:1801.04650 [pdf, ps, other]

Non-Orthogonal Multiple Access For Cooperative Communications: Challenges, Opportunities, And Trends

Authors: Dehuan Wan, Miaowen Wen, Fei Ji, Hua Yu, Fangjiong Chen

Abstract: Non-orthogonal multiple access (NOMA) is a promising radio access technique for next-generation wireless networks. In this article, we investigate the NOMA-based cooperative relay network. We begin with an introduction of the existing relay-assisted NOMA systems by classifying them into three categories: uplink, downlink, and composite architectures. Then, we discuss their principles and key features, and provide a comprehensive comparison from the perspective of spectral efficiency, energy efficiency, and total transmit power. A novel strategy termed hybrid power allocation is further discussed for the composite architecture, which can reduce the computational complexity and signaling overhead at the expense of marginal sum rate degradation. Finally, major challenges, opportunities, and future research trends for the design of NOMA-based cooperative relay systems with other techniques are also highlighted to provide insights for researchers in this field.

Submitted 14 January, 2018; originally announced January 2018.

Showing 1–50 of 52 results for author: Ji, F