doi 10.1145/3589334.3645421

Debiasing Recommendation with Personal Popularity

Authors: Wentao Ning, Reynold Cheng, Xiao Yan, Ben Kao, Nan Huo, Nur Al Hasan Haldar, Bo Tang

Abstract: Global popularity (GP) bias is the phenomenon that popular items are recommended much more frequently than they should be, which goes against the goal of providing personalized recommendations and harms user experience and recommendation accuracy. Many methods have been proposed to reduce GP bias, but they fail to notice the fundamental problem of GP, i.e., it considers popularity from a global perspective of all users and uses a single set of popular items, and thus cannot capture the interests of individual users. As such, we propose a user-aware version of item popularity named personal popularity (PP), which identifies different popular items for each user by considering the users that share similar interests. As PP models the preferences of individual users, it naturally helps to produce personalized recommendations and mitigate GP bias. To integrate PP into recommendation, we design a general personal popularity aware counterfactual (PPAC) framework, which adapts easily to existing recommendation models. In particular, PPAC recognizes that PP and GP have both direct and indirect effects on recommendations and controls direct effects with counterfactual inference techniques for unbiased recommendations. All code and datasets are available at https://github.com/Stevenn9981/PPAC.

Submitted 21 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted by WWW'24 as a research full paper

arXiv:2309.14756 [pdf, other]

On quantifying and improving realism of images generated with diffusion

Authors: Yunzhuo Chen, Naveed Akhtar, Nur Al Hasan Haldar, Ajmal Mian

Abstract: Recent advances in diffusion models have led to a quantum leap in the quality of generative visual content. However, quantifying the realism of that content is still challenging. Existing evaluation metrics, such as Inception Score and Fréchet inception distance, fall short on benchmarking diffusion models due to the versatility of the generated images. Moreover, they are not designed to quantify the realism of an individual image. This restricts their application in forensic image analysis, which is becoming increasingly important in the emerging era of generative models. To address that, we first propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image. This non-learning-based metric not only efficiently quantifies the realism of generated images, but is also readily usable as a measure to classify a given image as real or fake. We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN. We further leverage this attribute of our metric to minimize an IRS-augmented generative loss of SDM, and demonstrate a convenient yet considerable quality improvement of the SDM-generated content with our modification. Our efforts have also led to the Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models. We will release the dataset and code.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 10 pages, 5 figures

arXiv:2309.14751 [pdf, other]

Text-image guided Diffusion Model for generating Deepfake celebrity interactions

Authors: Yunzhuo Chen, Nur Al Hasan Haldar, Naveed Akhtar, Ajmal Mian

Abstract: Deepfake images are fast becoming a serious concern due to their realism. Diffusion models have recently demonstrated highly realistic visual content generation, which makes them an excellent potential tool for Deepfake generation. To curb their exploitation for Deepfakes, it is imperative to first explore the extent to which diffusion models can be used to generate realistic content that is controllable with convenient prompts. This paper devises and explores a novel method in that regard. Our technique alters the popular stable diffusion model to generate a controllable high-quality Deepfake image with text and image prompts. In addition, the original stable diffusion model performs poorly at generating quality images that contain multiple persons. The modified diffusion model addresses this problem by using the latent of an input anchor image at the beginning of inference, rather than a random Gaussian latent. Hence, we focus on generating forged content for celebrity interactions, which may be used to spread rumors. We also apply Dreambooth to enhance the realism of our fake images. Dreambooth trains the pairing of center words and specific features to produce more refined and personalized output images. Our results show that with the devised scheme, it is possible to create fake visual content with alarming realism, such that the content can serve as believable evidence of meetings between powerful political figures.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 8 pages, 8 figures, DICTA

arXiv:2202.01525 [pdf, ps, other]

doi 10.14778/3551793.3551834

Reliable Community Search in Dynamic Networks

Authors: Yifu Tang, Jianxin Li, Nur Al Hasan Haldar, Ziyu Guan, Jiajie Xu, Chengfei Liu

Abstract: Searching for local communities is an important research problem that supports advanced data analysis in various complex networks, such as social networks, collaboration networks, cellular networks, etc. The evolution of such networks over time has motivated several recent studies to identify local communities in dynamic networks. However, these studies only utilize the aggregation of disjoint structural information to measure the quality and ignore the reliability of the communities in a continuous time interval. To fill this research gap, we propose a novel $(θ,k)$-core reliable community (CRC) model in weighted dynamic networks, and define the problem of most reliable community search, which couples the desirable properties of connection strength, cohesive structure continuity, and maximal member engagement. To solve this problem, we first develop a novel edge-filtering-based online CRC search algorithm that can effectively filter out trivial edge information from the networks while searching for a reliable community. Further, we propose an index structure, Weighted Core Forest-Index (WCF-index), and devise an index-based dynamic programming CRC search algorithm that can prune a large number of insignificant intermediate results and support efficient query processing. Finally, we conduct extensive experiments to systematically demonstrate the efficiency and effectiveness of our proposed algorithms on eight real datasets under various experimental settings.

Submitted 18 October, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

Journal ref: PVLDB, 15(11): 2826-2838, 2022

arXiv:2112.12845 [pdf, other]

doi 10.1145/3511808.3557244

Automatic Meta-Path Discovery for Effective Graph-Based Recommendation

Authors: Wentao Ning, Reynold Cheng, Jiajun Shen, Nur Al Hasan Haldar, Ben Kao, Xiao Yan, Nan Huo, Wai Kit Lam, Tian Li, Bo Tang

Abstract: Heterogeneous Information Networks (HINs) are labeled graphs that depict relationships among different types of entities (e.g., users, movies and directors). For HINs, meta-path-based recommenders (MPRs) utilize meta-paths (i.e., abstract paths consisting of node and link types) to predict user preference, and have attracted a lot of attention due to their explainability and performance. We observe that the performance of MPRs is highly sensitive to the meta-paths they use, but existing works manually select the meta-paths from many possible ones. Thus, to discover effective meta-paths automatically, we propose the Reinforcement learning-based Meta-path Selection (RMS) framework. Specifically, we define a vector encoding for meta-paths and design a policy network to extend meta-paths. The policy network is trained based on the results of downstream recommendation tasks, and an early stopping approximation strategy is proposed to speed up training. RMS is a general model, and it can work with all existing MPRs. We also propose a new MPR called RMS-HRec, which uses an attention mechanism to aggregate information from the meta-paths. We conduct extensive experiments on real datasets. Compared with the manually selected meta-paths, the meta-paths identified by RMS consistently improve recommendation quality. Moreover, RMS-HRec outperforms state-of-the-art recommender systems by an average of 7% in hit ratio. The code and datasets are available at https://github.com/Stevenn9981/RMS-HRec.

Submitted 7 September, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

Comments: Accepted as a full research paper at CIKM 2022

arXiv:2009.00373 [pdf, other]

doi 10.1109/TKDE.2022.3151095

Top-k Socio-Spatial Co-engaged Location Selection for Social Users

Authors: Nur Al Hasan Haldar, Jianxin Li, Mohammed Eunus Ali, Taotao Cai, Timos Sellis, Mark Reynolds

Abstract: With the advent of location-based social networks, users can tag their daily activities in different locations through check-ins. These check-in locations signify user preferences for various socio-spatial activities and can be used to build their profiles to improve the quality of services in applications such as recommendation systems, advertising, and group formation. To support such applications, in this paper, we formulate a new problem of identifying top-k Socio-Spatial co-engaged Location Selection (SSLS) for users in a social graph, which selects the best set of k locations from a large number of location candidates relating to the user and her friends. The selected locations should be (i) spatially and socially relevant to the user and her friends, and (ii) diversified both spatially and socially to maximize the coverage of friends in the spatial space. This problem is proved to be NP-hard. To address this challenging problem, we first develop a branch-and-bound-based exact solution by designing pruning strategies based on the derived bounds on diversity. To make the solution scalable for large datasets, we also develop an approximate solution by deriving relaxed bounds and advanced termination rules to filter out insignificant intermediate results. To further improve efficiency, we present a fast exact approach and a meta-heuristic approximate approach that avoid the repeated computation of diversity at run time. Finally, we have performed extensive experiments to evaluate the performance of our proposed models and algorithms against adapted existing methods using four large real-world datasets.

Submitted 14 September, 2020; v1 submitted 1 September, 2020; originally announced September 2020.

Showing 1–6 of 6 results for author: Haldar, N A H