-
KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs
Authors:
Aihua Pei,
Zehua Yang,
Shunan Zhu,
Ruoxi Cheng,
Ju Jia,
Lina Wang
Abstract:
Existing frameworks for assessing robustness of large language models (LLMs) overly depend on specific benchmarks, increasing costs and failing to evaluate performance of LLMs in professional domains due to dataset limitations. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from the triplets of knowledge graphs and creates adversarial prompts by poisoning, assessing the robustness of LLMs through the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and the robustness of large language models is influenced by the professional domains in which they operate.
Submitted 16 June, 2024;
originally announced June 2024.
-
Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs
Authors:
Ruoxi Cheng,
Haoxuan Ma,
Shuirong Cao,
Jiaqi Li,
Aihua Pei,
Zhiqiang Wang,
Pengliang Ji,
Haoyu Wang,
Jiaqi Huo
Abstract:
Bias in LLMs can harm user experience and societal outcomes. However, current bias mitigation methods often require intensive human feedback, lack transferability to other topics, or yield overconfident and random outputs. We find that involving LLMs in role-playing scenarios boosts their ability to recognize and mitigate biases. Based on this, we propose Reinforcement Learning from Multi-role Debates as Feedback (RLDF), a novel approach to bias mitigation that replaces the human feedback used in traditional RLHF. We use LLMs in multi-role debates to create a dataset that includes both high-bias and low-bias instances for training the reward model in reinforcement learning. Our approach comprises two modes: (1) self-reflection, where the same LLM participates in multi-role debates, and (2) teacher-student, where a more advanced LLM such as GPT-3.5-turbo guides the LLM to perform this task. Experimental results across different LLMs demonstrate the effectiveness of our approach in bias mitigation.
Submitted 18 June, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Debunking Disinformation: Revolutionizing Truth with NLP in Fake News Detection
Authors:
Li He,
Siyi Hu,
Ailun Pei
Abstract:
The Internet and social media have altered how individuals access news in the age of instantaneous information distribution. While this development has increased access to information, it has also created a significant problem: the spread of fake news and information. Fake news spreads rapidly on digital platforms, with a negative impact on the media ecosystem, public opinion, decision-making, and social cohesion. Natural Language Processing (NLP), which offers a variety of approaches for identifying whether content is authentic, has emerged as a potent weapon in the growing war against disinformation. This paper takes an in-depth look at how NLP technology can be used to detect fake news and reveals the challenges and opportunities it presents.
Submitted 15 November, 2023; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Fast Template Matching by Subsampled Circulant Matrix
Authors:
Sung-Hsien Hsieh,
Chun-Shien Lu,
Soo-Chang Pei
Abstract:
Template matching is widely used in image and signal processing and is usually time-critical. Traditional methods focus on reducing the number of search locations, either through a coarse-to-fine strategy or through full search combined with pruning. However, the computation cost of those methods is easily dominated by the signal size N rather than the template size K. This paper proposes a fast probabilistic matching scheme based on cross-correlation, whose computation cost is O(N) additions and O(K \log K) multiplications. The core idea is to first downsample the signal so that its size becomes O(K); subsequent operations then involve only the downsampled signal. The probability of a successful match depends on the cross-correlation between the signal and the template. We give a sufficient condition for a successful match and prove that the probability is high for binary signals with K^2/\log K >= O(N). The experiments show that the proposed scheme is fast and efficient and support the theoretical results.
Submitted 16 September, 2015;
originally announced September 2015.