-
fakenewsbr: A Fake News Detection Platform for Brazilian Portuguese
Authors:
Luiz Giordani,
Gilsiley Darú,
Rhenan Queiroz,
Vitor Buzinaro,
Davi Keglevich Neiva,
Daniel Camilo Fuentes Guzmán,
Marcos Jardel Henriques,
Oilson Alberto Gonzatto Junior,
Francisco Louzada
Abstract:
The proliferation of fake news has become a significant concern in recent times due to its potential to spread misinformation and manipulate public opinion. This paper presents a comprehensive study on detecting fake news in Brazilian Portuguese, focusing on journalistic-type news. We propose a machine learning-based approach that leverages natural language processing techniques, including TF-IDF…
▽ More
The proliferation of fake news has become a significant concern in recent times due to its potential to spread misinformation and manipulate public opinion. This paper presents a comprehensive study on detecting fake news in Brazilian Portuguese, focusing on journalistic-type news. We propose a machine learning-based approach that leverages natural language processing techniques, including TF-IDF and Word2Vec, to extract features from textual data. We evaluate the performance of various classification algorithms, such as logistic regression, support vector machine, random forest, AdaBoost, and LightGBM, on a dataset containing both true and fake news articles. The proposed approach achieves high accuracy and F1-Score, demonstrating its effectiveness in identifying fake news. Additionally, we developed a user-friendly web platform, fakenewsbr.com, to facilitate the verification of news articles' veracity. Our platform provides real-time analysis, allowing users to assess the likelihood of fake news articles. Through empirical analysis and comparative studies, we demonstrate the potential of our approach to contribute to the fight against the spread of fake news and promote more informed media consumption.
△ Less
Submitted 20 September, 2023; v1 submitted 20 September, 2023;
originally announced September 2023.
-
Random Machines Regression Approach: an ensemble support vector regression model with free kernel choice
Authors:
Anderson Ara,
Mateus Maia,
Samuel Macêdo,
Francisco Louzada
Abstract:
Machine learning techniques always aim to reduce the generalized prediction error. In order to reduce it, ensemble methods present a good approach combining several models that results in a greater forecasting capacity. The Random Machines already have been demonstrated as strong technique, i.e: high predictive power, to classification tasks, in this article we propose an procedure to use the bagg…
▽ More
Machine learning techniques always aim to reduce the generalized prediction error. In order to reduce it, ensemble methods present a good approach combining several models that results in a greater forecasting capacity. The Random Machines already have been demonstrated as strong technique, i.e: high predictive power, to classification tasks, in this article we propose an procedure to use the bagged-weighted support vector model to regression problems. Simulation studies were realized over artificial datasets, and over real data benchmarks. The results exhibited a good performance of Regression Random Machines through lower generalization error without needing to choose the best kernel function during tuning process.
△ Less
Submitted 27 March, 2020;
originally announced March 2020.
-
Random Machines: A bagged-weighted support vector model with free kernel choice
Authors:
Anderson Ara,
Mateus Maia,
Samuel Macêdo,
Francisco Louzada
Abstract:
Improvement of statistical learning models in order to increase efficiency in solving classification or regression problems is still a goal pursued by the scientific community. In this way, the support vector machine model is one of the most successful and powerful algorithms for those tasks. However, its performance depends directly from the choice of the kernel function and their hyperparameters…
▽ More
Improvement of statistical learning models in order to increase efficiency in solving classification or regression problems is still a goal pursued by the scientific community. In this way, the support vector machine model is one of the most successful and powerful algorithms for those tasks. However, its performance depends directly from the choice of the kernel function and their hyperparameters. The traditional choice of them, actually, can be computationally expensive to do the kernel choice and the tuning processes. In this article, it is proposed a novel framework to deal with the kernel function selection called Random Machines. The results improved accuracy and reduced computational time. The data study was performed in simulated data and over 27 real benchmarking datasets.
△ Less
Submitted 21 November, 2019;
originally announced November 2019.
-
MWStat: A Modulated Web-Based Statistical System
Authors:
Francisco Louzada,
Anderson Ara
Abstract:
In this paper we present the development of a modulated web based statistical system, hereafter MWStat, which shifts the statistical paradigm of analyzing data into a real time structure. The MWStat system is useful for both online storage data and questionnaires analysis, as well as to provide real time disposal of results from analysis related to several statistical methodologies in a customizab…
▽ More
In this paper we present the development of a modulated web based statistical system, hereafter MWStat, which shifts the statistical paradigm of analyzing data into a real time structure. The MWStat system is useful for both online storage data and questionnaires analysis, as well as to provide real time disposal of results from analysis related to several statistical methodologies in a customizable fashion. Overall, it can be seem as a useful technical solution that can be applied to a large range of statistical applications, which needs of a scheme of devolution of real time results, accessible to anyone with internet access. We display here the step-by-step instructions for implementing the system. The structure is accessible, built with an easily interpretable language and it can be strategically applied to online statistical applications. We rely on the relationship of several free languages, namely, PHP, R, MySQL database and an Apache HTTP server, and on the use of software tools such as phpMyAdmin. We expose three didactical examples of the MWStat system on institutional evaluation, statistical quality control and multivariate analysis. The methodology is also illustrated in a real example on institutional evaluation.
△ Less
Submitted 29 April, 2016;
originally announced May 2016.