-
Deep Learning Brasil at ABSAPT 2022: Portuguese Transformer Ensemble Approaches
Authors:
Juliana Resplande Santanna Gomes,
Eduardo Augusto Santos Garcia,
Adalberto Ferreira Barbosa Junior,
Ruan Chaves Rodrigues,
Diogo Fernandes Costa Silva,
Dyonnatan Ferreira Maia,
Nádia Félix Felipe da Silva,
Arlindo Rodrigues Galvão Filho,
Anderson da Silva Soares
Abstract:
Aspect-based Sentiment Analysis (ABSA) is a task whose objective is to classify the individual sentiment polarity of all entities, called aspects, in a sentence. The task is composed of two subtasks: Aspect Term Extraction (ATE), which identifies all aspect terms in a sentence; and Sentiment Orientation Extraction (SOE), which, given a sentence and its aspect terms, determines the sentiment polarity of each aspect term (positive, negative, or neutral). We present our participation in Aspect-Based Sentiment Analysis in Portuguese (ABSAPT) 2022 at IberLEF 2022. We submitted the best-performing systems, achieving new state-of-the-art results on both subtasks.
Submitted 8 November, 2023;
originally announced November 2023.
-
To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment
Authors:
Guilherme Moraes Rosa,
Ruan Chaves Rodrigues,
Roberto de Alencar Lotufo,
Rodrigo Nogueira
Abstract:
There has been mounting evidence that pretrained language models fine-tuned on large and diverse supervised datasets can transfer well to a variety of out-of-domain tasks. In this work, we investigate this transfer ability to the legal domain. For that, we participated in the legal case entailment task of COLIEE 2021, in which we use such models with no adaptations to the target domain. Our submissions achieved the highest scores, surpassing the second-best team by more than six percentage points. Our experiments confirm a counter-intuitive result in the new paradigm of pretrained language models: given limited labeled data, models with little or no adaptation to the target task can be more robust to changes in the data distribution than models fine-tuned on it. Code is available at https://github.com/neuralmind-ai/coliee.
Submitted 7 February, 2022;
originally announced February 2022.
-
Zero-shot hashtag segmentation for multilingual sentiment analysis
Authors:
Ruan Chaves Rodrigues,
Marcelo Akira Inuzuka,
Juliana Resplande Sant'Anna Gomes,
Acquila Santos Rocha,
Iacer Calixto,
Hugo Alexandre Dantas do Nascimento
Abstract:
Hashtag segmentation, also known as hashtag decomposition, is a common step in preprocessing pipelines for social media datasets. It usually precedes tasks such as sentiment analysis and hate speech detection. For sentiment analysis in medium to low-resourced languages, previous research has demonstrated that a multilingual approach that resorts to machine translation can be competitive or superior to previous approaches to the task. We develop a zero-shot hashtag segmentation framework and demonstrate how it can be used to improve the accuracy of multilingual sentiment analysis pipelines. Our zero-shot framework establishes a new state-of-the-art for hashtag segmentation datasets, surpassing even previous approaches that relied on feature engineering and language models trained on in-domain data.
Submitted 6 December, 2021;
originally announced December 2021.
-
Yes, BM25 is a Strong Baseline for Legal Case Retrieval
Authors:
Guilherme Moraes Rosa,
Ruan Chaves Rodrigues,
Roberto Lotufo,
Rodrigo Nogueira
Abstract:
We describe our single submission to task 1 of COLIEE 2021. Our vanilla BM25 got second place, well above the median of submissions. Code is available at https://github.com/neuralmind-ai/coliee.
Submitted 25 October, 2021; v1 submitted 26 April, 2021;
originally announced May 2021.