Computer Science > Computation and Language

arXiv:2210.08855 (cs)

[Submitted on 17 Oct 2022 (v1), last revised 18 May 2023 (this version, v2)]

Title: PeerDA: Data Augmentation via Modeling Peer Relation for Span Identification Tasks

Authors: Weiwen Xu, Xin Li, Yang Deng, Wai Lam, Lidong Bing

Abstract: Span identification aims at identifying specific text spans from text input and classifying them into pre-defined categories. Different from previous works that merely leverage the Subordinate (SUB) relation (i.e., whether a span is an instance of a certain category) to train models, this paper for the first time explores the Peer (PR) relation, which indicates that two spans are instances of the same category and share similar features. Specifically, a novel Peer Data Augmentation (PeerDA) approach is proposed which employs span pairs with the PR relation as the augmentation data for training. PeerDA has two unique advantages: (1) There are a large number of PR span pairs for augmenting the training data. (2) The augmented data can prevent the trained model from over-fitting the superficial span-category mapping by pushing the model to leverage the span semantics. Experimental results on ten datasets over four diverse tasks across seven domains demonstrate the effectiveness of PeerDA. Notably, PeerDA achieves state-of-the-art results on six of them.
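
Illustrative note: the abstract describes PR span pairs only at a high level. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes labeled spans are available as (span text, category) tuples and treats any two distinct spans that share a category as a PR pair. The function name `build_peer_pairs` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_peer_pairs(labeled_spans):
    """Pair up distinct spans that share a category (assumed PR relation).

    labeled_spans: iterable of (span_text, category) tuples (assumed format).
    Returns a list of (span_a, span_b, category) tuples in sorted order.
    """
    # Group the unique span texts under their category.
    by_category = defaultdict(set)
    for span, category in labeled_spans:
        by_category[category].add(span)

    pairs = []
    for category in sorted(by_category):
        # Every unordered pair of distinct spans in a category is a peer pair.
        for a, b in combinations(sorted(by_category[category]), 2):
            pairs.append((a, b, category))
    return pairs


# Toy data, invented for illustration only.
spans = [
    ("Paris", "LOC"),
    ("Berlin", "LOC"),
    ("Tokyo", "LOC"),
    ("Alice", "PER"),
    ("Bob", "PER"),
]
peer_pairs = build_peer_pairs(spans)
for pair in peer_pairs:
    print(pair)
```

Under this reading, a category with n distinct spans yields n(n-1)/2 pairs, which is consistent with the abstract's point that PR pairs are plentiful. How such pairs are turned into training examples is not specified in the abstract and is left out here.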

Comments:	To appear at ACL 2023 main conference
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.08855 [cs.CL]
	(or arXiv:2210.08855v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.08855

Submission history

From: Weiwen Xu
[v1] Mon, 17 Oct 2022 08:51:30 UTC (729 KB)
[v2] Thu, 18 May 2023 12:11:08 UTC (722 KB)
