Order matters in the presence of dataset imbalance for multilingual learning

D Choi, D Xin, H Dadkhahi, J Gilmer et al. - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
Abstract
In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method: pre-training on high-resource tasks, followed by fine-tuning on a mixture of high- and low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits, showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and demonstrate its improvements empirically in neural machine translation (NMT) and multilingual language modeling.
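The two-stage schedule described above can be contrasted with static weighting, which mixes all tasks at fixed proportions throughout training. A minimal sketch of the stage-wise task sampler is shown below; the task names, data sizes, and the high-resource threshold are illustrative assumptions, not values from the paper.

```python
import random

def make_sampler(task_sizes, stage, high_resource_threshold=1_000_000):
    """Return a function that samples a task name for one training step.

    Stage 1 (pre-training): sample only from high-resource tasks,
    in proportion to their data sizes.
    Stage 2 (fine-tuning): sample from all tasks, in proportion to size,
    so low-resource tasks enter the mixture.
    """
    if stage == 1:
        pool = {t: n for t, n in task_sizes.items()
                if n >= high_resource_threshold}
    else:
        pool = dict(task_sizes)
    tasks = list(pool)
    weights = [pool[t] for t in tasks]
    return lambda rng=random: rng.choices(tasks, weights=weights, k=1)[0]

# Hypothetical per-task example counts (e.g. sentence pairs for NMT).
sizes = {"en-fr": 5_000_000, "en-de": 3_000_000, "en-gd": 50_000}

stage1 = make_sampler(sizes, stage=1)  # draws only en-fr / en-de
stage2 = make_sampler(sizes, stage=2)  # draws from all three tasks
```

Under this sketch, a run would take gradient steps with `stage1` for an initial budget and then switch to `stage2`; the paper's static-weighting baseline corresponds to using a single fixed mixture for the whole run.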