Computer Science > Computation and Language

arXiv:2405.17386 (cs)

[Submitted on 27 May 2024]

Title: MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

Authors: Zixian Huang, Wenhao Zhu, Gong Cheng, Lei Li, Fei Yuan

Abstract: Reasoning capabilities are crucial for Large Language Models (LLMs), yet a notable gap exists between English and non-English languages. To bridge this disparity, some works fine-tune LLMs to relearn reasoning capabilities in non-English languages, while others replace non-English inputs with an external model's outputs, such as English translations, to circumvent the LLM's difficulty in understanding non-English text. Unfortunately, these methods often underutilize the built-in reasoning and language understanding capabilities of LLMs. To better utilize the minds of reasoning and language understanding in LLMs, we propose a new method, MindMerger, which merges LLMs with the external language understanding capabilities of multilingual models to boost multilingual reasoning performance. Furthermore, a two-step training scheme is introduced: the first step embeds the external capabilities into LLMs, and the second trains the collaborative use of the external capabilities and the built-in capabilities of LLMs. Experiments on three multilingual reasoning datasets and a language understanding dataset demonstrate that MindMerger consistently outperforms all baselines, especially in low-resource languages. Without updating the parameters of LLMs, the average accuracy on the MGSM dataset improved by 6.7% across all languages and by 8.0% across low-resource languages.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2405.17386 [cs.CL]
	(or arXiv:2405.17386v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2405.17386

Submission history

From: Zixian Huang
[v1] Mon, 27 May 2024 17:41:54 UTC (3,108 KB)
