Showing 1–1 of 1 results for author: Pantoja, D

  1. arXiv:2406.19470  [pdf, other]

    cs.CL

    Changing Answer Order Can Decrease MMLU Accuracy

    Authors: Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung

    Abstract: As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use test accuracy averaged across multiple subtasks in order to rank models on leaderboards, to determine which model is best for our purposes. In this paper, we investigate the robustness of the accurac…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Short paper, 9 pages