Showing 1–1 of 1 results for author: Visvanathan, H

  1. arXiv:2406.09393

    cs.CL cs.AI cs.LG

    Improving Autoregressive Training with Dynamic Oracles

    Authors: Jianing Yang, Harshine Visvanathan, Yilin Wang, Xinyi Hu, Matthew Gormley

    Abstract: Many tasks within NLP can be framed as sequential decision problems, ranging from sequence tagging to text generation. However, for many tasks, the standard training methods, including maximum likelihood (teacher forcing) and scheduled sampling, suffer from exposure bias and a mismatch between metrics employed during training and inference. DAgger provides a solution to mitigate these problems, ye…

    Submitted 13 June, 2024; originally announced June 2024.