Showing 1–4 of 4 results for author: Bojun, H

  1. arXiv:2305.14859  [pdf, other]

    cs.LG cs.CL cs.NE

    Utility-Probability Duality of Neural Networks

    Authors: Huang Bojun, Fei Yuan

    Abstract: It is typically understood that the training of modern neural networks is a process of fitting the probability distribution of desired output. However, recent paradoxical observations in a number of language generation tasks let one wonder if this canonical probability-based explanation can really account for the empirical success of deep learning. To resolve this issue, we propose an alternative…

    Submitted 25 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  2. arXiv:2207.11161  [pdf, other]

    cs.LG cs.AI cs.CL

    Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)

    Authors: Huang Bojun

    Abstract: This paper discusses a new approach to the fundamental problem of learning optimal Q-functions. In this approach, optimal Q-functions are formulated as saddle points of a nonlinear Lagrangian function derived from the classic Bellman optimality equation. The paper shows that the Lagrangian enjoys strong duality, in spite of its nonlinearity, which paves the way to a general Lagrangian method to Q-…

    Submitted 26 August, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

    Comments: ICML 2022

  3. arXiv:2103.11795  [pdf, other]

    cs.CL cs.AI cs.LG

    Simpson's Bias in NLP Training

    Authors: Fei Yuan, Longtu Zhang, Huang Bojun, Yaobo Liang

    Abstract: In most machine learning tasks, we evaluate a model $M$ on a given data population $S$ by measuring a population-level metric $F(S;M)$. Examples of such evaluation metric $F$ include precision/recall for (binary) recognition, the F1 score for multi-class classification, and the BLEU metric for language generation. On the other hand, the model $M$ is trained by optimizing a sample-level loss…

    Submitted 13 March, 2021; originally announced March 2021.

    Comments: AAAI 2021

  4. arXiv:2011.06631  [pdf, other]

    cs.LG cs.AI

    Steady State Analysis of Episodic Reinforcement Learning

    Authors: Huang Bojun

    Abstract: This paper proves that the episodic learning environment of every finite-horizon decision task has a unique steady state under any behavior policy, and that the marginal distribution of the agent's input indeed converges to the steady-state distribution in essentially all episodic learning processes. This observation supports an interestingly reversed mindset against conventional wisdom: While the…

    Submitted 13 January, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2020