Showing 1–5 of 5 results for author: Tran, N L

  1. arXiv:2401.06406  [pdf]

    cs.LG cs.AI

    Knowledge-Informed Machine Learning for Cancer Diagnosis and Prognosis: A review

    Authors: Lingchao Mao, Hairong Wang, Leland S. Hu, Nhan L Tran, Peter D Canoll, Kristin R Swanson, Jing Li

    Abstract: Cancer remains one of the most challenging diseases to treat in the medical field. Machine learning has enabled in-depth analysis of rich multi-omics profiles and medical imaging for cancer diagnosis and prognosis. Despite these advancements, machine learning models face challenges stemming from limited labeled sample sizes, the intricate interplay of high-dimensionality data types, the inherent h…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: 41 pages, 4 figures, 2 tables

    MSC Class: 92B99

  2. arXiv:2401.00128  [pdf]

    cs.LG cs.CV math.OC

    Quantifying intra-tumoral genetic heterogeneity of glioblastoma toward precision medicine using MRI and a data-inclusive machine learning algorithm

    Authors: Lujia Wang, Hairong Wang, Fulvio D'Angelo, Lee Curtin, Christopher P. Sereduk, Gustavo De Leon, Kyle W. Singleton, Javier Urcuyo, Andrea Hawkins-Daarud, Pamela R. Jackson, Chandan Krishna, Richard S. Zimmerman, Devi P. Patra, Bernard R. Bendok, Kris A. Smith, Peter Nakaji, Kliment Donev, Leslie C. Baxter, Maciej M. Mrugała, Michele Ceccarelli, Antonio Iavarone, Kristin R. Swanson, Nhan L. Tran, Leland S. Hu, Jing Li

    Abstract: Glioblastoma (GBM) is one of the most aggressive and lethal human cancers. Intra-tumoral genetic heterogeneity poses a significant challenge for treatment. Biopsy is invasive, which motivates the development of non-invasive, MRI-based machine learning (ML) models to quantify intra-tumoral genetic heterogeneity for each patient. This capability holds great promise for enabling better therapeutic se…

    Submitted 29 December, 2023; originally announced January 2024.

    Comments: 36 pages, 8 figures, 3 tables

  3. arXiv:2208.04243  [pdf, other]

    cs.CL

    A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

    Authors: Linh The Nguyen, Nguyen Luong Tran, Long Doan, Manh Luong, Dat Quoc Nguyen

    Abstract: In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-lengthed audio, English source transcript sentence, Vietnamese target subtitle sentence). We also conduct empirical experiments using strong baselines and find that the traditional "Cascaded" approach still outperforms…

    Submitted 8 August, 2022; originally announced August 2022.

    Comments: In Proceedings of INTERSPEECH 2022, to appear. The first three authors contributed equally to this work

  4. arXiv:2110.12199  [pdf, other]

    cs.CL

    PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation

    Authors: Long Doan, Linh The Nguyen, Nguyen Luong Tran, Thai Hoang, Dat Quoc Nguyen

    Abstract: We introduce a high-quality and large-scale Vietnamese-English parallel dataset of 3.02M sentence pairs, which is 2.9M pairs larger than the benchmark Vietnamese-English machine translation corpus IWSLT15. We conduct experiments comparing strong neural baselines and well-known automatic translation engines on our dataset and find that in both automatic and human evaluations: the best performance i…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: To appear in Proceedings of EMNLP 2021 (main conference). The first three authors contribute equally to this work

  5. arXiv:2109.09701  [pdf, other]

    cs.CL

    BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

    Authors: Nguyen Luong Tran, Duong Minh Le, Dat Quoc Nguyen

    Abstract: We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese. BARTpho uses the "large" architecture and the pre-training scheme of the sequence-to-sequence denoising autoencoder BART, thus it is especially suitable for generative NLP tasks. We conduct experiments to compare our BAR…

    Submitted 27 June, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: In Proceedings of INTERSPEECH 2022 (to appear)