Showing 1–21 of 21 results for author: Karlsson, B F

  1. arXiv:2406.10118  [pdf, other]

    cs.CL

    SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

    Authors: Holy Lovenia, Rahmad Mahendra, Salsabil Maulana Akbar, Lester James V. Miranda, Jennifer Santoso, Elyanah Aco, Akhdan Fadhilah, Jonibek Mansurov, Joseph Marvin Imperial, Onno P. Kampman, Joel Ruben Antony Moniz, Muhammad Ravi Shulthan Habibi, Frederikus Hudi, Railey Montalan, Ryan Ignatius, Joanito Agili Lopo, William Nixon, Börje F. Karlsson, James Jaya, Ryandito Diandaru, Yuze Gao, Patrick Amadeus, Bin Wang, Jan Christian Blaise Cruz, Chenxi Whitehouse, et al. (36 additional authors not shown)

    Abstract: Southeast Asia (SEA) is a region rich in linguistic diversity and cultural variety, with over 1,300 indigenous languages and a population of 671 million people. However, prevailing AI models suffer from a significant lack of representation of texts, images, and audio datasets from SEA, compromising the quality of AI models for SEA languages. Evaluating models for SEA languages is challenging due t…

    Submitted 8 July, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: https://github.com/SEACrowd

  2. arXiv:2403.10249  [pdf, other]

    cs.AI

    A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges

    Authors: Xinrun Xu, Yuxin Wang, Chaoyi Xu, Ziluo Ding, Jiechuan Jiang, Zhiming Ding, Börje F. Karlsson

    Abstract: The swift evolution of Large-scale Models (LMs), either language-focused or multi-modal, has garnered extensive attention in both academia and industry. But despite the surge in interest in this rapidly evolving area, systematic reviews of their capabilities and potential in distinct impactful scenarios are scarce. This paper endeavours to help bridge this gap, offering a thorough examination…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 13 pages, 3 figures

  3. arXiv:2403.03186  [pdf, other]

    cs.AI

    Cradle: Empowering Foundation Agents Towards General Computer Control

    Authors: Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, et al. (3 additional authors not shown)

    Abstract: Despite the success in specific scenarios, existing foundation agents still struggle to generalize across various virtual scenarios, mainly due to the dramatically different encapsulations of environments with manually designed observation and action spaces. To handle this issue, we propose the General Computer Control (GCC) setting to restrict foundation agents to interact with software through t…

    Submitted 2 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  4. arXiv:2402.11166  [pdf, other]

    cs.CL

    GenDec: A robust generative Question-decomposition method for Multi-hop reasoning

    Authors: Jian Wu, Linyi Yang, Yuliang Ji, Wenhao Huang, Börje F. Karlsson, Manabu Okumura

    Abstract: Multi-hop QA (MHQA) involves step-by-step reasoning to answer complex questions and find multiple relevant supporting facts. However, existing large language models' (LLMs) reasoning ability in multi-hop question answering remains underexplored and is inadequate for answering multi-hop questions. Moreover, it is unclear whether LLMs follow a desired reasoning chain to reach the right final answer.…

    Submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2402.06619  [pdf, other]

    cs.CL cs.AI

    Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

    Authors: Shivalika Singh, Freddie Vargus, Daniel Dsouza, Börje F. Karlsson, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura OMahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzemiński, Hakimeh Fadaei, Irem Ergün, Ifeoma Okoh, Aisha Alaagib, Oshan Mudannayake, Zaid Alyafeai, Vu Minh Chien, Sebastian Ruder, Surya Guthikonda, et al. (8 additional authors not shown)

    Abstract: Datasets are foundational to many breakthroughs in modern artificial intelligence. Many recent achievements in the space of natural language processing (NLP) can be attributed to the finetuning of pre-trained models on a diverse set of tasks that enables a large language model (LLM) to respond to instructions. Instruction fine-tuning (IFT) requires specifically constructed and annotated datasets.…

    Submitted 9 February, 2024; originally announced February 2024.

  6. arXiv:2311.09122  [pdf, other]

    cs.CL

    Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

    Authors: Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, Yuval Pinter

    Abstract: We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingual consistent schema across 12 diverse langu…

    Submitted 29 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 Camera-ready

  7. arXiv:2311.08189  [pdf, other]

    cs.CL

    All Data on the Table: Novel Dataset and Benchmark for Cross-Modality Scientific Information Extraction

    Authors: Yuhan Li, Jian Wu, Zhiwei Yu, Börje F. Karlsson, Wei Shen, Manabu Okumura, Chin-Yew Lin

    Abstract: Extracting key information from scientific papers has the potential to help researchers work more efficiently and accelerate the pace of scientific progress. Over the last few years, research on Scientific Information Extraction (SciIE) witnessed the release of several new systems and benchmarks. However, existing paper-focused datasets mostly focus only on specific parts of a manuscript (e.g., ab…

    Submitted 17 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Work in progress; 17 pages, 6 figures, 11 tables

  8. arXiv:2309.17288  [pdf, other]

    cs.AI

    AutoAgents: A Framework for Automatic Agent Generation

    Authors: Guangyao Chen, Siwei Dong, Yu Shu, Ge Zhang, Jaward Sesay, Börje F. Karlsson, Jie Fu, Yemin Shi

    Abstract: Large language models (LLMs) have enabled remarkable advances in automated task-solving with multi-agent systems. However, most existing LLM-based multi-agent approaches rely on predefined agents to handle simple tasks, limiting the adaptability of multi-agent collaboration to different scenarios. Therefore, we introduce AutoAgents, an innovative framework that adaptively generates and coordinates…

    Submitted 29 April, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: IJCAI 2024

  9. arXiv:2307.08059  [pdf, other]

    cs.CV

    LafitE: Latent Diffusion Model with Feature Editing for Unsupervised Multi-class Anomaly Detection

    Authors: Haonan Yin, Guanlong Jiao, Qianhui Wu, Borje F. Karlsson, Biqing Huang, Chin Yew Lin

    Abstract: In the context of flexible manufacturing systems that are required to produce different types and quantities of products with minimal reconfiguration, this paper addresses the problem of unsupervised multi-class anomaly detection: develop a unified model to detect anomalies from objects belonging to multiple classes when only normal data is accessible. We first explore the generative-based approac…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: 8 pages

  10. arXiv:2305.14913  [pdf, other]

    cs.CL

    CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition

    Authors: Tingting Ma, Qianhui Wu, Huiqiang Jiang, Börje F. Karlsson, Tiejun Zhao, Chin-Yew Lin

    Abstract: Cross-lingual named entity recognition (NER) aims to train an NER system that generalizes well to a target language by leveraging labeled data in a given source language. Previous work alleviates the data scarcity problem by translating source-language labeled data or performing knowledge distillation on target-language unlabeled data. However, these methods may suffer from label noise due to the…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: ACL 2023. Our code is available at https://github.com/microsoft/vert-papers/tree/master/papers/CoLaDa

  11. arXiv:2305.14682  [pdf, other]

    cs.CL

    TACR: A Table-alignment-based Cell-selection and Reasoning Model for Hybrid Question-Answering

    Authors: Jian Wu, Yicheng Xu, Yan Gao, Jian-Guang Lou, Börje F. Karlsson, Manabu Okumura

    Abstract: Hybrid Question-Answering (HQA), which targets reasoning over tables and passages linked from table cells, has witnessed significant research in recent years. A common challenge in HQA and other passage-table QA datasets is that it is generally unrealistic to iterate over all table rows, columns, and linked passages to retrieve evidence. Such a challenge made it difficult for previous studies to s…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at Findings of ACL 2023

  12. arXiv:2303.18103  [pdf, other]

    cs.CL cs.AI

    Dataset and Baseline System for Multi-lingual Extraction and Normalization of Temporal and Numerical Expressions

    Authors: Sanxing Chen, Yongqiang Chen, Börje F. Karlsson

    Abstract: Temporal and numerical expression understanding is of great importance in many downstream Natural Language Processing (NLP) and Information Retrieval (IR) tasks. However, much previous work covers only a few sub-types and focuses only on entity extraction, which severely limits the usability of identified mentions. In order for such entities to be useful in downstream scenarios, coverage and granu…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: Technical Report

    Report number: MSR-TR-2023-9

  13. arXiv:2212.04634  [pdf, other]

    cs.CL cs.AI

    Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey

    Authors: Yuxin Wang, Jieru Lin, Zhiwei Yu, Wei Hu, Börje F. Karlsson

    Abstract: Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for gl…

    Submitted 12 September, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: Accepted in Neurocomputing

  14. arXiv:2211.11300  [pdf, other]

    cs.CL cs.AI

    Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text

    Authors: Qianhui Wu, Huiqiang Jiang, Haonan Yin, Börje F. Karlsson, Chin-Yew Lin

    Abstract: Self-supervised representation learning has proved to be a valuable component for out-of-distribution (OoD) detection with only the texts of in-distribution (ID) examples. These approaches either train a language model from scratch or fine-tune a pre-trained language model using ID examples, and then take the perplexity output by the language model as OoD scores. In this paper, we analyze the comp…

    Submitted 2 June, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: ACL 2023. Our code is available at: https://github.com/microsoft/KC/tree/main/papers/MLKD_OOD

  15. arXiv:2210.12925  [pdf, other]

    cs.CL cs.AI

    TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Bases

    Authors: Yiheng Shu, Zhiwei Yu, Yuhan Li, Börje F. Karlsson, Tingting Ma, Yuzhong Qu, Chin-Yew Lin

    Abstract: Pre-trained language models (PLMs) have shown their effectiveness in multiple scenarios. However, KBQA remains challenging, especially regarding coverage and generalization settings. This is due to two main factors: i) understanding the semantics of both questions and relevant knowledge from the KB; ii) generating executable logical forms with both semantic and syntactic correctness. In this paper…

    Submitted 23 October, 2022; originally announced October 2022.

  16. arXiv:2107.09429  [pdf, other]

    cs.CL

    BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via prior Boundary Knowledge

    Authors: Huiqiang Jiang, Guoxin Wang, Weile Chen, Chengxi Zhang, Börje F. Karlsson

    Abstract: While named entity recognition (NER) is a key task in natural language processing, most approaches only target flat entities, ignoring nested structures which are common in many scenarios. Most existing nested NER methods traverse all sub-sequences, which is both expensive and inefficient, and also do not adequately consider boundary knowledge, which is significant for nested entities. In this paper, we pr…

    Submitted 20 July, 2021; originally announced July 2021.

    Comments: Work performed at Microsoft Research Asia between 2019/2020

  17. arXiv:2106.02300  [pdf, other]

    cs.CL

    AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER

    Authors: Weile Chen, Huiqiang Jiang, Qianhui Wu, Börje F. Karlsson, Yi Guan

    Abstract: Neural methods have been shown to achieve high performance in Named Entity Recognition (NER), but rely on costly high-quality labeled data for training, which is not always available across languages. While previous works have shown that unlabeled data in a target language can be used to improve cross-lingual model performance, we propose a novel adversarial approach (AdvPicker) to better leverage…

    Submitted 7 June, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: This paper has been accepted at ACL-IJCNLP 2021

  18. arXiv:2007.07683  [pdf, other]

    cs.CL

    UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

    Authors: Qianhui Wu, Zijia Lin, Börje F. Karlsson, Biqing Huang, Jian-Guang Lou

    Abstract: Prior works in cross-lingual named entity recognition (NER) with no/little labeled data fall into two primary categories: model transfer based and data transfer based methods. In this paper we find that both method types can complement each other, in the sense that, the former can exploit context information via language-independent features but sees no task-specific information in the target lang…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: This paper is accepted by IJCAI 2020. Code is available at https://github.com/microsoft/vert-papers/tree/master/papers/UniTrans

    Journal ref: In IJCAI, pages 3926-3932, 2020

  19. arXiv:2004.12440  [pdf, other]

    cs.CL

    Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

    Authors: Qianhui Wu, Zijia Lin, Börje F. Karlsson, Jian-Guang Lou, Biqing Huang

    Abstract: To better tackle the named entity recognition (NER) problem on languages with little/no labeled data, cross-lingual NER must effectively leverage knowledge learned from source languages with rich labeled data. Previous works on cross-lingual NER are mostly based on label projection with pairwise texts or direct model transfer. However, such methods either are not applicable if the labeled data in…

    Submitted 15 July, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: This paper is accepted by ACL2020. Code is available at https://github.com/microsoft/vert-papers/tree/master/papers/SingleMulti-TS

    Journal ref: In ACL, pages 6505-6514, 2020

  20. arXiv:1911.06161  [pdf, ps, other]

    cs.CL

    Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

    Authors: Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin

    Abstract: For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer from source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveragin…

    Submitted 15 July, 2020; v1 submitted 14 November, 2019; originally announced November 2019.

    Comments: This paper is accepted by AAAI2020. Code is available at https://github.com/microsoft/vert-papers/tree/master/papers/Meta-Cross

    Journal ref: In AAAI, pages 9274-9281, 2020

  21. arXiv:1904.02141  [pdf, other]

    cs.CL

    CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

    Authors: Yuying Zhu, Guoxin Wang, Börje F. Karlsson

    Abstract: Named entity recognition (NER) in Chinese is essential but difficult because of the lack of natural delimiters. Therefore, Chinese Word Segmentation (CWS) is usually considered as the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) words. In this paper, we investigate a Convolutional A…

    Submitted 15 July, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: This paper is accepted by NAACL-HLT 2019. The code is available at https://github.com/microsoft/vert-papers/tree/master/papers/CAN-NER