Showing 1–23 of 23 results for author: Gui, H

  1. arXiv:2402.14710  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus

    Authors: Honghao Gui, Lin Yuan, Hongbin Ye, Ningyu Zhang, Mengshu Sun, Lei Liang, Huajun Chen

    Abstract: Large Language Models (LLMs) demonstrate remarkable potential across various domains; however, they exhibit a significant performance gap in Information Extraction (IE). Note that high-quality instruction data is the vital key for enhancing the specific capabilities of LLMs, while current IE datasets tend to be small in scale, fragmented, and lack standardized schema. To this end, we introduce IEP…

    Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: ACL 2024 (short); 21 pages; Github: https://github.com/zjunlp/IEPile

  2. arXiv:2402.04644  [pdf, other]

    cs.LG cs.AI

    LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views

    Authors: Yuji Roh, Qingyun Liu, Huan Gui, Zhe Yuan, Yujin Tang, Steven Euijong Whang, Liang Liu, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao

    Abstract: Fine-tuning is becoming widely used for leveraging the power of pre-trained foundation models in new downstream tasks. While there are many successes of fine-tuning on various tasks, recent studies have observed challenges in the generalization of fine-tuned models to unseen distributions (i.e., out-of-distribution; OOD). To improve OOD generalization, some previous studies identify the limitation…

    Submitted 18 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

  3. arXiv:2402.03049  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

    Authors: Yixin Ou, Ningyu Zhang, Honghao Gui, Ziwen Xu, Shuofei Qiao, Yida Xue, Runnan Fang, Kangwei Liu, Lei Li, Zhen Bi, Guozhou Zheng, Huajun Chen

    Abstract: In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist am…

    Submitted 23 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ACL 2024 System Demonstrations; Project website: https://zjunlp.github.io/project/EasyInstruct Code: https://github.com/zjunlp/EasyInstruct Video: https://youtu.be/rfQOWYfziFo Demo: https://huggingface.co/spaces/zjunlp/EasyInstruct

  4. arXiv:2312.03022  [pdf, other]

    cs.AI cs.CL cs.LG

    Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction

    Authors: Hongbin Ye, Honghao Gui, Aijia Zhang, Tong Liu, Wei Hua, Weiqiang Jia

    Abstract: Knowledge graph construction (KGC) is a multifaceted undertaking involving the extraction of entities, relations, and events. Traditionally, large language models (LLMs) have been viewed as solitary task-solving agents in this complex landscape. However, this paper challenges this paradigm by introducing a novel framework, CooperKGC. Departing from the conventional approach, CooperKGC establishes…

    Submitted 29 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: work in progress; 12 pages

  5. arXiv:2311.07178  [pdf, other]

    cs.AI cs.GT cs.LG

    Game Solving with Online Fine-Tuning

    Authors: Ti-Rong Wu, Hung Guei, Ting Han Wei, Chung-Chin Shih, Jui-Te Chin, I-Chen Wu

    Abstract: Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means finding the game-theoretic value (outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The AlphaZero algorithm has demonstrated super-human level play, and its powerful policy and value predictions have also served as heuristics in game solving…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  6. arXiv:2311.05884  [pdf, other]

    cs.IR cs.LG

    Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems

    Authors: Huan Gui, Ruoxi Wang, Ke Yin, Long Jin, Maciej Kula, Taibai Xu, Lichan Hong, Ed H. Chi

    Abstract: Learning feature interaction is the critical backbone to building recommender systems. In web-scale applications, learning feature interaction is extremely challenging due to the sparse and large input feature space; meanwhile, manually crafting effective feature interactions is infeasible because of the exponential solution space. We propose to leverage a Transformer-based architecture with atten…

    Submitted 10 November, 2023; originally announced November 2023.

  7. arXiv:2310.12086  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

    Authors: Xiang Chen, Duanzheng Song, Honghao Gui, Chenxi Wang, Ningyu Zhang, Yong Jiang, Fei Huang, Chengfei Lv, Dan Zhang, Huajun Chen

    Abstract: Despite their impressive generative capabilities, LLMs are hindered by fact-conflicting hallucinations in real-world applications. The accurate identification of hallucinations in texts generated by LLMs, especially in complex inferential scenarios, is a relatively unexplored area. To address this gap, we present FactCHD, a dedicated benchmark designed for the detection of fact-conflicting halluci…

    Submitted 26 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: IJCAI 2024

  8. arXiv:2310.11305  [pdf, other]

    cs.AI cs.LG

    MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

    Authors: Ti-Rong Wu, Hung Guei, Pei-Chiun Peng, Po-Wei Huang, Ting Han Wei, Chung-Chin Shih, Yun-Jui Tsai

    Abstract: This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms, including AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated super-human performance in many games, it remains unclear which among them is most suitable or efficient for specific tasks. Through MiniZero, we systematically evaluate the perfo…

    Submitted 26 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Transactions on Games

  9. arXiv:2310.03188  [pdf, other]

    cs.AI

    Talking Models: Distill Pre-trained Knowledge to Downstream Models via Interactive Communication

    Authors: Zhe Zhao, Qingyun Liu, Huan Gui, Bang An, Lichan Hong, Ed H. Chi

    Abstract: Many recent breakthroughs in machine learning have been enabled by pre-trained foundation models. By scaling up model parameters, training data, and computation resources, foundation models have significantly advanced the state-of-the-art in many applications. However, it is still an open question how to use these models to perform downstream tasks efficiently. Knowledge distillation (KD) h…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 19 pages, 3 figures

  10. arXiv:2309.00087  [pdf]

    cs.CL cs.AI cs.CY

    Large language models in medicine: the potentials and pitfalls

    Authors: Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou

    Abstract: Large language models (LLMs) have been applied to tasks in healthcare, ranging from medical exam questions to responding to patient questions. With increasing institutional partnerships between companies producing LLMs and healthcare systems, real world clinical application is coming closer to reality. As these models gain traction, it is essential for healthcare practitioners to understand what L…

    Submitted 31 August, 2023; originally announced September 2023.

    Journal ref: Ann. Intern. Med., 177(2), 210-220 (2024)

  11. arXiv:2305.13068  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    Making Language Models Better Tool Learners with Execution Feedback

    Authors: Shuofei Qiao, Honghao Gui, Chengfei Lv, Qianghuai Jia, Huajun Chen, Ningyu Zhang

    Abstract: Tools serve as pivotal interfaces that enable humans to understand and reshape the environment. With the advent of foundation models, AI systems can utilize tools to expand their capabilities and interact with the real world. Existing tool learning methodologies, encompassing supervised fine-tuning and prompt engineering approaches, often induce large language models to utilize tools indiscriminat…

    Submitted 14 March, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: NAACL 2024

  12. arXiv:2305.11527  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    InstructIE: A Bilingual Instruction-based Information Extraction Dataset

    Authors: Honghao Gui, Shuofei Qiao, Jintian Zhang, Hongbin Ye, Mengshu Sun, Lei Liang, Jeff Z. Pan, Huajun Chen, Ningyu Zhang

    Abstract: Large language models can perform well on general natural language tasks, but their effectiveness is still not optimal for information extraction. Recent works indicate that the main reason lies in the lack of extensive data on information extraction instructions. Note that the existing datasets on information extraction instructions not only have limited coverage but also involve high constructio…

    Submitted 18 April, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Work in progress; project homepage: https://www.zjukg.org/project/InstructIE/ dataset: https://huggingface.co/datasets/zjunlp/InstructIE

  13. arXiv:2305.08703  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    Schema-adaptable Knowledge Graph Construction

    Authors: Hongbin Ye, Honghao Gui, Xin Xu, Xi Chen, Huajun Chen, Ningyu Zhang

    Abstract: Conventional Knowledge Graph Construction (KGC) approaches typically follow the static information extraction paradigm with a closed set of pre-defined schema. As a result, such approaches fall short when applied to dynamic scenarios or domains where a new type of knowledge emerges. This necessitates a system that can handle evolving schema automatically to extract information for KGC. To addre…

    Submitted 15 November, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 (Findings)

  14. On Reinforcement Learning for the Game of 2048

    Authors: Hung Guei

    Abstract: 2048 is a single-player stochastic puzzle game. This intriguing and addictive game has been popular worldwide and has attracted researchers to develop game-playing programs. Due to its simplicity and complexity, 2048 has become an interesting and challenging platform for evaluating the effectiveness of machine learning methods. This dissertation conducts comprehensive research on reinforcement lea…

    Submitted 10 January, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: A Ph.D. dissertation submitted to Institute of Computer Science and Engineering, National Yang Ming Chiao Tung University

    ACM Class: I.2.6; I.2.8

  15. arXiv:2206.02541  [pdf]

    cs.CR cs.AI

    PCPT and ACPT: Copyright Protection and Traceability Scheme for DNN Models

    Authors: Xuefeng Fan, Dahao Fu, Hangyu Gui, Xinpeng Zhang, Xiaoyi Zhou

    Abstract: Deep neural networks (DNNs) have achieved tremendous success in artificial intelligence (AI) fields. However, DNN models can be easily illegally copied, redistributed, or abused by criminals, seriously damaging the interests of model inventors. The copyright protection of DNN models by neural network watermarking has been studied, but the establishment of a traceability mechanism for determining t…

    Submitted 28 November, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

  16. Optimistic Temporal Difference Learning for 2048

    Authors: Hung Guei, Lung-Pin Chen, I-Chen Wu

    Abstract: Temporal difference (TD) learning and its variants, such as multistage TD (MS-TD) learning and temporal coherence (TC) learning, have been successfully applied to 2048. These methods rely on the stochasticity of the environment of 2048 for exploration. In this paper, we propose to employ optimistic initialization (OI) to encourage exploration for 2048, and empirically show that the learning qualit…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: Accepted by the IEEE Transactions on Games, September 3, 2021

    ACM Class: I.2.6; I.2.8

  17. arXiv:1803.03370  [pdf, other]

    cs.IR cs.AI cs.CL cs.SI

    Expert Finding in Heterogeneous Bibliographic Networks with Locally-trained Embeddings

    Authors: Huan Gui, Qi Zhu, Liyuan Liu, Aston Zhang, Jiawei Han

    Abstract: Expert finding is an important task in both industry and academia. It is challenging to rank candidates with appropriate expertise for various queries. In addition, different types of objects interact with one another, which naturally forms heterogeneous information networks. We study the task of expert finding in heterogeneous bibliographical networks based on two aspects: textual content analysi…

    Submitted 8 March, 2018; originally announced March 2018.

  18. arXiv:1803.01848  [pdf, other]

    cs.SI cs.LG

    AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks

    Authors: Yu Shi, Huan Gui, Qi Zhu, Lance Kaplan, Jiawei Han

    Abstract: Heterogeneous information networks (HINs) are ubiquitous in real-world applications. Due to the heterogeneity in HINs, the typed edges may not fully align with each other. In order to capture the semantic subtlety, we propose the concept of aspects with each aspect being a unit representing one underlying semantic facet. Meanwhile, network embedding has emerged as a powerful method for learning ne…

    Submitted 5 March, 2018; originally announced March 2018.

    Comments: 11 pages including additional supplementary materials. In Proceedings of the 2018 SIAM International Conference on Data Mining, San Diego, California, USA, SIAM, 2018

  19. arXiv:1712.08357  [pdf]

    cs.IR

    Integrating Knowledge from Latent and Explicit Features for Triple Scoring - Team Radicchio's Triple Scorer at WSDM Cup 2017

    Authors: Liang-Wei Chen, Bhargav Mangipudi, Jayachandu Bandlamudi, Richa Sehgal, Yun Hao, Meng Jiang, Huan Gui

    Abstract: The objective of the triple scoring task in WSDM Cup 2017 is to compute relevance scores for knowledge-base triples of type-like relations. For example, consider Julius Caesar who has had various professions, including Politician and Author. For two given triples (Julius Caesar, profession, Politician) and (Julius Caesar, profession, Author), the former triple is likely to have a higher relevance…

    Submitted 22 December, 2017; originally announced December 2017.

    Comments: Triple Scorer at WSDM Cup 2017, see arXiv:1712.08081

    ACM Class: H.3

  20. arXiv:1712.06922  [pdf]

    cs.IR

    Wikidata Vandalism Detection - The Loganberry Vandalism Detector at WSDM Cup 2017

    Authors: Qi Zhu, Hongwei Ng, Liyuan Liu, Ziwei Ji, Bingjie Jiang, Jiaming Shen, Huan Gui

    Abstract: Wikidata is the new, large-scale knowledge base of the Wikimedia Foundation. As it can be edited by anyone, entries frequently get vandalized, leading to the possible spread of falsified information if such posts are not detected. The WSDM 2017 Wiki Vandalism Detection Challenge requires us to solve this problem by computing a vandalism score denoting the likelihood that a revisio…

    Submitted 19 December, 2017; originally announced December 2017.

    Comments: Vandalism Detector at WSDM Cup 2017, see arXiv:1712.05956

    ACM Class: H.3

  21. arXiv:1709.04109  [pdf, other]

    cs.CL cs.LG

    Empower Sequence Labeling with Task-Aware Neural Language Model

    Authors: Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han

    Abstract: Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable models without handcrafted features. However, in many cases, it is hard to obtain sufficient annotations to train these models. In this study, we develop a novel n…

    Submitted 23 November, 2017; v1 submitted 12 September, 2017; originally announced September 2017.

    Comments: AAAI 2018

  22. arXiv:1707.00166  [pdf, other]

    cs.CL

    Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach

    Authors: Liyuan Liu, Xiang Ren, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, Jiawei Han

    Abstract: Relation extraction is a fundamental task in information extraction. Most existing methods have heavy reliance on annotations labeled by human experts, which are costly and time-consuming. To overcome this drawback, we propose a novel framework, REHession, to conduct relation extractor learning using annotations from heterogeneous information sources, e.g., knowledge bases and domain heuristics. The…

    Submitted 1 August, 2017; v1 submitted 1 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017

  23. PReP: Path-Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks

    Authors: Yu Shi, Po-Wei Chan, Honglei Zhuang, Huan Gui, Jiawei Han

    Abstract: As a powerful representation paradigm for networked and multi-typed data, the heterogeneous information network (HIN) is ubiquitous. Meanwhile, defining proper relevance measures has always been a fundamental problem and of great pragmatic importance for network mining tasks. Inspired by our probabilistic interpretation of existing path-based relevance measures, we propose to study HIN relevance f…

    Submitted 20 February, 2019; v1 submitted 4 June, 2017; originally announced June 2017.

    Comments: 10 pages. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Nova Scotia, Canada, ACM, 2017