Showing 1–19 of 19 results for author: Ning, K

  1. arXiv:2402.15198  [pdf, other]

    cs.LG

    Bidirectional Uncertainty-Based Active Learning for Open Set Annotation

    Authors: Chen-Chen Zong, Ye-Wen Wang, Kun-Peng Ning, Hai-Bo Ye, Sheng-Jun Huang

    Abstract: Active learning (AL) in open set scenarios presents a novel challenge of identifying the most valuable examples in an unlabeled data pool that comprises data from both known and unknown classes. Traditional methods prioritize selecting informative examples with low confidence, with the risk of mistakenly selecting unknown-class examples with similarly low confidence. Recent methods favor the most…

    Submitted 6 July, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2402.01830  [pdf, other]

    cs.CL cs.AI cs.LG

    PiCO: Peer Review in LLMs based on the Consistency Optimization

    Authors: Kun-Peng Ning, Shuo Yang, Yu-Yang Liu, Jia-Yu Yao, Zhen-Hui Liu, Yu Wang, Ming Pang, Li Yuan

    Abstract: Existing large language models (LLMs) evaluation methods typically focus on testing the performance on some closed-environment and domain-specific benchmarks with human annotations. In this paper, we explore a novel unsupervised evaluation direction, utilizing peer-review mechanisms to measure LLMs automatically. In this setting, both open-source and closed-source LLMs lie in the same environment,…

    Submitted 20 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  3. arXiv:2311.10372  [pdf, other]

    cs.SE

    A Survey of Large Language Models for Code: Evolution, Benchmarking, and Future Trends

    Authors: Zibin Zheng, Kaiwen Ning, Yanlin Wang, Jingwen Zhang, Dewu Zheng, Mingxi Ye, Jiachi Chen

    Abstract: General large language models (LLMs), represented by ChatGPT, have demonstrated significant potential in tasks such as code generation in software engineering. This has led to the development of specialized LLMs for software engineering, known as Code LLMs. A considerable portion of Code LLMs is derived from general LLMs through model fine-tuning. As a result, Code LLMs are often updated frequentl…

    Submitted 8 January, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

  4. arXiv:2310.01469  [pdf, other]

    cs.CL cs.AI

    LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples

    Authors: Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, Li Yuan

    Abstract: Large Language Models (LLMs), including GPT-3.5, LLaMA, and PaLM, seem to be knowledgeable and able to adapt to many tasks. However, we still cannot completely trust their answers, since LLMs suffer from hallucination--fabricating non-existent facts to cheat users without perception. And the reasons for their existence and pervasiveness remain unclear. In this paper, we demonstrate that non-sense…

    Submitted 4 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  5. arXiv:2308.11396  [pdf, other]

    cs.SE

    Towards an Understanding of Large Language Models in Software Engineering Tasks

    Authors: Zibin Zheng, Kaiwen Ning, Jiachi Chen, Yanlin Wang, Wenqing Chen, Lianghong Guo, Weicheng Wang

    Abstract: Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in tasks such as text generation and reasoning. Derivative products, like ChatGPT, have been extensively deployed and highly sought after. Meanwhile, the evaluation and optimization of LLMs in software engineering tasks, such as code generation, have become a research focus. However, there…

    Submitted 22 August, 2023; originally announced August 2023.

  6. arXiv:2308.04779  [pdf, other]

    cs.CV cs.AI

    Multi-View Fusion and Distillation for Subgrade Distresses Detection based on 3D-GPR

    Authors: Chunpeng Zhou, Kangjie Ning, Haishuai Wang, Zhi Yu, Sheng Zhou, Jiajun Bu

    Abstract: The application of 3D ground-penetrating radar (3D-GPR) for subgrade distress detection has gained widespread popularity. To enhance the efficiency and accuracy of detection, pioneering studies have attempted to adopt automatic detection techniques, particularly deep learning. However, existing works typically rely on traditional 1D A-scan, 2D B-scan or 3D C-scan data of the GPR, resulting in eith…

    Submitted 9 August, 2023; originally announced August 2023.

  7. arXiv:2308.01098  [pdf, other]

    cs.IR cs.AI

    Towards Better Query Classification with Multi-Expert Knowledge Condensation in JD Ads Search

    Authors: Kun-Peng Ning, Ming Pang, Zheng Fang, Xue Jiang, Xi-Wei Zhao, Chang-Ping Peng, Zhan-Gang Lin, Jing-He Hu, Jing-Ping Shao

    Abstract: Search query classification, as an effective way to understand user intents, is of great importance in real-world online ads systems. To ensure a lower latency, a shallow model (e.g. FastText) is widely used for efficient online inference. However, the representation ability of the FastText model is insufficient, resulting in poor classification performance, especially on some low-frequency querie…

    Submitted 19 November, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  8. arXiv:2201.06758  [pdf, other]

    cs.LG

    Active Learning for Open-set Annotation

    Authors: Kun-Peng Ning, Xun Zhao, Yu Li, Sheng-Jun Huang

    Abstract: Existing active learning studies typically work in the closed-set setting by assuming that all data examples to be labeled are drawn from known classes. However, in real annotation tasks, the unlabeled data usually contains a large amount of examples from unknown classes, resulting in the failure of most active learning methods. To tackle this open-set annotation (OSA) problem, we propose a new ac…

    Submitted 18 January, 2022; originally announced January 2022.

  9. arXiv:2103.14824  [pdf, other]

    cs.LG

    Improving Model Robustness by Adaptively Correcting Perturbation Levels with Active Queries

    Authors: Kun-Peng Ning, Lue Tao, Songcan Chen, Sheng-Jun Huang

    Abstract: In addition to high accuracy, robustness is becoming increasingly important for machine learning models in various applications. Recently, much research has been devoted to improving the model robustness by training with noise perturbations. Most existing studies assume a fixed perturbation level for all training examples, which however hardly holds in real tasks. In fact, excessive perturbations…

    Submitted 27 March, 2021; originally announced March 2021.

    Comments: To be published in AAAI-21

  10. arXiv:2103.14823  [pdf, other]

    cs.LG cs.AI

    Co-Imitation Learning without Expert Demonstration

    Authors: Kun-Peng Ning, Hu Xu, Kun Zhu, Sheng-Jun Huang

    Abstract: Imitation learning is a primary approach to improve the efficiency of reinforcement learning by exploiting the expert demonstrations. However, in many real scenarios, obtaining expert demonstrations could be extremely expensive or even impossible. To overcome this challenge, in this paper, we propose a novel learning framework called Co-Imitation Learning (CoIL) to exploit the past good experience…

    Submitted 23 July, 2023; v1 submitted 27 March, 2021; originally announced March 2021.

  11. arXiv:2006.07808  [pdf, other]

    cs.LG stat.ML

    Reinforcement Learning with Supervision from Noisy Demonstrations

    Authors: Kun-Peng Ning, Sheng-Jun Huang

    Abstract: Reinforcement learning has achieved great success in various applications. To learn an effective policy for the agent, it usually requires a huge amount of data by interacting with the environment, which could be computationally costly and time-consuming. To overcome this challenge, the framework called Reinforcement Learning with Expert Demonstrations (RLED) was proposed to exploit the supervision…

    Submitted 14 June, 2020; originally announced June 2020.

  12. arXiv:1808.08803  [pdf, other]

    cs.CV

    Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions

    Authors: Ke Ning, Linchao Zhu, Ming Cai, Yi Yang, Di Xie, Fei Wu

    Abstract: We propose a novel attentive sequence to sequence translator (ASST) for clip localization in videos by natural language descriptions. We make two contributions. First, we propose a bi-directional Recurrent Neural Network (RNN) with a finely calibrated vision-language attentive mechanism to comprehensively understand the free-formed natural language descriptions. The RNN parses natural language des…

    Submitted 27 August, 2018; originally announced August 2018.

  13. arXiv:1412.7384  [pdf]

    q-bio.QM cs.CE cs.LG q-bio.GN

    Microbial community pattern detection in human body habitats via ensemble clustering framework

    Authors: Peng Yang, Xiaoquan Su, Le Ou-Yang, Hon-Nian Chua, Xiao-Li Li, Kang Ning

    Abstract: The human habitat is a host where microbial species evolve, function, and continue to evolve. Elucidating how microbial communities respond to human habitats is a fundamental and critical task, as establishing baselines of human microbiome is essential in understanding its role in human disease and health. However, current studies usually overlook a complex and interconnected landscape of human mi…

    Submitted 4 January, 2015; v1 submitted 21 December, 2014; originally announced December 2014.

    Comments: BMC Systems Biology 2014

    Journal ref: BMC Systems Biology 2014, 8(Suppl 4):S7

  14. arXiv:1306.4253  [pdf, ps, other]

    cs.DM

    Systematic assessment of the expected length, variance and distribution of Longest Common Subsequences

    Authors: Kang Ning, Kwok Pui Choi

    Abstract: The Longest Common Subsequence (LCS) problem is a very important problem in mathematics, which has broad applications in scheduling problems, physics and bioinformatics. It is known that, given two random sequences of infinite length, the expected length of the LCS will be a constant. However, the value of this constant is not yet known. Moreover, the variance distribution of LCS length is also…

    Submitted 18 June, 2013; originally announced June 2013.

  15. arXiv:1004.5436  [pdf]

    cs.DM q-bio.QM

    Multiple oligo nucleotide arrays: Methods to reduce manufacture time and cost

    Authors: Kang Ning

    Abstract: Customized multiple arrays are becoming widely used in microarray experiments for various purposes, mainly for their ability to handle a large quantity of data and output high quality results. However, experimenters who use customized multiple arrays still face many problems, such as the cost and time to manufacture the masks, and the cost for production of the multiple arrays by costly machines.…

    Submitted 29 April, 2010; originally announced April 2010.

    Comments: 11 pages, 7 figures. A simple method targets some researchers in the field.

  16. arXiv:0904.1242  [pdf]

    cs.DS cs.DC cs.DM

    The Distribution and Deposition Algorithm for Multiple Sequences Sets

    Authors: Kang Ning, Hon Wai Leong

    Abstract: Sequences set is a mathematical model used in many applications. As the number of the sequences becomes larger, single sequence set model is not appropriate for the rapidly increasing problem sizes. For example, more and more text processing applications separate a single big text file into multiple files before processing. For these applications, the underlying mathematical model is multiple seq…

    Submitted 29 April, 2010; v1 submitted 7 April, 2009; originally announced April 2009.

    Comments: 15 pages, 7 figures, extended version of conference paper presented at GIW 2006, revised version accepted by Journal of Combinatorial Optimization.

  17. A Pseudo DNA Cryptography Method

    Authors: Kang Ning

    Abstract: The DNA cryptography is a new and very promising direction in cryptography research. DNA can be used in cryptography for storing and transmitting the information, as well as for computation. Although in its primitive stage, DNA cryptography is shown to be very effective. Currently, several DNA computing algorithms are proposed for quite some cryptography, cryptanalysis and steganography problems…

    Submitted 16 March, 2009; originally announced March 2009.

    Comments: A small work that quite some people asked about

  18. arXiv:0903.2310  [pdf]

    cs.DS cs.DM cs.IR cs.OH q-bio.QM

    Analysis of the Relationships among Longest Common Subsequences, Shortest Common Supersequences and Patterns and its application on Pattern Discovery in Biological Sequences

    Authors: Kang Ning, Hoong Kee Ng, Hon Wai Leong

    Abstract: For a set of multiple sequences, their patterns, Longest Common Subsequences (LCS) and Shortest Common Supersequences (SCS) represent different aspects of these sequences' profile, and they can all be used for biological sequence comparisons and analysis. Revealing the relationship between the patterns and LCS, SCS might provide us with a deeper view of the patterns of biological sequences, in turn…

    Submitted 13 March, 2009; originally announced March 2009.

    Comments: Extended version of paper presented in IEEE BIBE 2006 submitted to journal for review

  19. arXiv:0903.2015  [pdf]

    cs.DS cs.DM math.CO

    Deposition and Extension Approach to Find Longest Common Subsequence for Multiple Sequences

    Authors: Kang Ning

    Abstract: The problem of finding the longest common subsequence (LCS) for a set of sequences is a very interesting and challenging problem in computer science. This problem is NP-complete, but because of its importance, many heuristic algorithms have been proposed, such as Long Run algorithm and Expansion algorithm. However, the performance of many current heuristic algorithms deteriorates fast when the…

    Submitted 29 June, 2009; v1 submitted 11 March, 2009; originally announced March 2009.

    Comments: 25 pages, 6 figures. Ready to be submitted