Showing 1–45 of 45 results for author: Zhu, K Q

  1. arXiv:2402.18409  [pdf, other]

    cs.AI cs.CL cs.CV

    A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models

    Authors: Xiujie Song, Mengyue Wu, Kenny Q. Zhu, Chunhao Zhang, Yanyi Chen

    Abstract: Large Vision-Language Models (LVLMs), despite their recent success, have rarely been comprehensively tested for their cognitive abilities. Inspired by the prevalent use of the "Cookie Theft" task in human cognition tests, we propose a novel evaluation benchmark to evaluate the high-level cognitive ability of LVLMs using images with rich semantics. It defines eight reasoning capabilities and consists of an im…

    Submitted 14 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  2. arXiv:2402.15985  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Phonetic and Lexical Discovery of a Canine Language using HuBERT

    Authors: Xingyuan Li, Sinong Wang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: This paper delves into the pioneering exploration of potential communication patterns within dog vocalizations and transcends traditional linguistic analysis barriers, which heavily relies on human a priori knowledge on limited datasets to find sound units in dog vocalization. We present a self-supervised approach with HuBERT, enabling the accurate classification of phoneme labels and the identifica…

    Submitted 24 February, 2024; originally announced February 2024.

  3. arXiv:2402.06262  [pdf, other]

    cs.CL cs.AI

    On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference

    Authors: Siyu Ren, Kenny Q. Zhu

    Abstract: Despite the recent success associated with Large Language Models (LLMs), they are notably cost-prohibitive to deploy in resource-constrained environments due to their excessive memory and computational demands. In addition to model parameters, the key-value cache is also stored in GPU memory, growing linearly with batch size and sequence length. As a remedy, recent works have proposed various evic…

    Submitted 17 February, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

  4. arXiv:2311.09189  [pdf, other]

    cs.CL

    PsyEval: A Suite of Mental Health Related Tasks for Evaluating Large Language Models

    Authors: Haoan Jin, Siyuan Chen, Dilawaier Dilixiati, Yewei Jiang, Mengyue Wu, Kenny Q. Zhu

    Abstract: Evaluating Large Language Models (LLMs) in the mental health domain poses distinct challenges from other domains, given the subtle and highly subjective nature of symptoms that exhibit significant variability among individuals. This paper presents PsyEval, the first comprehensive suite of mental health-related tasks for evaluating LLMs. PsyEval encompasses five sub-tasks that evaluate three critic…

    Submitted 3 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  5. arXiv:2310.11648  [pdf, other]

    cs.CL

    Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model

    Authors: Qi Jia, Siyu Ren, Yizhu Liu, Kenny Q. Zhu

    Abstract: Despite tremendous improvements in natural language generation, summarization models still suffer from the unfaithfulness issue. Previous work evaluates faithfulness either using models trained on other tasks or in-domain synthetic data, or by prompting a large model such as ChatGPT. This paper proposes to do zero-shot faithfulness evaluation simply with a moderately-sized foundation language mod…

    Submitted 14 December, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP2023

  6. arXiv:2310.08152  [pdf, other]

    cs.CL

    Context Compression for Auto-regressive Transformers with Sentinel Tokens

    Authors: Siyu Ren, Qi Jia, Kenny Q. Zhu

    Abstract: The quadratic complexity of the attention module makes it gradually become the bulk of compute in Transformer-based LLMs during generation. Moreover, the excessive key-value cache that arises when dealing with long inputs also brings severe issues on memory footprint and inference latency. In this work, we propose a plug-and-play approach that is able to incrementally compress the intermediate act…

    Submitted 15 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: To appear at EMNLP 2023 main conference

  7. arXiv:2310.04691  [pdf, other]

    cs.CL

    EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling

    Authors: Siyu Ren, Zhiyong Wu, Kenny Q. Zhu

    Abstract: Neural language models are probabilistic models of human text. They are predominantly trained using maximum likelihood estimation (MLE), which is equivalent to minimizing the forward cross-entropy between the empirical data distribution and the model distribution. However, various degeneration phenomena are still widely observed when decoding from the distributions learned by such models. We estab…

    Submitted 1 February, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: To appear at ICLR 2024

  8. arXiv:2306.14152  [pdf, other]

    cs.CL

    Low-Rank Prune-And-Factorize for Language Model Compression

    Authors: Siyu Ren, Kenny Q. Zhu

    Abstract: The components underpinning PLMs -- large weight matrices -- were shown to bear considerable redundancy. Matrix factorization, a well-established technique from matrix theory, has been utilized to reduce the number of parameters in PLMs. However, it fails to retain satisfactory performance under moderate to high compression rates. In this paper, we identify the full-rankness of fine-tuned P…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: Model Compression

  9. arXiv:2305.13833  [pdf, other]

    cs.CL

    Reducing Sensitivity on Speaker Names for Text Generation from Dialogues

    Authors: Qi Jia, Haifeng Tang, Kenny Q. Zhu

    Abstract: Changing speaker names consistently throughout a dialogue should not affect its meaning and corresponding outputs for text generation from dialogues. However, pre-trained language models, serving as the backbone for dialogue-processing tasks, have been shown to be sensitive to nuances. This may result in unfairness in real-world applications. No comprehensive analysis of this problem has been done in t…

    Submitted 20 August, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Findings of ACL'23

  10. arXiv:2305.13614  [pdf, other]

    cs.CL

    LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation

    Authors: Siyuan Chen, Mengyue Wu, Kenny Q. Zhu, Kunyao Lan, Zhiling Zhang, Lyuchun Cui

    Abstract: Empowering chatbots in the field of mental health is receiving an increasing amount of attention, yet there is still little exploration of developing and evaluating chatbots in psychiatric outpatient scenarios. In this work, we focus on exploring the potential of ChatGPT in powering chatbots for psychiatrist and patient simulation. We collaborate with psychiatrists to identify objectives and iterativel…

    Submitted 22 May, 2023; originally announced May 2023.

  11. arXiv:2305.12394  [pdf, other]

    cs.CL

    Pruning Pre-trained Language Models with Principled Importance and Self-regularization

    Authors: Siyu Ren, Kenny Q. Zhu

    Abstract: Iterative pruning is one of the most effective compression methods for pre-trained language models. We discovered that finding the optimal pruning decision is an equality-constrained 0-1 Integer Linear Programming problem. The solution to this optimization problem leads to a principled importance criterion which we use to rank parameters during iterative model pruning. To mitigate the poor general…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted at Findings of ACL 2023

  12. arXiv:2305.02820  [pdf, other]

    cs.CL

    Semantic Space Grounded Weighted Decoding for Multi-Attribute Controllable Dialogue Generation

    Authors: Zhiling Zhang, Mengyue Wu, Kenny Q. Zhu

    Abstract: Controlling chatbot utterance generation with multiple attributes such as personalities, emotions and dialogue acts is a practically useful but under-studied problem. We propose a novel framework called DASC that possesses strong controllability with a weighted decoding paradigm, while improving generation quality with the grounding in an attribute semantics space. Generation with multiple attribu…

    Submitted 5 November, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  13. arXiv:2303.07902  [pdf, other]

    cs.SD eess.AS

    BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data

    Authors: Xuenan Xu, Zhiling Zhang, Zelin Zhou, Pingyue Zhang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: Compared with ample visual-text pre-training research, few works explore audio-text pre-training, mostly due to the lack of sufficient parallel audio-text data. Most existing methods incorporate the visual modality as a pivot for audio-text pre-training, which inevitably induces data noise. In this paper, we propose to utilize audio captioning to generate text directly from audio, without the aid…

    Submitted 5 March, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

  14. arXiv:2211.11297  [pdf, other]

    cs.CL

    In-sample Curriculum Learning by Sequence Completion for Natural Language Generation

    Authors: Qi Jia, Yizhu Liu, Haifeng Tang, Kenny Q. Zhu

    Abstract: Curriculum learning has shown promising improvements in multiple domains by training machine learning models from easy samples to hard ones. Previous works, which either design rules or train models for scoring the difficulty, rely heavily on task-specific expertise and cannot generalize. Inspired by the "easy-to-hard" intuition, we propose to do in-sample curriculum learning for natural language ge…

    Submitted 23 May, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: ACL'23

  15. arXiv:2210.09894  [pdf, other]

    cs.CL

    Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions

    Authors: Qi Jia, Yizhu Liu, Siyu Ren, Kenny Q. Zhu

    Abstract: Abstractive dialogue summarization aims to generate a concise and fluent summary covering the salient information in a dialogue among two or more interlocutors. It has attracted great attention in recent years based on the massive emergence of social communication platforms and an urgent requirement for efficient dialogue information understanding and digestion. Different from news or articles in tr…

    Submitted 6 August, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: Under review at ACM Computing Surveys (CSUR), submitted in January 2022

  16. arXiv:2205.11308  [pdf, other]

    cs.CL

    Symptom Identification for Interpretable Detection of Multiple Mental Disorders

    Authors: Zhiling Zhang, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu

    Abstract: Mental disease detection (MDD) from social media has suffered from poor generalizability and interpretability, due to a lack of symptom modeling. This paper introduces PsySym, the first annotated symptom identification corpus of multiple psychiatric disorders, to facilitate further research progress. PsySym is annotated according to a knowledge graph of the 38 symptom classes related to 7 mental dis…

    Submitted 23 May, 2022; originally announced May 2022.

  17. arXiv:2205.10593  [pdf, other]

    cs.CL cs.AI

    Few-Shot Natural Language Inference Generation with PDD: Prompt and Dynamic Demonstration

    Authors: Kaijian Li, Shansan Gong, Kenny Q. Zhu

    Abstract: The Natural Language Inference Generation task is to generate a text hypothesis given a text premise and a logical relation between the two. This task can be used in data augmentation and controllable text generation in practice. In this paper, we propose language models with prompt and dynamic demonstration (LM-PDD) to tackle this problem in few-shot settings. Our framework outperforms standard fine-…

    Submitted 21 May, 2022; originally announced May 2022.

    Comments: 13 pages

  18. arXiv:2205.09497  [pdf, other]

    cs.CL

    Psychiatric Scale Guided Risky Post Screening for Early Detection of Depression

    Authors: Zhiling Zhang, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu

    Abstract: Depression is a prominent health challenge to the world, and early risk detection (ERD) of depression from online posts can be a promising technique for combating the threat. Early depression detection faces the challenge of efficiently tackling streaming data, balancing the tradeoff between timeliness, accuracy and explainability. To tackle these challenges, we propose a psychiatric scale guided…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: IJCAI 2022 AI for Good

  19. Positive, Negative and Neutral: Modeling Implicit Feedback in Session-based News Recommendation

    Authors: Shansan Gong, Kenny Q. Zhu

    Abstract: News recommendation for anonymous readers is a useful but challenging task for many news portals, where interactions between readers and articles are limited within a temporary login session. Previous works tend to formulate session-based recommendation as a next item prediction task, while they neglect the implicit feedback from user behaviors, which indicates what users really like or dislike. H…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: Accepted by SIGIR 2022 main conference

    ACM Class: H.3.3

  20. arXiv:2204.13913  [pdf, ps, other]

    cs.CV cs.CL

    Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval

    Authors: Siyu Ren, Kenny Q. Zhu

    Abstract: Current text-image approaches (e.g., CLIP) typically adopt a dual-encoder architecture using pre-trained vision-language representation. However, these models still pose non-trivial memory requirements and substantial incremental indexing time, which makes them less practical on mobile devices. In this paper, we present an effective two-stage framework to compress large pre-trained dual-encoder for…

    Submitted 29 April, 2022; originally announced April 2022.

    Comments: Accepted by NAACL 2022 main conference

  21. arXiv:2204.13498  [pdf, other]

    cs.CL

    Post-Training Dialogue Summarization using Pseudo-Paraphrasing

    Authors: Qi Jia, Yizhu Liu, Haifeng Tang, Kenny Q. Zhu

    Abstract: Previous dialogue summarization techniques adapt large language models pretrained on narrative text by injecting dialogue-specific features into the models. These features either require additional knowledge to recognize or make the resulting models harder to tune. To bridge the format gap between dialogues and narrative summaries in dialogue summarization tasks, we propose to post-train pretr…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Findings of NAACL 2022

  22. arXiv:2201.10376  [pdf, other]

    cs.CL

    Modeling Multi-level Context for Informational Bias Detection by Contrastive Learning and Sentential Graph Network

    Authors: Shijia Guo, Kenny Q. Zhu

    Abstract: Informational bias is widely present in news articles. It refers to providing one-sided, selective or suggestive information on specific aspects of a certain entity to guide a specific interpretation, thereby biasing the reader's opinion. Sentence-level informational bias detection is a very challenging task because such bias can only be revealed together with the context; examples include col…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 10 pages including bibliography

  23. arXiv:2110.04684  [pdf, other]

    cs.SD cs.CL eess.AS

    Can Audio Captions Be Evaluated with Image Caption Metrics?

    Authors: Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

    Abstract: Automated audio captioning aims at generating textual descriptions for an audio clip. To evaluate the quality of generated audio captions, previous works directly adopt image captioning metrics like SPICE and CIDEr, without justifying their suitability in this new domain, which may mislead the development of advanced models. This problem is still unstudied due to the lack of human judgment dataset…

    Submitted 27 January, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: ICASSP 2022

  24. Enriching Ontology with Temporal Commonsense for Low-Resource Audio Tagging

    Authors: Zhiling Zhang, Zelin Zhou, Haifeng Tang, Guangwei Li, Mengyue Wu, Kenny Q. Zhu

    Abstract: Audio tagging aims at predicting sound events that occur in a recording. Traditional models require an enormous amount of laborious annotation; otherwise, performance degradation is the norm. Therefore, we investigate robust audio tagging models in low-resource scenarios with the enhancement of knowledge graphs. Besides existing ontological knowledge, we further propose a semi-automatic approach that can c…

    Submitted 3 October, 2021; originally announced October 2021.

    Comments: CIKM 2021

  25. Diverse and Specific Clarification Question Generation with Keywords

    Authors: Zhiling Zhang, Kenny Q. Zhu

    Abstract: Product descriptions on e-commerce websites often suffer from missing important aspects. Clarification question generation (CQGen) can be a promising approach to help alleviate the problem. Unlike traditional QGen assuming the existence of answers in the context and generating questions accordingly, CQGen mimics user behaviors of asking for unstated information. The generated CQs can serve as a sa…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 11 pages, 3 figures, WWW 2021

  26. arXiv:2102.04632  [pdf, ps, other]

    cs.CL cs.AI

    Statistically Profiling Biases in Natural Language Reasoning Datasets and Models

    Authors: Shanshan Huang, Kenny Q. Zhu

    Abstract: Recent work has indicated that many natural language understanding and reasoning datasets contain statistical cues that may be taken advantage of by NLP models whose capability may thus be grossly overestimated. To discover the potential weakness in the models, some human-designed stress tests have been proposed but they are expensive to create and do not generalize to arbitrary models. We propos…

    Submitted 8 February, 2021; originally announced February 2021.

  27. arXiv:2012.02553  [pdf, other]

    cs.CL

    DDRel: A New Dataset for Interpersonal Relation Classification in Dyadic Dialogues

    Authors: Qi Jia, Hongru Huang, Kenny Q. Zhu

    Abstract: Interpersonal language style shifting in dialogues is an interesting and almost instinctive human ability. Understanding interpersonal relationships from language content is also a crucial step toward further understanding dialogues. Previous work mainly focuses on relation extraction between named entities in texts. In this paper, we propose the task of relation classification of interlocutors…

    Submitted 4 December, 2020; originally announced December 2020.

    Comments: This paper has been accepted by AAAI2021

  28. arXiv:2010.01502  [pdf, other]

    cs.CL

    Multi-turn Response Selection using Dialogue Dependency Relations

    Authors: Qi Jia, Yizhu Liu, Siyu Ren, Kenny Q. Zhu, Haifeng Tang

    Abstract: Multi-turn response selection is a task designed for developing dialogue agents. Performance on this task has improved remarkably with pre-trained language models. However, these models simply concatenate the turns in dialogue history as the input and largely ignore the dependencies between the turns. In this paper, we propose a dialogue extraction algorithm to transform a dialogue histor…

    Submitted 30 November, 2023; v1 submitted 4 October, 2020; originally announced October 2020.

    Comments: Accepted for publication as a long paper in EMNLP2020

    Journal ref: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  29. Matching Questions and Answers in Dialogues from Online Forums

    Authors: Qi Jia, Mengxue Zhang, Shengyao Zhang, Kenny Q. Zhu

    Abstract: Matching question-answer relations between two turns in conversations is not only the first step in analyzing dialogue structures, but also valuable for training dialogue systems. This paper presents a QA matching model considering both distance information and dialogue history by two simultaneous attention mechanisms called mutual attention. Given scores computed by the trained model between each…

    Submitted 2 August, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: Accepted at ECAI2020

    Journal ref: ECAI 2020: 24th European Conference on Artificial Intelligence

  30. arXiv:2004.14164  [pdf, other]

    cs.CL cs.LG stat.ML

    MICK: A Meta-Learning Framework for Few-shot Relation Classification with Small Training Data

    Authors: Xiaoqing Geng, Xiwen Chen, Kenny Q. Zhu, Libin Shen, Yinggong Zhao

    Abstract: Few-shot relation classification seeks to classify incoming query instances after meeting only a few support instances. This ability is gained by training with a large amount of in-domain annotated data. In this paper, we tackle an even harder problem by further limiting the amount of data available at training time. We propose a few-shot learning framework for relation classification, which is partic…

    Submitted 14 December, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Journal ref: CIKM 2020: The 29th ACM International Conference on Information and Knowledge Management

  31. arXiv:2004.11742  [pdf, other]

    cs.CL

    ST$^2$: Small-data Text Style Transfer via Multi-task Meta-Learning

    Authors: Xiwen Chen, Kenny Q. Zhu

    Abstract: Text style transfer aims to paraphrase a sentence in one style into another style while preserving content. Due to the lack of parallel training data, state-of-the-art methods are unsupervised and rely on large datasets that share content. Furthermore, existing methods have been applied to very limited categories of styles such as positive/negative and formal/informal. In this work, we develop a meta-lear…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: 9 pages, 11 figures

  32. arXiv:2004.09853  [pdf, ps, other]

    cs.CL cs.AI

    Knowledge-Driven Distractor Generation for Cloze-style Multiple Choice Questions

    Authors: Siyu Ren, Kenny Q. Zhu

    Abstract: In this paper, we propose a novel configurable framework to automatically generate distractive choices for open-domain cloze-style multiple-choice questions, which incorporates a general-purpose knowledge base to effectively create a small distractor candidate set, and a feature-rich learning-to-rank model to select distractors that are both plausible and reliable. Experimental results on datasets…

    Submitted 7 December, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: To appear at AAAI 2021

  33. arXiv:2004.01358  [pdf, other]

    cs.LG stat.ML

    Unpack Local Model Interpretation for GBDT

    Authors: Wenjing Fang, Jun Zhou, Xiaolong Li, Kenny Q. Zhu

    Abstract: A gradient boosting decision tree (GBDT), which aggregates a collection of single weak learners (i.e. decision trees), is widely used for data mining tasks. Because GBDT inherits the good performance from its ensemble essence, much attention has been drawn to the optimization of this model. With its popularization, an increasing need for model interpretation arises. Besides the commonly used featu…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: 12 pages, 5 figures

  34. arXiv:2003.13230  [pdf, ps, other]

    cs.IR cs.CL

    AliCoCo: Alibaba E-commerce Cognitive Concept Net

    Authors: Xusheng Luo, Luxin Liu, Yonghua Yang, Le Bo, Yuanpeng Cao, Jinhang Wu, Qiang Li, Keping Yang, Kenny Q. Zhu

    Abstract: One of the ultimate goals of e-commerce platforms is to satisfy various shopping needs for their customers. Much effort is devoted to creating taxonomies or ontologies in e-commerce towards this goal. However, user needs in e-commerce are still not well defined, and none of the existing ontologies has enough depth and breadth for universal user needs understanding. The semantic gap in-betwee…

    Submitted 30 March, 2020; originally announced March 2020.

    Comments: 15 pages. Accepted by SIGMOD 2020 Industry Track

  35. arXiv:2002.11982  [pdf, other]

    cs.LG stat.ML

    Adapted tree boosting for Transfer Learning

    Authors: Wenjing Fang, Chaochao Chen, Bowen Song, Li Wang, Jun Zhou, Kenny Q. Zhu

    Abstract: Secure online transaction is an essential task for e-commerce platforms. Alipay, one of the world's leading cashless payment platforms, provides payment services to both merchants and individual customers. The fraud detection models are built to protect the customers, but stronger demands are raised by the new scenes, which are lacking in training data and labels. The proposed model makes a diff…

    Submitted 2 April, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    ACM Class: I.2.6

  36. arXiv:1910.10893  [pdf, other]

    cs.CL

    Low-Resource Sequence Labeling via Unsupervised Multilingual Contextualized Representations

    Authors: Zuyi Bao, Rui Huang, Chen Li, Kenny Q. Zhu

    Abstract: Previous work on cross-lingual sequence labeling tasks either requires parallel data or bridges the two languages through word-by-word matching. Such requirements and assumptions are infeasible for most languages, especially for languages with large linguistic distances, e.g., English and Chinese. In this work, we propose a Multilingual Language Model with deep semantic Alignment (MLMA) to generate…

    Submitted 23 October, 2019; originally announced October 2019.

    Comments: Accepted at EMNLP 2019

  37. arXiv:1910.03295  [pdf, ps, other]

    cs.IR cs.AI cs.CY

    Conceptualize and Infer User Needs in E-commerce

    Authors: Xusheng Luo, Yonghua Yang, Kenny Q. Zhu, Yu Gong, Keping Yang

    Abstract: Understanding latent user needs beneath shopping behaviors is critical to e-commerce applications. Without a proper definition of user needs in e-commerce, most industry solutions are not driven directly by user needs at the current stage, which prevents them from further improving user satisfaction. Representing implicit user needs explicitly as nodes like "outdoor barbecue" or "keep warm for kids"…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 9 pages, 6 figures. Accepted by CIKM 2019 Applied Research Track

  38. arXiv:1905.07089  [pdf, other]

    cs.IR cs.LG

    Exact-K Recommendation via Maximal Clique Optimization

    Authors: Yu Gong, Yu Zhu, Lu Duan, Qingwen Liu, Ziyu Guan, Fei Sun, Wenwu Ou, Kenny Q. Zhu

    Abstract: This paper targets a novel but practical recommendation problem named exact-K recommendation. It is different from traditional top-K recommendation, as it focuses more on (constrained) combinatorial optimization which will optimize to recommend a whole set of K items called card, rather than ranking optimization which assumes that "better" items should be put into top positions. Thus we take th…

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: SIGKDD 2019

  39. arXiv:1808.06880  [pdf, other]

    cs.AI cs.CL

    Automatic Generation of Text Descriptive Comments for Code Blocks

    Authors: Yuding Liang, Kenny Q. Zhu

    Abstract: We propose a framework to automatically generate descriptive comments for source code blocks. While this problem has been studied by many researchers previously, their methods are mostly based on fixed templates and achieve poor results. Our framework does not rely on any template, but makes use of a new recursive neural network called Code-RNN to extract features from the source code and embed th…

    Submitted 21 August, 2018; originally announced August 2018.

    Comments: AAAI 2018

  40. arXiv:1803.11359  [pdf, other]

    cs.CL

    Automatic Generation of Chinese Short Product Titles for Mobile Display

    Authors: Yu Gong, Xusheng Luo, Kenny Q. Zhu, Wenwu Ou, Zhao Li, Lu Duan

    Abstract: This paper studies the problem of automatically extracting a short title from a manually written longer description of E-commerce products for display on mobile devices. It is a new extractive summarization problem on short text inputs, for which we propose a feature-enriched network model, combining three different categories of features in parallel. Experimental results show that our framework s…

    Submitted 6 May, 2019; v1 submitted 30 March, 2018; originally announced March 2018.

    Comments: IAAI 2019

  41. arXiv:1803.11326  [pdf, other]

    cs.CL

    Deep Cascade Multi-task Learning for Slot Filling in Online Shopping Assistant

    Authors: Yu Gong, Xusheng Luo, Yu Zhu, Wenwu Ou, Zhao Li, Muhua Zhu, Kenny Q. Zhu, Lu Duan, Xi Chen

    Abstract: Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt such models as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce where the slot labels are more informative and carry richer expressions. In this…

    Submitted 6 May, 2019; v1 submitted 29 March, 2018; originally announced March 2018.

    Comments: AAAI 2019

  42. arXiv:1803.00729  [pdf, other]

    cs.CL cs.AI

    Representing Verbs as Argument Concepts

    Authors: Yu Gong, Kaiqi Zhao, Kenny Q. Zhu

    Abstract: Verbs play an important role in the understanding of natural language text. This paper studies the problem of abstracting the subject and object arguments of a verb into a set of noun concepts, known as the "argument concepts". This set of concepts, whose size is parameterized, represents the fine-grained semantics of a verb. For example, the object of "enjoy" can be abstracted into time, hobby an…

    Submitted 2 March, 2018; originally announced March 2018.

    Comments: 7 pages, 2 figures, AAAI 2016

  43. arXiv:1711.04204  [pdf, other]

    cs.CL cs.AI

    Automatic Extraction of Commonsense LocatedNear Knowledge

    Authors: Frank F. Xu, Bill Yuchen Lin, Kenny Q. Zhu

    Abstract: The LocatedNear relation is a kind of commonsense knowledge describing two physical objects that are typically found near each other in real life. In this paper, we study how to automatically extract such relationships by applying a sentence-level relation classifier and aggregating the scores of entity pairs from a large corpus. Also, we release two benchmark datasets for evaluation and future research.

    Submitted 12 May, 2018; v1 submitted 11 November, 2017; originally announced November 2017.

    Comments: Accepted by ACL 2018. A preliminary version was presented at AKBC@NIPS'17

  44. arXiv:1707.02047  [pdf, ps, other]

    cs.DB

    InferSpark: Statistical Inference at Scale

    Authors: Zhuoyue Zhao, Jialing Pei, Eric Lo, Kenny Q. Zhu, Chris Liu

    Abstract: The Apache Spark stack has enabled fast large-scale data processing. Despite a rich library of statistical models and inference algorithms, it does not give domain users the ability to develop their own models. The emergence of probabilistic programming languages has shown the promise of developing sophisticated probabilistic models in a succinct and programmatic way. These frameworks have the po…

    Submitted 9 October, 2017; v1 submitted 7 July, 2017; originally announced July 2017.

    Comments: 13 pages, 22 figures

  45. arXiv:1211.2073  [pdf, ps, other]

    cs.LG cs.CE q-bio.QM stat.ML

    LAGE: A Java Framework to reconstruct Gene Regulatory Networks from Large-Scale Continues Expression Data

    Authors: Yang Lu, Mengying Wang, Kenny Q. Zhu, Bo Yuan

    Abstract: LAGE is a systematic framework developed in Java. The motivation of LAGE is to provide a scalable and parallel solution to reconstruct Gene Regulatory Networks (GRNs) from continuous gene expression data for a very large number of genes. The basic idea of our framework is motivated by the philosophy of divide-and-conquer. Specifically, LAGE recursively partitions genes into multiple overlapping commu…

    Submitted 9 November, 2012; originally announced November 2012.

    Comments: 2 pages