-
Semantic Deep Hiding for Robust Unlearnable Examples
Authors:
Ruohan Meng,
Chenyu Yi,
Yi Yu,
Siyuan Yang,
Bingquan Shen,
Alex C. Kot
Abstract:
Ensuring data privacy and protection has become paramount in the era of deep learning. Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration by adding small perturbations to data. However, such perturbations (e.g., noise, texture, color change) predominantly impact low-level features, making them vulnerable to common countermeasures. I…
▽ More
Ensuring data privacy and protection has become paramount in the era of deep learning. Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration by adding small perturbations to data. However, such perturbations (e.g., noise, texture, color change) predominantly impact low-level features, making them vulnerable to common countermeasures. In contrast, semantic images with intricate shapes have a wealth of high-level features, making them more resilient to countermeasures and potential for producing robust unlearnable examples. In this paper, we propose a Deep Hiding (DH) scheme that adaptively hides semantic images enriched with high-level features. We employ an Invertible Neural Network (INN) to invisibly integrate predefined images, inherently hiding them with deceptive perturbations. To enhance data unlearnability, we introduce a Latent Feature Concentration module, designed to work with the INN, regularizing the intra-class variance of these perturbations. To further boost the robustness of unlearnable examples, we design a Semantic Images Generation module that produces hidden semantic images. By utilizing similar semantic information, this module generates similar semantic images for samples within the same classes, thereby enlarging the inter-class distance and narrowing the intra-class distance. Extensive experiments on CIFAR-10, CIFAR-100, and an ImageNet subset, against 18 countermeasures, reveal that our proposed method exhibits outstanding robustness for unlearnable examples, demonstrating its efficacy in preventing unauthorized data exploitation.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Decoupled Marked Temporal Point Process using Neural Ordinary Differential Equations
Authors:
Yujee Song,
Donghyun Lee,
Rui Meng,
Won Hwa Kim
Abstract:
A Marked Temporal Point Process (MTPP) is a stochastic process whose realization is a set of event-time data. MTPP is often used to understand complex dynamics of asynchronous temporal events such as money transaction, social media, healthcare, etc. Recent studies have utilized deep neural networks to capture complex temporal dependencies of events and generate embedding that aptly represent the o…
▽ More
A Marked Temporal Point Process (MTPP) is a stochastic process whose realization is a set of event-time data. MTPP is often used to understand complex dynamics of asynchronous temporal events such as money transaction, social media, healthcare, etc. Recent studies have utilized deep neural networks to capture complex temporal dependencies of events and generate embedding that aptly represent the observed events. While most previous studies focus on the inter-event dependencies and their representations, how individual events influence the overall dynamics over time has been under-explored. In this regime, we propose a Decoupled MTPP framework that disentangles characterization of a stochastic process into a set of evolving influences from different events. Our approach employs Neural Ordinary Differential Equations (Neural ODEs) to learn flexible continuous dynamics of these influences while simultaneously addressing multiple inference problems, such as density estimation and survival rate computation. We emphasize the significance of disentangling the influences by comparing our framework with state-of-the-art methods on real-life datasets, and provide analysis on the model behavior for potential applications.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Leveraging partial stragglers within gradient coding
Authors:
Aditya Ramamoorthy,
Ruoyu Meng,
Vrinda S. Girimaji
Abstract:
Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss function $L$). However, in large-scale clusters, many workers are slower than their promised speed or even failure-prone. A gradient coding solution i…
▽ More
Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss function $L$). However, in large-scale clusters, many workers are slower than their promised speed or even failure-prone. A gradient coding solution introduces redundancy within the assignment of chunks to the workers and uses coding theoretic ideas to allow the PS to recover $\nabla L$ (exactly or approximately), even in the presence of stragglers. Unfortunately, most existing gradient coding protocols are inefficient from a computation perspective as they coarsely classify workers as operational or failed; the potentially valuable work performed by slow workers (partial stragglers) is ignored. In this work, we present novel gradient coding protocols that judiciously leverage the work performed by partial stragglers. Our protocols are efficient from a computation and communication perspective and numerically stable. For an important class of chunk assignments, we present efficient algorithms for optimizing the relative ordering of chunks within the workers; this ordering affects the overall execution time. For exact gradient reconstruction, our protocol is around $2\times$ faster than the original class of protocols and for approximate gradient reconstruction, the mean-squared-error of our reconstructed gradient is several orders of magnitude better.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs
Authors:
Qingyuan Li,
Ran Meng,
Yiduo Li,
Bo Zhang,
Yifan Lu,
Yerui Sun,
Lin Ma,
Yuchen Xie
Abstract:
We introduce Integer Scale, a novel post-training quantization scheme for large language models that effectively resolves the inference bottleneck in current fine-grained quantization approaches while maintaining similar accuracies. Integer Scale is a free lunch as it requires no extra calibration or fine-tuning which will otherwise incur additional costs. It can be used plug-and-play for most fin…
▽ More
We introduce Integer Scale, a novel post-training quantization scheme for large language models that effectively resolves the inference bottleneck in current fine-grained quantization approaches while maintaining similar accuracies. Integer Scale is a free lunch as it requires no extra calibration or fine-tuning which will otherwise incur additional costs. It can be used plug-and-play for most fine-grained quantization methods. Its integration results in at most 1.85x end-to-end speed boost over the original counterpart with comparable accuracy. Additionally, due to the orchestration of the proposed Integer Scale and fine-grained quantization, we resolved the quantization difficulty for Mixtral-8x7B and LLaMA-3 models with negligible performance degradation, and it comes with an end-to-end speed boost of 2.13x, and 2.31x compared with their FP16 versions respectively.
△ Less
Submitted 28 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
RAG-RLRC-LaySum at BioLaySumm: Integrating Retrieval-Augmented Generation and Readability Control for Layman Summarization of Biomedical Texts
Authors:
Yuelyu Ji,
Zhuochun Li,
Rui Meng,
Sonish Sivarajkumar,
Yanshan Wang,
Zeshui Yu,
Hui Ji,
Yushui Han,
Hanyu Zeng,
Daqing He
Abstract:
This paper introduces the RAG-RLRC-LaySum framework, designed to make complex biomedical research understandable to laymen through advanced Natural Language Processing (NLP) techniques. Our Retrieval Augmented Generation (RAG) solution, enhanced by a reranking method, utilizes multiple knowledge sources to ensure the precision and pertinence of lay summaries. Additionally, our Reinforcement Learni…
▽ More
This paper introduces the RAG-RLRC-LaySum framework, designed to make complex biomedical research understandable to laymen through advanced Natural Language Processing (NLP) techniques. Our Retrieval Augmented Generation (RAG) solution, enhanced by a reranking method, utilizes multiple knowledge sources to ensure the precision and pertinence of lay summaries. Additionally, our Reinforcement Learning for Readability Control (RLRC) strategy improves readability, making scientific content comprehensible to non-specialists. Evaluations using the publicly accessible PLOS and eLife datasets show that our methods surpass Plain Gemini model, demonstrating a 20% increase in readability scores, a 15% improvement in ROUGE-2 relevance scores, and a 10% enhancement in factual accuracy. The RAG-RLRC-LaySum framework effectively democratizes scientific knowledge, enhancing public engagement with biomedical discoveries.
△ Less
Submitted 24 June, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Optimizing Surgical Plans for Parenchyma-Sparing Liver Resections through Contour-Guided Resection and Surface Approximation
Authors:
Gabriella d'Albenzio,
Ruoyan Meng,
Davit Aghayan,
Egidijus Pelanis,
Rebecca Hisey,
Sarkis Drejian,
Åsmund Avdem Fretland,
Ole Jakob Elle,
Bjørn Edwin,
Rafael Palomar
Abstract:
Objective: This study introduces a novel method for defining virtual resections in liver cancer surgery, aimed at enhancing the adaptability of parenchyma-sparing resection (PSR) plans. By comparing these with traditional anatomical resection (AR) plans, we explore the potential for optimization in surgical planning. Methods: Leveraging contours and spline surface approximations directly from the…
▽ More
Objective: This study introduces a novel method for defining virtual resections in liver cancer surgery, aimed at enhancing the adaptability of parenchyma-sparing resection (PSR) plans. By comparing these with traditional anatomical resection (AR) plans, we explore the potential for optimization in surgical planning. Methods: Leveraging contours and spline surface approximations directly from the liver's surface, our method aligns closely with actual surgical procedures, offering a more realistic representation of curved resection paths. This technique, tested against 14 cases from the OSLO-COMET study, incorporates surface deformation for versatile plan modeling, comparing volumetric outcomes of PSR and AR. Results: The study highlights significant benefits of PSR over AR, including reduced resected volume ($32.71 \pm 13.80$ ml for PSR vs. $249.53 \pm 135.23$ ml for AR, $p <0.0001$) and higher remnant liver volume ($1922.77 \pm 442.86$ ml for PSR vs. $1716.87 \pm 403.00$ ml for AR, $p <0.0001$). PSR also showed a considerably higher remnant percentage ($98.16 \pm 0.81%$) compared to AR ($87.40 \pm 6.49%$, $p <0.0001$). Conclusion: The proposed approach is able to define virtual resections accommodating a wide variety of resections (i.e., PSR and AR). Careful surgical planning using virtual resections can optimize the resection strategy. Significance: This study presents a novel computer-aided planning system for liver surgery, demonstrating its efficacy and flexibility for definition of virtual resections. Virtual surgery planning can be used for optimization of resection strategies leading to increased preservation of healthy tissue.
△ Less
Submitted 8 April, 2024;
originally announced May 2024.
-
Program Environment Fuzzing
Authors:
Ruijie Meng,
Gregory J. Duck,
Abhik Roychoudhury
Abstract:
Computer programs are not executed in isolation, but rather interact with the execution environment which drives the program behaviours. Software validation and verification methods, such as greybox fuzzing, thus need to capture the effect of possibly complex environmental interactions, including files, databases, configurations, network sockets, human-user interactions, and more. Conventional app…
▽ More
Computer programs are not executed in isolation, but rather interact with the execution environment which drives the program behaviours. Software validation and verification methods, such as greybox fuzzing, thus need to capture the effect of possibly complex environmental interactions, including files, databases, configurations, network sockets, human-user interactions, and more. Conventional approaches for environment capture in symbolic execution and model checking employ environment modelling, which involves manual effort. In this paper, we take a different approach based on an extension of greybox fuzzing. Given a program, we first record all observed environmental interactions at the kernel/user-mode boundary in the form of system calls. Next, we replay the program under the original recorded interactions, but this time with selective mutations applied, in order to get the effect of different program environments -- all without environment modelling. Via repeated (feedback-driven) mutations over a fuzzing campaign, we can search for program environments that induce crashing behaviour. Our EFuzz tool found 33 zero-day bugs in well-known real-world protocol implementations and GUI applications. Many of these are security vulnerabilities and 14 CVEs were assigned.
△ Less
Submitted 20 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Steganographic Passport: An Owner and User Verifiable Credential for Deep Model IP Protection Without Retraining
Authors:
Qi Cui,
Ruohan Meng,
Chaohui Xu,
Chip-Hong Chang
Abstract:
Ensuring the legal usage of deep models is crucial to promoting trustable, accountable, and responsible artificial intelligence innovation. Current passport-based methods that obfuscate model functionality for license-to-use and ownership verifications suffer from capacity and quality constraints, as they require retraining the owner model for new users. They are also vulnerable to advanced Expand…
▽ More
Ensuring the legal usage of deep models is crucial to promoting trustable, accountable, and responsible artificial intelligence innovation. Current passport-based methods that obfuscate model functionality for license-to-use and ownership verifications suffer from capacity and quality constraints, as they require retraining the owner model for new users. They are also vulnerable to advanced Expanded Residual Block ambiguity attacks. We propose Steganographic Passport, which uses an invertible steganographic network to decouple license-to-use from ownership verification by hiding the user's identity images into the owner-side passport and recovering them from their respective user-side passports. An irreversible and collision-resistant hash function is used to avoid exposing the owner-side passport from the derived user-side passports and increase the uniqueness of the model signature. To safeguard both the passport and model's weights against advanced ambiguity attacks, an activation-level obfuscation is proposed for the verification branch of the owner's model. By jointly training the verification and deployment branches, their weights become tightly coupled. The proposed method supports agile licensing of deep models by providing a strong ownership proof and license accountability without requiring a separate model retraining for the admission of every new user. Experiment results show that our Steganographic Passport outperforms other passport-based deep model protection methods in robustness against various known attacks.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Quantum advantage in zero-error function computation with side information
Authors:
Ruoyu Meng,
Aditya Ramamoorthy
Abstract:
We consider the problem of zero-error function computation with side information. Alice has a source $X$ and Bob has correlated source $Y$ and they can communicate via either classical or a quantum channel. Bob wants to calculate $f(X,Y)$ with zero error. We aim to characterize the minimum amount of information that Alice needs to send to Bob for this to happen with zero-error. In the classical se…
▽ More
We consider the problem of zero-error function computation with side information. Alice has a source $X$ and Bob has correlated source $Y$ and they can communicate via either classical or a quantum channel. Bob wants to calculate $f(X,Y)$ with zero error. We aim to characterize the minimum amount of information that Alice needs to send to Bob for this to happen with zero-error. In the classical setting, this quantity depends on the asymptotic growth of $χ(G^{(m)})$, the chromatic number of an appropriately defined $m$-instance "confusion graph". In this work we present structural characterizations of $G^{(m)}$ and demonstrate two function computation scenarios that have the same single-instance confusion graph. However, in one case there a strict advantage in using quantum transmission as against classical transmission, whereas there is no such advantage in the other case.
△ Less
Submitted 4 March, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding
Authors:
Lifu Tu,
Semih Yavuz,
Jin Qu,
Jiacheng Xu,
Rui Meng,
Caiming Xiong,
Yingbo Zhou
Abstract:
Large Language Models (LLMs) have demonstrated a powerful ability for text generation. However, achieving optimal results with a given prompt or instruction can be challenging, especially for billion-sized models. Additionally, undesired behaviors such as toxicity or hallucinations can manifest. While much larger models (e.g., ChatGPT) may demonstrate strength in mitigating these issues, there is…
▽ More
Large Language Models (LLMs) have demonstrated a powerful ability for text generation. However, achieving optimal results with a given prompt or instruction can be challenging, especially for billion-sized models. Additionally, undesired behaviors such as toxicity or hallucinations can manifest. While much larger models (e.g., ChatGPT) may demonstrate strength in mitigating these issues, there is still no guarantee of complete prevention. In this work, we propose formalizing text generation as a future-constrained generation problem to minimize undesirable behaviors and enforce faithfulness to instructions. The estimation of future constraint satisfaction, accomplished using LLMs, guides the text generation process. Our extensive experiments demonstrate the effectiveness of the proposed approach across three distinct text generation tasks: keyword-constrained generation (Lin et al., 2020), toxicity reduction (Gehman et al., 2020), and factual correctness in question-answering (Gao et al., 2023).
△ Less
Submitted 25 June, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
A Speed Odyssey for Deployable Quantization of LLMs
Authors:
Qingyuan Li,
Ran Meng,
Yiduo Li,
Bo Zhang,
Liang Li,
Yifan Lu,
Xiangxiang Chu,
Yerui Sun,
Yuchen Xie
Abstract:
The large language model era urges faster and less costly inference. Prior model compression works on LLMs tend to undertake a software-centric approach primarily focused on the simulated quantization performance. By neglecting the feasibility of deployment, these approaches are typically disabled in real practice. They used to drastically push down the quantization bit range for a reduced computa…
▽ More
The large language model era urges faster and less costly inference. Prior model compression works on LLMs tend to undertake a software-centric approach primarily focused on the simulated quantization performance. By neglecting the feasibility of deployment, these approaches are typically disabled in real practice. They used to drastically push down the quantization bit range for a reduced computation which might not be supported by the mainstream hardware, or involve sophisticated algorithms that introduce extra computation or memory access overhead. We argue that pursuing a hardware-centric approach in the construction of quantization algorithms is crucial. In this regard, we are driven to build our compression method on top of hardware awareness, eliminating impractical algorithm choices while maximizing the benefit of hardware acceleration. Our method, OdysseyLLM, comes with a novel W4A8 kernel implementation called FastGEMM and a combined recipe of quantization strategies. Extensive experiments manifest the superiority of our W4A8 method which brings the actual speed boosting up to \textbf{4$\times$} compared to Hugging Face FP16 inference and \textbf{2.23$\times$} vs. the state-of-the-art inference engine TensorRT-LLM in FP16, and \textbf{1.45$\times$} vs. TensorRT-LLM in INT8, yet without substantially harming the performance.
△ Less
Submitted 15 November, 2023;
originally announced November 2023.
-
Investigating Answerability of LLMs for Long-Form Question Answering
Authors:
Meghana Moorthy Bhat,
Rui Meng,
Ye Liu,
Yingbo Zhou,
Semih Yavuz
Abstract:
As we embark on a new era of LLMs, it becomes increasingly crucial to understand their capabilities, limitations, and differences. Toward making further progress in this direction, we strive to build a deeper understanding of the gaps between massive LLMs (e.g., ChatGPT) and smaller yet effective open-source LLMs and their distilled counterparts. To this end, we specifically focus on long-form que…
▽ More
As we embark on a new era of LLMs, it becomes increasingly crucial to understand their capabilities, limitations, and differences. Toward making further progress in this direction, we strive to build a deeper understanding of the gaps between massive LLMs (e.g., ChatGPT) and smaller yet effective open-source LLMs and their distilled counterparts. To this end, we specifically focus on long-form question answering (LFQA) because it has several practical and impactful applications (e.g., troubleshooting, customer service, etc.) yet is still understudied and challenging for LLMs. We propose a question-generation method from abstractive summaries and show that generating follow-up questions from summaries of long documents can create a challenging setting for LLMs to reason and infer from long contexts. Our experimental results confirm that: (1) our proposed method of generating questions from abstractive summaries pose a challenging setup for LLMs and shows performance gaps between LLMs like ChatGPT and open-source LLMs (Alpaca, Llama) (2) open-source LLMs exhibit decreased reliance on context for generated questions from the original document, but their generation capabilities drop significantly on generated questions from summaries -- especially for longer contexts (>1024 tokens)
△ Less
Submitted 15 September, 2023;
originally announced September 2023.
-
XGen-7B Technical Report
Authors:
Erik Nijkamp,
Tian Xie,
Hiroaki Hayashi,
Bo Pang,
Congying Xia,
Chen Xing,
Jesse Vig,
Semih Yavuz,
Philippe Laban,
Ben Krause,
Senthil Purushwalkam,
Tong Niu,
Wojciech Kryściński,
Lidiya Murakhovs'ka,
Prafulla Kumar Choubey,
Alex Fabbri,
Ye Liu,
Rui Meng,
Lifu Tu,
Meghana Bhat,
Chien-Sheng Wu,
Silvio Savarese,
Yingbo Zhou,
Shafiq Joty,
Caiming Xiong
Abstract:
Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific progress. Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many t…
▽ More
Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific progress. Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context. To address this, we have trained XGen, a series of 7B parameter models on up to 8K sequence length for up to 1.5T tokens. We have also finetuned the XGen models on public-domain instructional data, creating their instruction-tuned counterparts (XGen-Inst). We open-source our models for both research advancements and commercial applications. Our evaluation on standard benchmarks shows that XGen models achieve comparable or better results when compared with state-of-the-art open-source LLMs. Our targeted evaluation on long sequence modeling tasks shows the benefits of our 8K-sequence models over 2K-sequence open-source LLMs.
△ Less
Submitted 6 September, 2023;
originally announced September 2023.
-
Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs
Authors:
Ye Liu,
Semih Yavuz,
Rui Meng,
Meghana Moorthy,
Shafiq Joty,
Caiming Xiong,
Yingbo Zhou
Abstract:
The integration of retrieved passages and large language models (LLMs), such as ChatGPTs, has significantly contributed to improving open-domain question answering. However, there is still a lack of exploration regarding the optimal approach for incorporating retrieved passages into the answer generation process. This paper aims to fill this gap by investigating different methods of combining retr…
▽ More
The integration of retrieved passages and large language models (LLMs), such as ChatGPTs, has significantly contributed to improving open-domain question answering. However, there is still a lack of exploration regarding the optimal approach for incorporating retrieved passages into the answer generation process. This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation. We begin by examining the limitations of a commonly-used concatenation approach. Surprisingly, this approach often results in generating "unknown" outputs, even when the correct document is among the top-k retrieved passages. To address this issue, we explore four alternative strategies for integrating the retrieved passages with the LLMs. These strategies include two single-round methods that utilize chain-of-thought reasoning and two multi-round strategies that incorporate feedback loops. Through comprehensive analyses and experiments, we provide insightful observations on how to effectively leverage retrieved passages to enhance the answer generation capability of LLMs.
△ Less
Submitted 7 April, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System
Authors:
Jianguo Zhang,
Stephen Roller,
Kun Qian,
Zhiwei Liu,
Rui Meng,
Shelby Heinecke,
Huan Wang,
Silvio Savarese,
Caiming Xiong
Abstract:
End-to-end task-oriented dialogue (TOD) systems have achieved promising performance by leveraging sophisticated natural language understanding and natural language generation capabilities of pre-trained models. This work enables the TOD systems with more flexibility through a simple cache. The cache provides the flexibility to dynamically update the TOD systems and handle both existing and unseen…
▽ More
End-to-end task-oriented dialogue (TOD) systems have achieved promising performance by leveraging sophisticated natural language understanding and natural language generation capabilities of pre-trained models. This work enables the TOD systems with more flexibility through a simple cache. The cache provides the flexibility to dynamically update the TOD systems and handle both existing and unseen dialogue scenarios. Towards this end, we first fine-tune a retrieval module to effectively retrieve the most relevant information entries from the cache. We then train end-to-end TOD models that can refer to and ground on both dialogue history and retrieved information during TOD generation. The cache is straightforward to construct, and the backbone models of TOD systems are compatible with existing pre-trained generative models. Extensive experiments demonstrate the superior performance of our framework, with a notable improvement in non-empty joint goal accuracy by 6.7% compared to strong baselines.
△ Less
Submitted 16 August, 2023;
originally announced August 2023.
-
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Authors:
Jianguo Zhang,
Kun Qian,
Zhiwei Liu,
Shelby Heinecke,
Rui Meng,
Ye Liu,
Zhou Yu,
Huan Wang,
Silvio Savarese,
Caiming Xiong
Abstract:
Despite advancements in conversational AI, language models encounter challenges to handle diverse conversational tasks, and existing dialogue dataset collections often lack diversity and comprehensiveness. To tackle these issues, we introduce DialogStudio: the largest and most diverse collection of dialogue datasets, unified under a consistent format while preserving their original information. Ou…
▽ More
Despite advancements in conversational AI, language models encounter challenges to handle diverse conversational tasks, and existing dialogue dataset collections often lack diversity and comprehensiveness. To tackle these issues, we introduce DialogStudio: the largest and most diverse collection of dialogue datasets, unified under a consistent format while preserving their original information. Our collection encompasses data from open-domain dialogues, task-oriented dialogues, natural language understanding, conversational recommendation, dialogue summarization, and knowledge-grounded dialogues, making it an incredibly rich and diverse resource for dialogue research and model training. To further enhance the utility of DialogStudio, we identify the licenses for each dataset, design external knowledge and domain-aware prompts for selected dialogues to facilitate instruction-aware fine-tuning. Furthermore, we develop conversational AI models using the dataset collection, and our experiments in both zero-shot and few-shot learning scenarios demonstrate the superiority of DialogStudio. To improve transparency and support dataset and task-based research, as well as language model pre-training, all datasets, licenses, codes, and models associated with DialogStudio are made publicly accessible\footnote{\url{https://github.com/salesforce/DialogStudio}}.
△ Less
Submitted 5 February, 2024; v1 submitted 19 July, 2023;
originally announced July 2023.
-
HPE:Answering Complex Questions over Text by Hybrid Question Parsing and Execution
Authors:
Ye Liu,
Semih Yavuz,
Rui Meng,
Dragomir Radev,
Caiming Xiong,
Yingbo Zhou
Abstract:
The dominant paradigm of textual question answering systems is based on end-to-end neural networks, which excels at answering natural language questions but falls short on complex ones. This stands in contrast to the broad adaptation of semantic parsing approaches over structured data sources (e.g., relational database, knowledge graphs), that convert natural language questions to logical forms an…
▽ More
The dominant paradigm of textual question answering systems is based on end-to-end neural networks, which excels at answering natural language questions but falls short on complex ones. This stands in contrast to the broad adaptation of semantic parsing approaches over structured data sources (e.g., relational database, knowledge graphs), that convert natural language questions to logical forms and execute them with query engines. Towards combining the strengths of neural and symbolic methods, we propose a framework of question parsing and execution on textual QA. It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question. Hence, the proposed framework can be viewed as a top-down question parsing followed by a bottom-up answer backtracking. The resulting H-expressions closely guide the execution process, offering higher precision besides better interpretability while still preserving the advantages of the neural readers for resolving its primitive elements. Our extensive experiments on MuSiQue, 2WikiQA, HotpotQA, and NQ show that the proposed parsing and hybrid execution framework outperforms existing approaches in supervised, few-shot, and zero-shot settings, while also effectively exposing its underlying reasoning process.
△ Less
Submitted 5 January, 2024; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Communication complexity of entanglement assisted multi-party computation
Authors:
Ruoyu Meng,
Aditya Ramamoorthy
Abstract:
We consider a quantum and classical version multi-party function computation problem with $n$ players, where players $2, \dots, n$ need to communicate appropriate information to player 1, so that a "generalized" inner product function with an appropriate promise can be calculated. The communication complexity of a protocol is the total number of bits that need to be communicated. When $n$ is prime…
▽ More
We consider a quantum and classical version multi-party function computation problem with $n$ players, where players $2, \dots, n$ need to communicate appropriate information to player 1, so that a "generalized" inner product function with an appropriate promise can be calculated. The communication complexity of a protocol is the total number of bits that need to be communicated. When $n$ is prime and for our chosen function, we exhibit a quantum protocol (with complexity $(n-1) \log n$ bits) and a classical protocol (with complexity $(n-1)^2 (\log n^2$) bits). In the quantum protocol, the players have access to entangled qudits but the communication is still classical. Furthermore, we present an integer linear programming formulation for determining a lower bound on the classical communication complexity. This demonstrates that our quantum protocol is strictly better than classical protocols.
△ Less
Submitted 5 February, 2024; v1 submitted 7 May, 2023;
originally announced May 2023.
-
Greybox Fuzzing of Distributed Systems
Authors:
Ruijie Meng,
George Pîrlea,
Abhik Roychoudhury,
Ilya Sergey
Abstract:
Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state-of-practice is represented today by only black-box tools that do n…
▽ More
Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state-of-practice is represented today by only black-box tools that do not attempt to infer and exploit any knowledge of the system's past behaviours to guide the search for bugs.
In this work, we present Mallory: the first framework for grey-box fuzz-testing of distributed systems. Unlike popular black-box distributed system fuzzers, such as Jepsen, that search for bugs by randomly injecting network partitions and node faults or by following human-defined schedules, Mallory is adaptive. It exercises a novel metric to learn how to maximize the number of observed system behaviors by choosing different sequences of faults, thus increasing the likelihood of finding new bugs. Our approach relies on timeline-driven testing. Mallory dynamically constructs Lamport timelines of the system behaviour and further abstracts these timelines into happens-before summaries, which serve as a feedback function guiding the fuzz campaign. Subsequently, Mallory reactively learns a policy using Q-learning, enabling it to introduce faults guided by its real-time observation of the summaries.
We have evaluated Mallory on a diverse set of widely-used industrial distributed systems. Compared to the start-of-the-art black-box fuzzer Jepsen, Mallory explores more behaviours and takes less time to find bugs. Mallory discovered 22 zero-day bugs (of which 18 were confirmed by developers), including 10 new vulnerabilities, in rigorously-tested distributed systems such as Braft, Dqlite, and Redis. 6 new CVEs have been assigned.
△ Less
Submitted 12 August, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Mode
Authors:
Ruan Lu,
Zhou HangCheng,
Ran Meng,
Zhao Jin,
Qin JiaoYu,
Wei Feng,
Wang ChenZi
Abstract:
The challenges faced by text classification with large tag systems in natural language processing tasks include multiple tag systems, uneven data distribution, and high noise. To address these problems, the ESimCSE unsupervised comparative learning and UDA semi-supervised comparative learning models are combined through the use of joint training techniques in the models.The ESimCSE model efficient…
▽ More
The challenges faced by text classification with large tag systems in natural language processing tasks include multiple tag systems, uneven data distribution, and high noise. To address these problems, the ESimCSE unsupervised comparative learning and UDA semi-supervised comparative learning models are combined through the use of joint training techniques in the models.The ESimCSE model efficiently learns text vector representations using unlabeled data to achieve better classification results, while UDA is trained using unlabeled data through semi-supervised learning methods to improve the prediction performance of the models and stability, and further improve the generalization ability of the model. In addition, adversarial training techniques FGM and PGD are used in the model training process to improve the robustness and reliability of the model. The experimental results show that there is an 8% and 10% accuracy improvement relative to Baseline on the public dataset Ruesters as well as on the operational dataset, respectively, and a 15% improvement in manual validation accuracy can be achieved on the operational dataset, indicating that the method is effective.
△ Less
Submitted 18 April, 2023;
originally announced April 2023.
-
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Authors:
Ning Yu,
Chia-Chih Chen,
Zeyuan Chen,
Rui Meng,
Gang Wu,
Paul Josel,
Juan Carlos Niebles,
Caiming Xiong,
Ran Xu
Abstract:
Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Generative models emerge to make design automation scalable but it remains non-trivial to produce designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground conte…
▽ More
Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Generative models emerge to make design automation scalable but it remains non-trivial to produce designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground content. We propose LayoutDETR that inherits the high quality and realism from generative modeling, while reformulating content-aware requirements as a detection problem: we learn to detect in a background image the reasonable locations, scales, and spatial relations for multimodal foreground elements in a layout. Our solution sets a new state-of-the-art performance for layout generation on public benchmarks and on our newly-curated ad banner dataset. We integrate our solution into a graphical system that facilitates user studies, and show that users prefer our designs over baselines by significant margins. Our code, models, dataset, graphical system, and demos are available at https://github.com/salesforce/LayoutDETR.
△ Less
Submitted 24 March, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
AugTriever: Unsupervised Dense Retrieval by Scalable Data Augmentation
Authors:
Rui Meng,
Ye Liu,
Semih Yavuz,
Divyansh Agarwal,
Lifu Tu,
Ning Yu,
Jianguo Zhang,
Meghana Bhat,
Yingbo Zhou
Abstract:
Dense retrievers have made significant strides in text retrieval and open-domain question answering, even though most achievements were made possible only with large amounts of human supervision. In this work, we aim to develop unsupervised methods by proposing two methods that create pseudo query-document pairs and train dense retrieval models in an annotation-free and scalable manner: query extr…
▽ More
Dense retrievers have made significant strides in text retrieval and open-domain question answering, even though most achievements were made possible only with large amounts of human supervision. In this work, we aim to develop unsupervised methods by proposing two methods that create pseudo query-document pairs and train dense retrieval models in an annotation-free and scalable manner: query extraction and transferred query generation. The former method produces pseudo queries by selecting salient spans from the original document. The latter utilizes generation models trained for other NLP tasks (e.g., summarization) to produce pseudo queries. Extensive experiments show that models trained with the proposed augmentation methods can perform comparably well (or better) to multiple strong baselines. Combining those strategies leads to further improvements, achieving the state-of-the-art performance of unsupervised dense retrieval on both BEIR and ODQA datasets.
△ Less
Submitted 7 March, 2023; v1 submitted 17 December, 2022;
originally announced December 2022.
-
Traceable and Authenticable Image Tagging for Fake News Detection
Authors:
Ruohan Meng,
Zhili Zhou,
Qi Cui,
Kwok-Yan Lam,
Alex Kot
Abstract:
To prevent fake news images from misleading the public, it is desirable not only to verify the authenticity of news images but also to trace the source of fake news, so as to provide a complete forensic chain for reliable fake news detection. To simultaneously achieve the goals of authenticity verification and source tracing, we propose a traceable and authenticable image tagging approach that is…
▽ More
To prevent fake news images from misleading the public, it is desirable not only to verify the authenticity of news images but also to trace the source of fake news, so as to provide a complete forensic chain for reliable fake news detection. To simultaneously achieve the goals of authenticity verification and source tracing, we propose a traceable and authenticable image tagging approach that is based on a design of Decoupled Invertible Neural Network (DINN). The designed DINN can simultaneously embed the dual-tags, \textit{i.e.}, authenticable tag and traceable tag, into each news image before publishing, and then separately extract them for authenticity verification and source tracing. Moreover, to improve the accuracy of dual-tags extraction, we design a parallel Feature Aware Projection Model (FAPM) to help the DINN preserve essential tag information. In addition, we define a Distance Metric-Guided Module (DMGM) that learns asymmetric one-class representations to enable the dual-tags to achieve different robustness performances under malicious manipulations. Extensive experiments, on diverse datasets and unseen manipulations, demonstrate that the proposed tagging approach achieves excellent performance in the aspects of both authenticity verification and source tracing for reliable fake news detection and outperforms the prior works.
△ Less
Submitted 20 November, 2022;
originally announced November 2022.
-
Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database
Authors:
Ye Liu,
Semih Yavuz,
Rui Meng,
Dragomir Radev,
Caiming Xiong,
Yingbo Zhou
Abstract:
Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB). However, existing approaches on semantic parsing cannot adapt to both modalities, as they suffer from the exponential growth of the logical form candidates and can hardly generalize to unseen data. In thi…
▽ More
Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB). However, existing approaches on semantic parsing cannot adapt to both modalities, as they suffer from the exponential growth of the logical form candidates and can hardly generalize to unseen data. In this work, we propose Uni-Parser, a unified semantic parser for question answering (QA) on both KB and DB. We introduce the primitive (relation and entity in KB, and table name, column name and cell value in DB) as an essential element in our framework. The number of primitives grows linearly with the number of retrieved relations in KB and DB, preventing us from dealing with exponential logic form candidates. We leverage the generator to predict final logical forms by altering and composing topranked primitives with different operations (e.g. select, where, count). With sufficiently pruned search space by a contrastive primitive ranker, the generator is empowered to capture the composition of primitives enhancing its generalization ability. We achieve competitive results on multiple KB and DB QA benchmarks more efficiently, especially in the compositional and zero-shot settings.
△ Less
Submitted 9 November, 2022;
originally announced November 2022.
-
Attention Diversification for Domain Generalization
Authors:
Rang Meng,
Xianfeng Li,
Weijie Chen,
Shicai Yang,
Jie Song,
Xinchao Wang,
Lei Zhang,
Mingli Song,
Di Xie,
Shiliang Pu
Abstract:
Convolutional neural networks (CNNs) have demonstrated gratifying results at learning discriminative features. However, when applied to unseen domains, state-of-the-art models are usually prone to errors due to domain shift. After investigating this issue from the perspective of shortcut learning, we find the devils lie in the fact that models trained on different domains merely bias to different…
▽ More
Convolutional neural networks (CNNs) have demonstrated gratifying results at learning discriminative features. However, when applied to unseen domains, state-of-the-art models are usually prone to errors due to domain shift. After investigating this issue from the perspective of shortcut learning, we find the devils lie in the fact that models trained on different domains merely bias to different domain-specific features yet overlook diverse task-related features. Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization are collaborated to reassign appropriate attention to diverse task-related features. Briefly, Intra-Model Attention Diversification Regularization is equipped on the high-level feature maps to achieve in-channel discrimination and cross-channel diversification via forcing different channels to pay their most salient attention to different spatial locations. Besides, Inter-Model Attention Diversification Regularization is proposed to further provide task-related attention diversification and domain-related attention suppression, which is a paradigm of "simulate, divide and assemble": simulate domain shift via exploiting multiple domain-specific models, divide attention maps into task-related and domain-related groups, and assemble them within each group respectively to execute regularization. Extensive experiments and analyses are conducted on various benchmarks to demonstrate that our method achieves state-of-the-art performance over other competing methods. Code is available at https://github.com/hikvision-research/DomainGeneralization.
△ Less
Submitted 9 October, 2022;
originally announced October 2022.
-
Continual Learning for Steganalysis
Authors:
Zihao Yin,
Ruohan Meng,
Zhili Zhou
Abstract:
To detect the existing steganographic algorithms, recent steganalysis methods usually train a Convolutional Neural Network (CNN) model on the dataset consisting of corresponding paired cover/stego-images. However, it is inefficient and impractical for those steganalysis tools to completely retrain the CNN model to make it effective against both the existing steganographic algorithms and a new emer…
▽ More
To detect the existing steganographic algorithms, recent steganalysis methods usually train a Convolutional Neural Network (CNN) model on the dataset consisting of corresponding paired cover/stego-images. However, it is inefficient and impractical for those steganalysis tools to completely retrain the CNN model to make it effective against both the existing steganographic algorithms and a new emerging steganographic algorithm. Thus, existing steganalysis models usually lack dynamic extensibility for new steganographic algorithms, which limits their application in real-world scenarios. To address this issue, we propose an accurate parameter importance estimation (APIE) based-continual learning scheme for steganalysis. In this scheme, when a steganalysis model is trained on the new image dataset generated by the new steganographic algorithm, its network parameters are effectively and efficiently updated with sufficient consideration of their importance evaluated in the previous training process. This approach can guide the steganalysis model to learn the patterns of the new steganographic algorithm without significantly degrading the detectability against the previous steganographic algorithms. Experimental results demonstrate the proposed scheme has promising extensibility for new emerging steganographic algorithms.
△ Less
Submitted 3 September, 2022;
originally announced September 2022.
-
3D Path Planning and Obstacle Avoidance Algorithms for Obstacle-Overcoming Robots
Authors:
Yuanhao huang,
Shi Huang,
Hao Wang,
Ruifeng Meng
Abstract:
This article introduces a multimodal motion planning (MMP) algorithm that combines three-dimensional (3-D) path planning and a DWA obstacle avoidance algorithm. The algorithms aim to plan the path and motion of obstacle-overcoming robots in complex unstructured scenes. A novel A-star algorithm is proposed to combine the characteristics of unstructured scenes and a strategy to switch it into a gree…
▽ More
This article introduces a multimodal motion planning (MMP) algorithm that combines three-dimensional (3-D) path planning and a DWA obstacle avoidance algorithm. The algorithms aim to plan the path and motion of obstacle-overcoming robots in complex unstructured scenes. A novel A-star algorithm is proposed to combine the characteristics of unstructured scenes and a strategy to switch it into a greedy best-first strategy algorithm. Meanwhile, the algorithm of path planning is integrated with the DWA algorithm so that the robot can perform local dynamic obstacle avoidance during the movement along the global planned path. Furthermore, when the proposed global path planning algorithm combines with the local obstacle avoidance algorithm, the robot can correct the path after obstacle avoidance and obstacle overcoming. The simulation experiments in a factory with several complex environments verified the feasibility and robustness of the algorithms. The algorithms can quickly generate a reasonable 3-D path for obstacle-overcoming robots and perform reliable local obstacle avoidance under the premise of considering the characteristics of the scene and motion obstacles.
△ Less
Submitted 2 September, 2022;
originally announced September 2022.
-
General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation
Authors:
Rui Meng,
Tong Wang,
Xingdi Yuan,
Yingbo Zhou,
Daqing He
Abstract:
Training keyphrase generation (KPG) models require a large amount of annotated data, which can be prohibitively expensive and often limited to specific domains. In this study, we first demonstrate that large distribution shifts among different domains severely hinder the transferability of KPG models. We then propose a three-stage pipeline, which gradually guides KPG models' learning focus from ge…
▽ More
Training keyphrase generation (KPG) models require a large amount of annotated data, which can be prohibitively expensive and often limited to specific domains. In this study, we first demonstrate that large distribution shifts among different domains severely hinder the transferability of KPG models. We then propose a three-stage pipeline, which gradually guides KPG models' learning focus from general syntactical features to domain-related semantics, in a data-efficient manner. With Domain-general Phrase pre-training, we pre-train Sequence-to-Sequence models with generic phrase annotations that are widely available on the web, which enables the models to generate phrases in a wide range of domains. The resulting model is then applied in the Transfer Labeling stage to produce domain-specific pseudo keyphrases, which help adapt models to a new domain. Finally, we fine-tune the model with limited data with true labels to fully adapt it to the target domain. Our experiment results show that the proposed process can produce good-quality keyphrases in new domains and achieve consistent improvements after adaptation with limited in-domain annotated data. All code and datasets are available at https://github.com/memray/OpenNMT-kpg-release.
△ Less
Submitted 7 May, 2023; v1 submitted 20 August, 2022;
originally announced August 2022.
-
Slimmable Domain Adaptation
Authors:
Rang Meng,
Weijie Chen,
Shicai Yang,
Jie Song,
Luojun Lin,
Di Xie,
Shiliang Pu,
Xinchao Wang,
Mingli Song,
Yueting Zhuang
Abstract:
Vanilla unsupervised domain adaptation methods tend to optimize the model with fixed neural architecture, which is not very practical in real-world scenarios since the target data is usually processed by different resource-limited devices. It is therefore of great necessity to facilitate architecture adaptation across various devices. In this paper, we introduce a simple framework, Slimmable Domai…
▽ More
Vanilla unsupervised domain adaptation methods tend to optimize the model with fixed neural architecture, which is not very practical in real-world scenarios since the target data is usually processed by different resource-limited devices. It is therefore of great necessity to facilitate architecture adaptation across various devices. In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs. The main challenge in this framework lies in simultaneously boosting the adaptation performance of numerous models in the model bank. To tackle this problem, we develop a Stochastic EnsEmble Distillation method to fully exploit the complementary knowledge in the model bank for inter-model interaction. Nevertheless, considering the optimization conflict between inter-model interaction and intra-model adaptation, we augment the existing bi-classifier domain confusion architecture into an Optimization-Separated Tri-Classifier counterpart. After optimizing the model bank, architecture adaptation is leveraged via our proposed Unsupervised Performance Evaluation Metric. Under various resource constraints, our framework surpasses other competing approaches by a very large margin on multiple benchmarks. It is also worth emphasizing that our framework can preserve the performance improvement against the source-only model even when the computing complexity is reduced to $1/64$. Code will be available at https://github.com/hikvision-research/SlimDA.
△ Less
Submitted 14 June, 2022;
originally announced June 2022.
-
Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training
Authors:
Yifan Gao,
Qingyu Yin,
Zheng Li,
Rui Meng,
Tong Zhao,
Bing Yin,
Irwin King,
Michael R. Lyu
Abstract:
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text. Despite its recent flourishing, keyphrase generation on non-English languages haven't been vastly investigated. In this paper, we call attention to a new setting named multilingual keyphrase generation and we contribute two new datasets, EcommerceMKP and AcademicMKP, covering six languages. Technica…
▽ More
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text. Despite its recent flourishing, keyphrase generation on non-English languages haven't been vastly investigated. In this paper, we call attention to a new setting named multilingual keyphrase generation and we contribute two new datasets, EcommerceMKP and AcademicMKP, covering six languages. Technically, we propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages. The retrieval-augmented model leverages keyphrase annotations in English datasets to facilitate generating keyphrases in low-resource languages. Given a non-English passage, a cross-lingual dense passage retrieval module finds relevant English passages. Then the associated English keyphrases serve as external knowledge for keyphrase generation in the current language. Moreover, we develop a retriever-generator iterative training algorithm to mine pseudo parallel passage pairs to strengthen the cross-lingual passage retriever. Comprehensive experiments and ablations show that the proposed approach outperforms all baselines.
△ Less
Submitted 1 June, 2022; v1 submitted 20 May, 2022;
originally announced May 2022.
-
Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking
Authors:
Tianyi Luo,
Rui Meng,
Xin Eric Wang,
Yang Liu
Abstract:
Research Replication Prediction (RRP) is the task of predicting whether a published research result can be replicated or not. Building an interpretable neural text classifier for RRP promotes the understanding of why a research paper is predicted as replicable or non-replicable and therefore makes its real-world application more reliable and trustworthy. However, the prior works on model interpret…
▽ More
Research Replication Prediction (RRP) is the task of predicting whether a published research result can be replicated or not. Building an interpretable neural text classifier for RRP promotes the understanding of why a research paper is predicted as replicable or non-replicable and therefore makes its real-world application more reliable and trustworthy. However, the prior works on model interpretation mainly focused on improving the model interpretability at the word/phrase level, which are insufficient especially for long research papers in RRP. Furthermore, the existing methods cannot utilize a large size of unlabeled dataset to further improve the model interpretability. To address these limitations, we aim to build an interpretable neural model which can provide sentence-level explanations and apply weakly supervised approach to further leverage the large corpus of unlabeled datasets to boost the interpretability in addition to improving prediction performance as existing works have done. In this work, we propose the Variational Contextual Consistency Sentence Masking (VCCSM) method to automatically extract key sentences based on the context in the classifier, using both labeled and unlabeled datasets. Results of our experiments on RRP along with European Convention of Human Rights (ECHR) datasets demonstrate that VCCSM is able to improve the model interpretability for the long document classification tasks using the area over the perturbation curve and post-hoc accuracy as evaluation metrics.
△ Less
Submitted 27 March, 2022;
originally announced March 2022.
-
Compressed Predictive Information Coding
Authors:
Rui Meng,
Tianyi Luo,
Kristofer Bouchard
Abstract:
Unsupervised learning plays an important role in many fields, such as artificial intelligence, machine learning, and neuroscience. Compared to static data, methods for extracting low-dimensional structure for dynamic data are lagging. We developed a novel information-theoretic framework, Compressed Predictive Information Coding (CPIC), to extract useful representations from dynamic data. CPIC sele…
▽ More
Unsupervised learning plays an important role in many fields, such as artificial intelligence, machine learning, and neuroscience. Compared to static data, methods for extracting low-dimensional structure for dynamic data are lagging. We developed a novel information-theoretic framework, Compressed Predictive Information Coding (CPIC), to extract useful representations from dynamic data. CPIC selectively projects the past (input) into a linear subspace that is predictive about the compressed data projected from the future (output). The key insight of our framework is to learn representations by minimizing the compression complexity and maximizing the predictive information in latent space. We derive variational bounds of the CPIC loss which induces the latent space to capture information that is maximally predictive. Our variational bounds are tractable by leveraging bounds of mutual information. We find that introducing stochasticity in the encoder robustly contributes to better representation. Furthermore, variational approaches perform better in mutual information estimation compared with estimates under a Gaussian assumption. We demonstrate that CPIC is able to recover the latent space of noisy dynamical systems with low signal-to-noise ratios, and extracts features predictive of exogenous variables in neuroscience data.
△ Less
Submitted 3 March, 2022;
originally announced March 2022.
-
EmotionBox: a music-element-driven emotional music generation system using Recurrent Neural Network
Authors:
Kaitong Zheng,
Ruijie Meng,
Chengshi Zheng,
Xiaodong Li,
Jinqiu Sang,
Juanjuan Cai,
Jie Wang
Abstract:
With the development of deep neural networks, automatic music composition has made great progress. Although emotional music can evoke listeners' different emotions and it is important for artistic expression, only few researches have focused on generating emotional music. This paper presents EmotionBox -an music-element-driven emotional music generator that is capable of composing music given a sp…
▽ More
With the development of deep neural networks, automatic music composition has made great progress. Although emotional music can evoke listeners' different emotions and it is important for artistic expression, only few researches have focused on generating emotional music. This paper presents EmotionBox -an music-element-driven emotional music generator that is capable of composing music given a specific emotion, where this model does not require a music dataset labeled with emotions. Instead, pitch histogram and note density are extracted as features that represent mode and tempo respectively to control music emotions. The subjective listening tests show that the Emotionbox has a more competitive and balanced performance in arousing a specified emotion than the emotion-label-based method.
△ Less
Submitted 15 December, 2021;
originally announced December 2021.
-
Linear-time Temporal Logic guided Greybox Fuzzing
Authors:
Ruijie Meng,
Zhen Dong,
Jialin Li,
Ivan Beschastnikh,
Abhik Roychoudhury
Abstract:
Software model checking is a verification technique which is widely used for checking temporal properties of software systems. Even though it is a property verification technique, its common usage in practice is in "bug finding", that is, finding violations of temporal properties. Motivated by this observation and leveraging the recent progress in fuzzing, we build a greybox fuzzing framework to f…
▽ More
Software model checking is a verification technique which is widely used for checking temporal properties of software systems. Even though it is a property verification technique, its common usage in practice is in "bug finding", that is, finding violations of temporal properties. Motivated by this observation and leveraging the recent progress in fuzzing, we build a greybox fuzzing framework to find violations of Linear-time Temporal Logic (LTL) properties.
Our framework takes as input a sequential program written in C/C++, and an LTL property. It finds violations, or counterexample traces, of the LTL property in stateful software systems; however, it does not achieve verification. Our work substantially extends directed greybox fuzzing to witness arbitrarily complex event orderings. We note that existing directed greybox fuzzing approaches are limited to witnessing reaching a location or witnessing simple event orderings like use-after-free. At the same time, compared to model checkers, our approach finds the counterexamples faster, thereby finding more counterexamples within a given time budget.
Our LTL-Fuzzer tool, built on top of the AFL fuzzer, is shown to be effective in detecting bugs in well-known protocol implementations, such as OpenSSL and Telnet. We use LTL-Fuzzer to reproduce known vulnerabilities (CVEs), to find 15 zero-day bugs by checking properties extracted from RFCs (for which 12 CVEs have been assigned), and to find violations of both safety as well as liveness properties in real-world protocol implementations. Our work represents a practical advance over software model checkers -- while simultaneously representing a conceptual advance over existing greybox fuzzers. Our work thus provides a starting point for understanding the unexplored synergies between software model checking and greybox fuzzing.
△ Less
Submitted 19 April, 2022; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Bayesian Inference in High-Dimensional Time-Serieswith the Orthogonal Stochastic Linear Mixing Model
Authors:
Rui Meng,
Kristofer Bouchard
Abstract:
Many modern time-series datasets contain large numbers of output response variables sampled for prolonged periods of time. For example, in neuroscience, the activities of 100s-1000's of neurons are recorded during behaviors and in response to sensory stimuli. Multi-output Gaussian process models leverage the nonparametric nature of Gaussian processes to capture structure across multiple outputs. H…
▽ More
Many modern time-series datasets contain large numbers of output response variables sampled for prolonged periods of time. For example, in neuroscience, the activities of 100s-1000's of neurons are recorded during behaviors and in response to sensory stimuli. Multi-output Gaussian process models leverage the nonparametric nature of Gaussian processes to capture structure across multiple outputs. However, this class of models typically assumes that the correlations between the output response variables are invariant in the input space. Stochastic linear mixing models (SLMM) assume the mixture coefficients depend on input, making them more flexible and effective to capture complex output dependence. However, currently, the inference for SLMMs is intractable for large datasets, making them inapplicable to several modern time-series problems. In this paper, we propose a new regression framework, the orthogonal stochastic linear mixing model (OSLMM) that introduces an orthogonal constraint amongst the mixing coefficients. This constraint reduces the computational burden of inference while retaining the capability to handle complex output dependence. We provide Markov chain Monte Carlo inference procedures for both SLMM and OSLMM and demonstrate superior model scalability and reduced prediction error of OSLMM compared with state-of-the-art methods on several real-world applications. In neurophysiology recordings, we use the inferred latent functions for compact visualization of population responses to auditory stimuli, and demonstrate superior results compared to a competing method (GPFA). Together, these results demonstrate that OSLMM will be useful for the analysis of diverse, large-scale time-series datasets.
△ Less
Submitted 12 March, 2022; v1 submitted 24 June, 2021;
originally announced June 2021.
-
Stochastic Collapsed Variational Inference for Structured Gaussian Process Regression Network
Authors:
Rui Meng,
Herbie Lee,
Kristofer Bouchard
Abstract:
This paper presents an efficient variational inference framework for deriving a family of structured gaussian process regression network (SGPRN) models. The key idea is to incorporate auxiliary inducing variables in latent functions and jointly treats both the distributions of the inducing variables and hyper-parameters as variational parameters. Then we propose structured variable distributions a…
▽ More
This paper presents an efficient variational inference framework for deriving a family of structured gaussian process regression network (SGPRN) models. The key idea is to incorporate auxiliary inducing variables in latent functions and jointly treats both the distributions of the inducing variables and hyper-parameters as variational parameters. Then we propose structured variable distributions and marginalize latent variables, which enables the decomposability of a tractable variational lower bound and leads to stochastic optimization. Our inference approach is able to model data in which outputs do not share a common input set with a computational complexity independent of the size of the inputs and outputs and thus easily handle datasets with missing values. We illustrate the performance of our method on synthetic data and real datasets and show that our model generally provides better imputation results on missing data than the state-of-the-art. We also provide a visualization approach for time-varying correlation across outputs in electrocoticography data and those estimates provide insight to understand the neural population dynamics.
△ Less
Submitted 17 November, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Bringing Structure into Summaries: a Faceted Summarization Dataset for Long Scientific Documents
Authors:
Rui Meng,
Khushboo Thaker,
Lei Zhang,
Yue Dong,
Xingdi Yuan,
Tong Wang,
Daqing He
Abstract:
Faceted summarization provides briefings of a document from different perspectives. Readers can quickly comprehend the main points of a long document with the help of a structured outline. However, little research has been conducted on this subject, partially due to the lack of large-scale faceted summarization datasets. In this study, we present FacetSum, a faceted summarization benchmark built o…
▽ More
Faceted summarization provides briefings of a document from different perspectives. Readers can quickly comprehend the main points of a long document with the help of a structured outline. However, little research has been conducted on this subject, partially due to the lack of large-scale faceted summarization datasets. In this study, we present FacetSum, a faceted summarization benchmark built on Emerald journal articles, covering a diverse range of domains. Different from traditional document-summary pairs, FacetSum provides multiple summaries, each targeted at specific sections of a long document, including the purpose, method, findings, and value. Analyses and empirical results on our dataset reveal the importance of bringing structure into summaries. We believe FacetSum will spur further advances in summarization research and foster the development of NLP systems that can leverage the structured information in both long texts and summaries.
△ Less
Submitted 22 June, 2021; v1 submitted 31 May, 2021;
originally announced June 2021.
-
Unsupervised Deep Keyphrase Generation
Authors:
Xianjie Shen,
Yinghan Wang,
Rui Meng,
Jingbo Shang
Abstract:
Keyphrase generation aims to summarize long documents with a collection of salient phrases. Deep neural models have demonstrated a remarkable success in this task, capable of predicting keyphrases that are even absent from a document. However, such abstractiveness is acquired at the expense of a substantial amount of annotated data. In this paper, we present a novel method for keyphrase generation…
▽ More
Keyphrase generation aims to summarize long documents with a collection of salient phrases. Deep neural models have demonstrated a remarkable success in this task, capable of predicting keyphrases that are even absent from a document. However, such abstractiveness is acquired at the expense of a substantial amount of annotated data. In this paper, we present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any human annotation. Motivated by the observation that an absent keyphrase in one document can appear in other places, in whole or in part, we first construct a phrase bank by pooling all phrases in a corpus. With this phrase bank, we then draw candidate absent keyphrases for each document through a partial matching process. To rank both types of candidates, we combine their lexical- and semantic-level similarities to the input document. Moreover, we utilize these top-ranked candidates as to train a deep generative model for more absent keyphrases. Extensive experiments demonstrate that AutoKeyGen outperforms all unsupervised baselines and can even beat strong supervised methods in certain cases.
△ Less
Submitted 18 April, 2021;
originally announced April 2021.
-
Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection
Authors:
Rongye Meng,
Sanping Zhou,
Xingyu Wan,
Mengliu Li,
Jinjun Wang
Abstract:
Due to the high annotation cost of large-scale facial landmark detection tasks in videos, a semi-supervised paradigm that uses self-training for mining high-quality pseudo-labels to participate in training has been proposed by researchers. However, self-training based methods often train with a gradually increasing number of samples, whose performances vary a lot depending on the number of pseudo-…
▽ More
Due to the high annotation cost of large-scale facial landmark detection tasks in videos, a semi-supervised paradigm that uses self-training for mining high-quality pseudo-labels to participate in training has been proposed by researchers. However, self-training based methods often train with a gradually increasing number of samples, whose performances vary a lot depending on the number of pseudo-labeled samples added.
In this paper, we propose a teacher-student asynchronous learning~(TSAL) framework based on the multi-source supervision signal consistency criterion, which implicitly mines pseudo-labels through consistency constraints. Specifically, the TSAL framework contains two models with exactly the same structure. The radical student uses multi-source supervision signals from the same task to update parameters, while the calm teacher uses a single-source supervision signal to update parameters. In order to reasonably absorb student's suggestions, teacher's parameters are updated again through recursive average filtering. The experimental results prove that asynchronous-learning framework can effectively filter noise in multi-source supervision signals, thereby mining the pseudo-labels which are more significant for network parameter updating. And extensive experiments on 300W, AFLW, and 300VW benchmarks show that the TSAL framework achieves state-of-the-art performance.
△ Less
Submitted 11 December, 2020;
originally announced December 2020.
-
Predicting User Engagement Status for Online Evaluation of Intelligent Assistants
Authors:
Rui Meng,
Zhen Yue,
Alyssa Glass
Abstract:
Evaluation of intelligent assistants in large-scale and online settings remains an open challenge. User behavior-based online evaluation metrics have demonstrated great effectiveness for monitoring large-scale web search and recommender systems. Therefore, we consider predicting user engagement status as the very first and critical step to online evaluation for intelligent assistants. In this work…
▽ More
Evaluation of intelligent assistants in large-scale and online settings remains an open challenge. User behavior-based online evaluation metrics have demonstrated great effectiveness for monitoring large-scale web search and recommender systems. Therefore, we consider predicting user engagement status as the very first and critical step to online evaluation for intelligent assistants. In this work, we first proposed a novel framework for classifying user engagement status into four categories -- fulfillment, continuation, reformulation and abandonment. We then demonstrated how to design simple but indicative metrics based on the framework to quantify user engagement levels. We also aim for automating user engagement prediction with machine learning methods. We compare various models and features for predicting engagement status using four real-world datasets. We conducted detailed analyses on features and failure cases to discuss the performance of current models as well as challenges.
△ Less
Submitted 31 May, 2021; v1 submitted 1 October, 2020;
originally announced October 2020.
-
An Empirical Study on Neural Keyphrase Generation
Authors:
Rui Meng,
Xingdi Yuan,
Tong Wang,
Sanqiang Zhao,
Adam Trischler,
Daqing He
Abstract:
Recent years have seen a flourishing of neural keyphrase generation (KPG) works, including the release of several large-scale datasets and a host of new models to tackle them. Model performance on KPG tasks has increased significantly with evolving deep learning research. However, there lacks a comprehensive comparison among different model designs, and a thorough investigation on related factors…
▽ More
Recent years have seen a flourishing of neural keyphrase generation (KPG) works, including the release of several large-scale datasets and a host of new models to tackle them. Model performance on KPG tasks has increased significantly with evolving deep learning research. However, there lacks a comprehensive comparison among different model designs, and a thorough investigation on related factors that may affect a KPG system's generalization performance. In this empirical study, we aim to fill this gap by providing extensive experimental results and analyzing the most crucial factors impacting the generalizability of KPG models. We hope this study can help clarify some of the uncertainties surrounding the KPG task and facilitate future research on this topic.
△ Less
Submitted 15 April, 2021; v1 submitted 21 September, 2020;
originally announced September 2020.
-
Spatiotemporal Attention for Multivariate Time Series Prediction and Interpretation
Authors:
Tryambak Gangopadhyay,
Sin Yong Tan,
Zhanhong Jiang,
Rui Meng,
Soumik Sarkar
Abstract:
Multivariate time series modeling and prediction problems are abundant in many machine learning application domains. Accurate interpretation of such prediction outcomes from a machine learning model that explicitly captures temporal correlations can significantly benefit the domain experts. In this context, temporal attention has been successfully applied to isolate the important time steps for th…
▽ More
Multivariate time series modeling and prediction problems are abundant in many machine learning application domains. Accurate interpretation of such prediction outcomes from a machine learning model that explicitly captures temporal correlations can significantly benefit the domain experts. In this context, temporal attention has been successfully applied to isolate the important time steps for the input time series. However, in multivariate time series problems, spatial interpretation is also critical to understand the contributions of different variables on the model outputs. We propose a novel deep learning architecture, called spatiotemporal attention mechanism (STAM) for simultaneous learning of the most important time steps and variables. STAM is a causal (i.e., only depends on past inputs and does not use future inputs) and scalable (i.e., scales well with an increase in the number of variables) approach that is comparable to the state-of-the-art models in terms of computational tractability. We demonstrate our models' performance on two popular public datasets and a domain-specific dataset. When compared with the baseline models, the results show that STAM maintains state-of-the-art prediction accuracy while offering the benefit of accurate spatiotemporal interpretability. The learned attention weights are validated from a domain knowledge perspective for these real-world datasets.
△ Less
Submitted 26 October, 2020; v1 submitted 11 August, 2020;
originally announced August 2020.
-
Neural Inheritance Relation Guided One-Shot Layer Assignment Search
Authors:
Rang Meng,
Weijie Chen,
Di Xie,
Yuan Zhang,
Shiliang Pu
Abstract:
Layer assignment is seldom picked out as an independent research topic in neural architecture search. In this paper, for the first time, we systematically investigate the impact of different layer assignments to the network performance by building an architecture dataset of layer assignment on CIFAR-100. Through analyzing this dataset, we discover a neural inheritance relation among the networks w…
▽ More
Layer assignment is seldom picked out as an independent research topic in neural architecture search. In this paper, for the first time, we systematically investigate the impact of different layer assignments to the network performance by building an architecture dataset of layer assignment on CIFAR-100. Through analyzing this dataset, we discover a neural inheritance relation among the networks with different layer assignments, that is, the optimal layer assignments for deeper networks always inherit from those for shallow networks. Inspired by this neural inheritance relation, we propose an efficient one-shot layer assignment search approach via inherited sampling. Specifically, the optimal layer assignment searched in the shallow network can be provided as a strong sampling priori to train and search the deeper ones in supernet, which extremely reduces the network search space. Comprehensive experiments carried out on CIFAR-100 illustrate the efficiency of our proposed method. Our search results are strongly consistent with the optimal ones directly selected from the architecture dataset. To further confirm the generalization of our proposed method, we also conduct experiments on Tiny-ImageNet and ImageNet. Our searched results are remarkably superior to the handcrafted ones under the unchanged computational budgets. The neural inheritance relation discovered in this paper can provide insights to the universal neural architecture search.
△ Less
Submitted 28 February, 2020;
originally announced February 2020.
-
Regularized Sparse Gaussian Processes
Authors:
Rui Meng,
Herbert Lee,
Soper Braden,
Priyadip Ray
Abstract:
Gaussian processes are a flexible Bayesian nonparametric modelling approach that has been widely applied but poses computational challenges. To address the poor scaling of exact inference methods, approximation methods based on sparse Gaussian processes (SGP) are attractive. An issue faced by SGP, especially in latent variable models, is the inefficient learning of the inducing inputs, which leads…
▽ More
Gaussian processes are a flexible Bayesian nonparametric modelling approach that has been widely applied but poses computational challenges. To address the poor scaling of exact inference methods, approximation methods based on sparse Gaussian processes (SGP) are attractive. An issue faced by SGP, especially in latent variable models, is the inefficient learning of the inducing inputs, which leads to poor model prediction. We propose a regularization approach by balancing the reconstruction performance of data and the approximation performance of the model itself. This regularization improves both inference and prediction performance. We extend this regularization approach into latent variable models with SGPs and show that performing variational inference (VI) on those models is equivalent to performing VI on a related empirical Bayes model.
△ Less
Submitted 30 May, 2021; v1 submitted 13 October, 2019;
originally announced October 2019.
-
Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence
Authors:
Rui Meng,
Xingdi Yuan,
Tong Wang,
Peter Brusilovsky,
Adam Trischler,
Daqing He
Abstract:
Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation. Existing studies concatenate target keyphrases in different orders but no study has examined the effects of ordering on models' behavior. In this paper, we propose several orderings for concatenation and inspect the important factors for training a successful keyp…
▽ More
Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation. Existing studies concatenate target keyphrases in different orders but no study has examined the effects of ordering on models' behavior. In this paper, we propose several orderings for concatenation and inspect the important factors for training a successful keyphrase generation model. By running comprehensive comparisons, we observe one preferable ordering and summarize a number of empirical findings and challenges, which can shed light on future research on this line of work.
△ Less
Submitted 28 February, 2022; v1 submitted 8 September, 2019;
originally announced September 2019.
-
An active smartphone authentication method based on daily cyclical activity
Authors:
Chunmin Mi,
Runjie Xu,
Ching-Torng Lin,
Run Yu Meng
Abstract:
Smartphones have become an important tool for people's daily lives, which brings higher security requirements in high-risk application areas, for example, mobile payment. Although the combination of physical password, fingerprint and facial recognition have improved the security to a certain extent, there still exists a high risk of being decrepted. This paper attempts an algorithm which is more s…
▽ More
Smartphones have become an important tool for people's daily lives, which brings higher security requirements in high-risk application areas, for example, mobile payment. Although the combination of physical password, fingerprint and facial recognition have improved the security to a certain extent, there still exists a high risk of being decrepted. This paper attempts an algorithm which is more suitable for studying human partial periodic activity, namely Prophet algorithm. This algorithm has strong robustness for missing data and trend change, and can deal with outliers well. The experimental results on the UniMiB SHAR DATA show that the user simply needs to do 5 cycles of specified actions to realize the prediction of the next time series. The Error analysis of cross validation was applied to 4 different indicators, and the Mean Squared Error of the optimal result "Jumping" behavior was only 8.20%. With these appealing features, The main contribution of this paper is to propose a smart phone user identification system based on behavioral activity cycle, which can be replicated in other behavioral studies. Another outstanding feature of such a system is the capability of fitting models using small data set by exploiting behavioral characteristics derived from periodicity and thus reducing dependence on sensor scanning frequency, therefore the system balances among energy consumption, data quantity and fitting accuracy.
△ Less
Submitted 8 March, 2020; v1 submitted 30 August, 2019;
originally announced September 2019.
-
Continuous Regular Functions
Authors:
Alexi Block Gorman,
Philipp Hieronymi,
Elliot Kaplan,
Ruoyu Meng,
Erik Walsberg,
Zihe Wang,
Ziqin Xiong,
Hongru Yang
Abstract:
Following Chaudhuri, Sankaranarayanan, and Vardi, we say that a function $f:[0,1] \to [0,1]$ is $r$-regular if there is a Büchi automaton that accepts precisely the set of base $r \in \mathbb{N}$ representations of elements of the graph of $f$. We show that a continuous $r$-regular function $f$ is locally affine away from a nowhere dense, Lebesgue null, subset of $[0,1]$. As a corollary we establi…
▽ More
Following Chaudhuri, Sankaranarayanan, and Vardi, we say that a function $f:[0,1] \to [0,1]$ is $r$-regular if there is a Büchi automaton that accepts precisely the set of base $r \in \mathbb{N}$ representations of elements of the graph of $f$. We show that a continuous $r$-regular function $f$ is locally affine away from a nowhere dense, Lebesgue null, subset of $[0,1]$. As a corollary we establish that every differentiable $r$-regular function is affine. It follows that checking whether an $r$-regular function is differentiable is in $\operatorname{PSPACE}$. Our proofs rely crucially on connections between automata theory and metric geometry developed by Charlier, Leroy, and Rigo.
△ Less
Submitted 13 February, 2020; v1 submitted 10 January, 2019;
originally announced January 2019.
-
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Authors:
Sanqiang Zhao,
Rui Meng,
Daqing He,
Saptono Andi,
Parmanto Bambang
Abstract:
Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from ma- chine translation studies and implicitly learned simplification mapping rules from normal- simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we pro- p…
▽ More
Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from ma- chine translation studies and implicitly learned simplification mapping rules from normal- simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we pro- pose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state- of-the-art baseline models for sentence simplification in the literature (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/ Sanqiang/text_simplification.
△ Less
Submitted 26 October, 2018;
originally announced October 2018.
-
One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases
Authors:
Xingdi Yuan,
Tong Wang,
Rui Meng,
Khushboo Thaker,
Peter Brusilovsky,
Daqing He,
Adam Trischler
Abstract:
Different texts shall by nature correspond to different number of keyphrases. This desideratum is largely missing from existing neural keyphrase generation models. In this study, we address this problem from both modeling and evaluation perspectives.
We first propose a recurrent generative model that generates multiple keyphrases as delimiter-separated sequences. Generation diversity is further…
▽ More
Different texts shall by nature correspond to different number of keyphrases. This desideratum is largely missing from existing neural keyphrase generation models. In this study, we address this problem from both modeling and evaluation perspectives.
We first propose a recurrent generative model that generates multiple keyphrases as delimiter-separated sequences. Generation diversity is further enhanced with two novel techniques by manipulating decoder hidden states. In contrast to previous approaches, our model is capable of generating diverse keyphrases and controlling number of outputs.
We further propose two evaluation metrics tailored towards the variable-number generation. We also introduce a new dataset StackEx that expands beyond the only existing genre (i.e., academic writing) in keyphrase generation tasks. With both previous and new evaluation metrics, our model outperforms strong baselines on all datasets.
△ Less
Submitted 12 May, 2020; v1 submitted 11 October, 2018;
originally announced October 2018.
-
PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report
Authors:
Andrey Ignatov,
Radu Timofte,
Thang Van Vu,
Tung Minh Luu,
Trung X Pham,
Cao Van Nguyen,
Yongwoo Kim,
Jae-Seok Choi,
Munchurl Kim,
Jie Huang,
Jiewen Ran,
Chen Xing,
Xingguang Zhou,
Pengfei Zhu,
Mingrui Geng,
Yawei Li,
Eirikur Agustsson,
Shuhang Gu,
Luc Van Gool,
Etienne de Stoutz,
Nikolay Kobyshev,
Kehui Nie,
Yan Zhao,
Gen Li,
Tong Tong
, et al. (23 additional authors not shown)
Abstract:
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map lo…
▽ More
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones. The challenge consisted of two tracks. In the first one, participants were solving the classical image super-resolution problem with a bicubic downscaling factor of 4. The second track was aimed at real-world photo enhancement, and the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with a DSLR camera. The target metric used in this challenge combined the runtime, PSNR scores and solutions' perceptual results measured in the user study. To ensure the efficiency of the submitted models, we additionally measured their runtime and memory requirements on Android smartphones. The proposed solutions significantly improved baseline results defining the state-of-the-art for image enhancement on smartphones.
△ Less
Submitted 3 October, 2018;
originally announced October 2018.