Showing 1–50 of 62 results for author: Su, P

  1. arXiv:2406.19531  [pdf, other]

    stat.ML cs.LG

    Forward and Backward State Abstractions for Off-policy Evaluation

    Authors: Meiling Hao, Pingfan Su, Liyuan Hu, Zoltan Szabo, Qingyuan Zhao, Chengchun Shi

    Abstract: Off-policy evaluation (OPE) is crucial for evaluating a target policy's impact offline before its deployment. However, achieving accurate OPE in large state spaces remains challenging. This paper studies state abstractions, originally designed for policy learning, in the context of OPE. Our contributions are three-fold: (i) We define a set of irrelevance conditions central to learning state abstracti…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 42 pages, 5 figures

    ACM Class: G.3; I.2.6; G.1.2

  2. arXiv:2405.00526  [pdf, other]

    cs.CR

    JNI Global References Are Still Vulnerable: Attacks and Defenses

    Authors: Yi He, Yuan Zhou, Yacong Gu, Purui Su, Qi Li, Yajin Zhou, Yong Jiang

    Abstract: System services and resources in Android are accessed through IPC based mechanisms. Previous research has demonstrated that they are vulnerable to the denial-of-service attack (DoS attack). For instance, the JNI global reference (JGR), which is widely used by system services, can be exhausted to cause the system reboot (hence the name JGRE attack). Even though the Android team tries to fix the pro…

    Submitted 1 May, 2024; originally announced May 2024.

  3. arXiv:2403.18166  [pdf, other]

    eess.SY cs.MA econ.TH math.OC

    Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach

    Authors: Pan-Yang Su, Chinmay Maheshwari, Victoria Tuck, Shankar Sastry

    Abstract: The rise of advanced air mobility (AAM) is expected to become a multibillion-dollar industry in the near future. Market-based mechanisms are touted to be an integral part of AAM operations, which comprise heterogeneous operators with private valuations. In this work, we study the problem of designing a mechanism to coordinate the movement of electric vertical take-off and landing (eVTOL) aircraft,…

    Submitted 7 July, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 26 pages, 2 figures, 1 table

    MSC Class: 91B03; 91A68; 90B06; 90C27

  4. arXiv:2403.02972  [pdf]

    cs.HC cs.CY

    Bodioid: philosophical reflections on the hybrid of bodies and artefacts towards post-human

    Authors: Jiang Xu, Gang Sun, Jingyu Xu, Pujie Su

    Abstract: The advent of the post-human era has blurred the boundary between the body and artefacts. Further, external materials and information are more deeply integrated into the body, making emerging technology a key driving force for shaping post-human existence and promoting bodily evolution. Based on this, this study analyses the transformation process of three technological forms, namely tools, machin…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2312.12439  [pdf, other]

    cs.CV physics.optics

    Single-pixel 3D imaging based on fusion temporal data of single photon detector and millimeter-wave radar

    Authors: Tingqin Lai, Xiaolin Liang, Yi Zhu, Xinyi Wu, Lianye Liao, Xuelin Yuan, Ping Su, Shihai Sun

    Abstract: Recently, there has been increased attention towards 3D imaging using single-pixel single-photon detection (also known as temporal data) due to its potential advantages in terms of cost and power efficiency. However, to eliminate the symmetry blur in the reconstructed images, a fixed background is required. This paper proposes a fusion-data-based 3D imaging method that utilizes a single-pixel sing…

    Submitted 20 October, 2023; originally announced December 2023.

    Comments: Accepted by Chinese Optics Letters, and comments are welcome

    Journal ref: Chinese Optics Letters, Vol.2, No.2, 2024

  6. arXiv:2311.10960  [pdf, other]

    cs.CR

    Reveal the Mathematical Structures of Honeyword Security Metrics

    Authors: Pengcheng Su, Haibo Cheng, Wenting Li, Ping Wang

    Abstract: Honeyword is a representative "honey" technique to detect intruders by luring them with decoy data. This kind of honey technique blends a primary object (from distribution $P$) with decoy samples (from distribution $Q$). In this research, we focus on two key Honeyword security metrics: the flatness function and the success-number function. Previous researchers are engaged in designing experimenta…

    Submitted 17 November, 2023; originally announced November 2023.

  7. arXiv:2311.02311  [pdf, other]

    cs.NI

    A Brief Survey of Open Radio Access Network (O-RAN) Security

    Authors: Yi-Zih Chen, Terrance Yu-Hao Chen, Po-Jung Su, Chi-Ting Liu

    Abstract: Open Radio Access Network (O-RAN), a novel architecture that separates the traditional radio access network (RAN) into multiple disaggregated components, leads a revolution in the telecommunication ecosystems. Compared to the traditional RAN, the proposed O-RAN paradigm is more flexible and more cost-effective for the operators, vendors, and the public. The key design considerations of O-RAN inclu…

    Submitted 3 November, 2023; originally announced November 2023.

  8. arXiv:2310.17152  [pdf]

    cs.CV cs.AI cs.LG q-bio.QM

    Technical Note: Feasibility of translating 3.0T-trained Deep-Learning Segmentation Models Out-of-the-Box on Low-Field MRI 0.55T Knee-MRI of Healthy Controls

    Authors: Rupsa Bhattacharjee, Zehra Akkaya, Johanna Luitjens, Pan Su, Yang Yang, Valentina Pedoia, Sharmila Majumdar

    Abstract: In the current study, our purpose is to evaluate the feasibility of applying deep learning (DL) enabled algorithms to quantify bilateral knee biomarkers in healthy controls scanned at 0.55T and compared with 3.0T. The current study assesses the performance of standard in-practice bone, and cartilage segmentation algorithms at 0.55T, both qualitatively and quantitatively, in terms of comparing segm…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 11 Pages, 3 Figures, 2 Tables

  9. arXiv:2309.08628  [pdf, other]

    cs.CL cs.CR cs.LG

    Recovering from Privacy-Preserving Masking with Large Language Models

    Authors: Arpita Vats, Zhe Liu, Peng Su, Debjyoti Paul, Yingyi Ma, Yutong Pang, Zeeshan Ahmed, Ozlem Kalinli

    Abstract: Model adaptation is crucial to handle the discrepancy between proxy training data and actual users data received. To effectively perform adaptation, textual data of users is typically stored on servers or their local devices, where downstream natural language processing (NLP) models can be directly trained using such in-domain data. However, this might raise privacy and security concerns due to th…

    Submitted 13 December, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP

  10. arXiv:2307.14051  [pdf, other]

    cs.CV

    3D Semantic Subspace Traverser: Empowering 3D Generative Model with Shape Editing Capability

    Authors: Ruowei Wang, Yu Liu, Pei Su, Jianwei Zhang, Qijun Zhao

    Abstract: Shape generation is the practice of producing 3D shapes as various representations for 3D content creation. Previous studies on 3D shape generation have focused on shape quality and structure, without or less considering the importance of semantic information. Consequently, such generative models often fail to preserve the semantic consistency of shape structure or enable manipulation of the seman…

    Submitted 15 August, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: Published in ICCV 2023. Code: https://github.com/TrepangCat/3D_Semantic_Subspace_Traverser

  11. arXiv:2305.06378  [pdf, other]

    quant-ph cs.LG

    Discovery of Optimal Quantum Error Correcting Codes via Reinforcement Learning

    Authors: Vincent Paul Su, ChunJun Cao, Hong-Ye Hu, Yariv Yanay, Charles Tahan, Brian Swingle

    Abstract: The recently introduced Quantum Lego framework provides a powerful method for generating complex quantum error correcting codes (QECCs) out of simple ones. We gamify this process and unlock a new avenue for code design and discovery using reinforcement learning (RL). One benefit of RL is that we can specify arbitrary properties of the code to be optimized. We train on two such properties,…

    Submitted 12 June, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: 10 pages + appendices; v2 figure updated and note added

  12. arXiv:2301.08506  [pdf, other]

    cs.CL cs.LG

    Language Agnostic Data-Driven Inverse Text Normalization

    Authors: Szu-Jui Chen, Debjyoti Paul, Yutong Pang, Peng Su, Xuedong Zhang

    Abstract: With the emergence of automatic speech recognition (ASR) models, converting the spoken form text (from ASR) to the written form is in urgent need. This inverse text normalization (ITN) problem attracts the attention of researchers from various fields. Recently, several works show that data-driven ITN methods can output high-quality written form text. Due to the scarcity of labeled spoken-written d…

    Submitted 23 January, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

  13. arXiv:2205.14081  [pdf, other]

    quant-ph cs.ET gr-qc hep-th

    Towards Quantum Gravity in the Lab on Quantum Processors

    Authors: Illya Shapoval, Vincent Paul Su, Wibe de Jong, Miro Urbanek, Brian Swingle

    Abstract: The holographic principle and its realization in the AdS/CFT correspondence led to unexpected connections between general relativity and quantum information. This set the stage for studying aspects of quantum gravity models, which are otherwise difficult to access, in table-top quantum-computational experiments. Recent works have designed a special teleportation protocol that realizes a surprising…

    Submitted 11 October, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 21 pages, 6 figures, 2 tables, 1 listing; updated to match journal

    Journal ref: Quantum 7, 1138 (2023)

  14. OJXPerf: Featherlight Object Replica Detection for Java Programs

    Authors: Bolun Li, Hao Xu, Qidong Zhao, Pengfei Su, Milind Chabbi, Shuyin Jiao, Xu Liu

    Abstract: Memory bloat is an important source of inefficiency in complex production software, especially in software written in managed languages such as Java. Prior approaches to this problem have focused on identifying objects that outlive their life span. Few studies have, however, looked into whether and to what extent myriad objects of the same type are identical. A quantitative assessment of identical…

    Submitted 23 March, 2022; originally announced March 2022.

    Journal ref: 44th International Conference on Software Engineering (ICSE 2022)

  15. A Systematic Study of Android Non-SDK (Hidden) Service API Security

    Authors: Yi He, Yacong Gu, Purui Su, Kun Sun, Yajin Zhou, Zhi Wang, Qi Li

    Abstract: Android allows apps to communicate with its system services via system service helpers so that these apps can use various functions provided by the system services. Meanwhile, the system services rely on their service helpers to enforce security checks for protection. Unfortunately, the security checks in the service helpers may be bypassed via directly exploiting the non-SDK (hidden) APIs, degrad…

    Submitted 17 March, 2022; originally announced March 2022.

    Journal ref: 10.1109/TDSC.2022.3160872

  16. arXiv:2201.02365  [pdf, other]

    cs.CV

    Motion Prediction via Joint Dependency Modeling in Phase Space

    Authors: Pengxiang Su, Zhenguang Liu, Shuang Wu, Lei Zhu, Yifang Yin, Xuanjing Shen

    Abstract: Motion prediction is a classic problem in computer vision, which aims at forecasting future motion given the observed pose sequence. Various deep learning models have been proposed, achieving state-of-the-art performance on motion prediction. However, existing methods typically focus on modeling temporal dynamics in the pose space. Unfortunately, the complicated and high dimensionality nature of h…

    Submitted 7 January, 2022; originally announced January 2022.

  17. Measuring Outcomes in Healthcare Economics using Artificial Intelligence: with Application to Resource Management

    Authors: Chih-Hao Huang, Feras A. Batarseh, Adel Boueiz, Ajay Kulkarni, Po-Hsuan Su, Jahan Aman

    Abstract: The quality of service in healthcare is constantly challenged by outlier events such as pandemics (i.e. Covid-19) and natural disasters (such as hurricanes and earthquakes). In most cases, such events lead to critical uncertainties in decision making, as well as in multiple medical and economic aspects at a hospital. External (geographic) or internal factors (medical and managerial), lead to shift…

    Submitted 14 November, 2021; originally announced November 2021.

    Comments: This paper is published at Cambridge University Press Journal of Data & Policy

    Journal ref: Data & Policy, 3, E30

  18. arXiv:2109.10126  [pdf, other]

    cs.CL

    ConvFiT: Conversational Fine-Tuning of Pretrained Language Models

    Authors: Ivan Vulić, Pei-Hao Su, Sam Coope, Daniela Gerz, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Tsung-Hsien Wen

    Abstract: Transformer-based language models (LMs) pretrained on large text collections are proven to store a wealth of semantic knowledge. However, 1) they are not effective as sentence encoders when used off-the-shelf, and 2) thus typically lag behind conversationally pretrained (e.g., via response selection) encoders on conversational tasks such as intent detection (ID). In this work, we propose ConvFiT,…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 (long paper)

  19. Quantum-Inspired Keyword Search on Multi-Model Databases

    Authors: Gongsheng Yuan, Jiaheng Lu, Peifeng Su

    Abstract: With the rising applications implemented in different domains, it is inevitable to require databases to adopt corresponding appropriate data models to store and exchange data derived from various sources. To handle these data models in a single platform, the community of databases introduces a multi-model database. And many vendors are improving their products from supporting a single data model t…

    Submitted 31 August, 2021; originally announced September 2021.

    Comments: 16 pages, 5 figures, Dasfaa

  20. arXiv:2104.13913  [pdf, other]

    cs.CL

    Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction

    Authors: Peng Su, Yifan Peng, K. Vijay-Shanker

    Abstract: Contrastive learning has been used to learn a high-quality representation of the image in computer vision. However, contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data. In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: Accepted by BioNLP 2021

  21. arXiv:2104.08524  [pdf, other]

    cs.CL

    Multilingual and Cross-Lingual Intent Detection from Spoken Data

    Authors: Daniela Gerz, Pei-Hao Su, Razvan Kusztos, Avishek Mondal, Michał Lis, Eshan Singhal, Nikola Mrkšić, Tsung-Hsien Wen, Ivan Vulić

    Abstract: We present a systematic study on multilingual and cross-lingual intent detection from spoken data. The study leverages a new resource put forth in this work, termed MInDS-14, a first training and evaluation resource for the intent detection task with spoken data. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language…

    Submitted 17 April, 2021; originally announced April 2021.

  22. arXiv:2104.03388  [pdf, other]

    cs.PF cs.PL

    DJXPerf: Identifying Memory Inefficiencies via Object-centric Profiling for Java

    Authors: Bolun Li, Pengfei Su, Milind Chabbi, Shuyin Jiao, Xu Liu

    Abstract: Java is the "go-to" programming language choice for developing scalable enterprise cloud applications. In such systems, even a few percent CPU time savings can offer a significant competitive advantage and cost saving. Although performance tools abound in Java, those that focus on the data locality in the memory hierarchy are rare. In this paper, we present DJXPerf, a lightweight, object-centric…

    Submitted 7 April, 2021; originally announced April 2021.

    Comments: 13 pages (including 2-page reference), 5 figures, 2 tables

  23. arXiv:2103.12294  [pdf, other]

    cs.CV

    Gradient Regularized Contrastive Learning for Continual Domain Adaptation

    Authors: Shixiang Tang, Peng Su, Dapeng Chen, Wanli Ouyang

    Abstract: Human beings can quickly adapt to environmental changes by leveraging learning experience. However, adapting deep neural networks to dynamic environments by machine learning algorithms remains a challenge. To better understand this issue, we study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains. The…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: Accepted by AAAI2021 (poster). arXiv admin note: text overlap with arXiv:2007.12942

  24. arXiv:2012.02551  [pdf, other]

    cs.DS math.CO

    An O(n) time algorithm for finding Hamilton cycles with high probability

    Authors: Rajko Nenadov, Angelika Steger, Pascal Su

    Abstract: We design a randomized algorithm that finds a Hamilton cycle in $\mathcal{O}(n)$ time with high probability in a random graph $G_{n,p}$ with edge probability $p\ge C \log n / n$. This closes a gap left open in a seminal paper by Angluin and Valiant from 1979.

    Submitted 4 December, 2020; originally announced December 2020.

  25. arXiv:2011.05921  [pdf, other]

    math.CO cs.DS

    Mastermind with a Linear Number of Queries

    Authors: Anders Martinsson, Pascal Su

    Abstract: Since the 1960s Mastermind has been studied for the combinatorial and information theoretical interest the game has to offer. Many results have been discovered starting with Erdős and Rényi determining the optimal number of queries needed for two colors. For $k$ colors and $n$ positions, Chvátal found asymptotically optimal bounds when $k \le n^{1-ε}$. Following a sequence of gradual improvements…

    Submitted 19 September, 2023; v1 submitted 11 November, 2020; originally announced November 2020.

  26. arXiv:2011.00398  [pdf, other]

    cs.CL

    Investigation of BERT Model on Biomedical Relation Extraction Based on Revised Fine-tuning Mechanism

    Authors: Peng Su, K. Vijay-Shanker

    Abstract: With the explosive growth of biomedical literature, designing automatic tools to extract information from the literature has great significance in biomedical research. Recently, transformer-based BERT models adapted to the biomedical domain have produced leading results. However, all the existing BERT models for relation classification only utilize partial knowledge from the last layer. In this pa…

    Submitted 31 October, 2020; originally announced November 2020.

  27. arXiv:2010.07486  [pdf, other]

    eess.IV cs.CV

    CS2-Net: Deep Learning Segmentation of Curvilinear Structures in Medical Imaging

    Authors: Lei Mou, Yitian Zhao, Huazhu Fu, Yonghuai Liu, Jun Cheng, Yalin Zheng, Pan Su, Jianlong Yang, Li Chen, Alejandro F Frang, Masahiro Akiba, Jiang Liu

    Abstract: Automated detection of curvilinear structures, e.g., blood vessels or nerve fibres, from medical and biomedical images is a crucial early step in automatic image interpretation associated to the management of many diseases. Precise measurement of the morphological changes of these curvilinear organ structures informs clinicians for understanding the mechanism, diagnosis, and treatment of e.g. card…

    Submitted 19 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

  28. arXiv:2009.04488  [pdf, other]

    hep-th cs.LG quant-ph

    Variational Preparation of the Sachdev-Ye-Kitaev Thermofield Double

    Authors: Vincent Paul Su

    Abstract: We provide an algorithm for preparing the thermofield double (TFD) state of the Sachdev-Ye-Kitaev model without the need for an auxiliary bath. Following previous work, the TFD can be cast as the approximate ground state of a Hamiltonian, $H_{\text{TFD}}$. Using variational quantum circuits, we propose and implement a gradient-based algorithm for learning parameters that find this ground state, an…

    Submitted 10 December, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

    Comments: 20 pages, 8 figures; v2 references added

    Journal ref: Phys. Rev. A 104, 012427 (2021)

  29. arXiv:2009.04450  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    Map-Adaptive Goal-Based Trajectory Prediction

    Authors: Lingyao Zhang, Po-Hsun Su, Jerrick Hoang, Galen Clark Haynes, Micol Marchetti-Bowick

    Abstract: We present a new method for multi-modal, long-term vehicle trajectory prediction. Our approach relies on using lane centerlines captured in rich maps of the environment to generate a set of proposed goal paths for each vehicle. Using these paths -- which are generated at run time and therefore dynamically adapt to the scene -- as spatial anchors, we predict a set of goal-based trajectories along w…

    Submitted 13 November, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

    Comments: Published at CoRL 2020

    Journal ref: Conference on Robot Learning (CoRL) 2020

  30. arXiv:2007.13135  [pdf, other]

    cs.CV eess.IV

    Contrastive Visual-Linguistic Pretraining

    Authors: Lei Shi, Kai Shuang, Shijie Geng, Peng Su, Zhengkai Jiang, Peng Gao, Zuohui Fu, Gerard de Melo, Sen Su

    Abstract: Several multi-modality representation learning approaches such as LXMERT and ViLBERT have been proposed recently. Such approaches can achieve superior performance due to the high-level semantic information captured during large-scale multimodal pretraining. However, as ViLBERT and LXMERT adopt visual region regression and classification loss, they often suffer from domain gap and noisy label probl…

    Submitted 26 July, 2020; originally announced July 2020.

  31. arXiv:2007.12942  [pdf, other]

    cs.CV cs.LG

    Gradient Regularized Contrastive Learning for Continual Domain Adaptation

    Authors: Peng Su, Shixiang Tang, Peng Gao, Di Qiu, Ni Zhao, Xiaogang Wang

    Abstract: Human beings can quickly adapt to environmental changes by leveraging learning experience. However, the poor ability of adapting to dynamic environments remains a major challenge for AI models. To better understand this issue, we study the problem of continual domain adaptation, where the model is presented with a labeled source domain and a sequence of unlabeled target domains. There are two majo…

    Submitted 25 July, 2020; originally announced July 2020.

  32. arXiv:2005.04277  [pdf, other]

    cs.CL cs.LG

    Adversarial Learning for Supervised and Semi-supervised Relation Extraction in Biomedical Literature

    Authors: Peng Su, K. Vijay-Shanker

    Abstract: Adversarial training is a technique of improving model performance by involving adversarial examples in the training process. In this paper, we investigate adversarial training with multiple adversarial examples to benefit the relation extraction task. We also apply adversarial training technique in semi-supervised scenarios to utilize unlabeled data. The evaluation results on protein-protein inte…

    Submitted 25 September, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

  33. arXiv:2003.07071  [pdf, other]

    cs.CV

    Adapting Object Detectors with Conditional Domain Normalization

    Authors: Peng Su, Kun Wang, Xingyu Zeng, Shixiang Tang, Dapeng Chen, Di Qiu, Xiaogang Wang

    Abstract: Real-world object detectors are often challenged by the domain gaps between different datasets. In this work, we present the Conditional Domain Normalization (CDN) to bridge the domain gap. CDN is designed to encode different domain inputs into a shared latent space, where the features from different domains carry the same domain attribute. To achieve this, we first disentangle the domain-specific…

    Submitted 22 July, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: Accepted at ECCV 2020

  34. arXiv:2002.05317  [pdf, other]

    quant-ph cs.IT hep-th

    The Quantum Entropy Cone of Hypergraphs

    Authors: Ning Bao, Newton Cheng, Sergio Hernández-Cuenca, Vincent P. Su

    Abstract: In this work, we generalize the graph-theoretic techniques used for the holographic entropy cone to study hypergraphs and their analogously-defined entropy cone. This allows us to develop a framework to efficiently compute entropies and prove inequalities satisfied by hypergraphs. In doing so, we discover a class of quantum entropy vectors which reach beyond those of holographic states and obey co…

    Submitted 12 February, 2020; originally announced February 2020.

    Comments: 40+6 pages, 7 figures

    Journal ref: SciPost Phys. 9 (2020) 5, 067

  35. arXiv:1911.03688  [pdf, other]

    cs.CL

    ConveRT: Efficient and Accurate Conversational Representations from Transformers

    Authors: Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić

    Abstract: General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train. We propose ConveRT (Conversational Representations from Transformers), a pretraining framework for conversational tasks satisfying all the following requirements: it is effective, affordable, and quick to train. We pret…

    Submitted 29 April, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

  36. arXiv:1909.01296  [pdf, other]

    cs.CL

    PolyResponse: A Rank-based Approach to Task-Oriented Dialogue with Application in Restaurant Search and Booking

    Authors: Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: We present PolyResponse, a conversational search engine that supports task-oriented dialogue. It is a retrieval-based approach that bypasses the complex multi-component design of traditional task-oriented dialogue systems and the use of explicit semantics in the form of task-specific ontologies. The PolyResponse engine is trained on hundreds of millions of examples extracted from real conversation…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019 (Demo paper)

  37. arXiv:1906.12066  [pdf, other]

    cs.PF cs.PL cs.SE

    Pinpointing Performance Inefficiencies in Java

    Authors: Pengfei Su, Qingsen Wang, Milind Chabbi, Xu Liu

    Abstract: Many performance inefficiencies such as inappropriate choice of algorithms or data structures, developers' inattention to performance, and missed compiler optimizations show up as wasteful memory operations. Wasteful memory operations are those that produce/consume data to/from memory that may have been avoided. We present, JXPerf, a lightweight performance analysis tool for pinpointing wasteful m…

    Submitted 28 June, 2019; originally announced June 2019.

    Comments: This is a full-version of our ESEC/FSE'2019 paper

  38. arXiv:1906.01543  [pdf, other]

    cs.CL

    Training Neural Response Selection for Task-Oriented Dialogue Systems

    Authors: Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

    Abstract: Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks. Inspired by the recent success of pretraining in language modelling, we propose an effective method for deploying response selection in task-oriented dialogue.…

    Submitted 7 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019 long paper

  39. arXiv:1904.10613  [pdf]

    cs.CV

    Defocused images removal of axial overlapping scattering particles by using three-dimensional nonlinear diffusion based on digital holography

    Authors: Wei-Na Li, Zhengyun Zhang, Jianshe Ma, Xiaohao Wang, Ping Su

    Abstract: We propose a three-dimensional nonlinear diffusion method to implement the similar autofocusing function of multiple micro-objects and simultaneously remove the defocused images, which can distinguish the locations of certain sized scattering particles that are overlapping along z-axis. It is applied to all of the reconstruction slices that are generated from the captured hologram after each back…

    Submitted 14 August, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: no

  40. arXiv:1904.06472  [pdf, other]

    cs.CL

    A Repository of Conversational Datasets

    Authors: Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen

    Abstract: Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches. To this end, we present a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The repository contains…

    Submitted 28 May, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Journal ref: Proceedings of the Workshop on NLP for Conversational AI (2019)

  41. arXiv:1902.05462  [pdf, other]

    cs.PF cs.PL cs.SE

    Redundant Loads: A Software Inefficiency Indicator

    Authors: Pengfei Su, Shasha Wen, Hailong Yang, Milind Chabbi, Xu Liu

    Abstract: Modern software packages have become increasingly complex with millions of lines of code and references to many external libraries. Redundant operations are a common performance limiter in these code bases. Missed compiler optimization opportunities, inappropriate data structure and algorithm choices, and developers' inattention to performance are some common reasons for the existence of redundant…

    Submitted 14 February, 2019; originally announced February 2019.

    Comments: This paper is a full-version of our ICSE paper

  42. arXiv:1803.03232  [pdf, other]

    cs.CL cs.AI cs.NE

    Feudal Reinforcement Learning for Dialogue Management in Large Domains

    Authors: Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Stefan Ultes, Lina Rojas-Barahona, Bo-Hsiang Tseng, Milica Gašić

    Abstract: Reinforcement learning (RL) is a promising approach to solve dialogue policy optimisation. Traditional RL algorithms, however, fail to scale to large domains due to the curse of dimensionality. We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps; a first step where a master policy selects a subset of primitive actions, and a second…

    Submitted 8 March, 2018; originally announced March 2018.

    Comments: Accepted as a short paper in NAACL 2018

  43. arXiv:1802.03753  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces

    Authors: Gellért Weisz, Paweł Budzianowski, Pei-Hao Su, Milica Gašić

    Abstract: In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimisation task, which attempts to find a policy describing how to respond to humans, in the form of a function taking the current state of the dialogue and returning the response of the system. In this paper, we investigate de…

    Submitted 11 February, 2018; originally announced February 2018.

  44. arXiv:1711.11023  [pdf, other]

    stat.ML cs.CL cs.NE

    A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management

    Authors: Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić

    Abstract: Dialogue assistants are rapidly becoming an indispensable daily aid. To avoid the significant effort needed to hand-craft the required dialogue flow, the Dialogue Management (DM) module can be cast as a continuous Markov Decision Process (MDP) and trained through Reinforcement Learning (RL). Several RL models have been investigated over recent years. However, the lack of a common benchmarking fram…

    Submitted 6 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

    Comments: Accepted at the Deep Reinforcement Learning Symposium, 31st Conference on Neural Information Processing Systems (NIPS 2017). Paper updated with minor changes.

  45. arXiv:1707.08807  [pdf, other]

    cs.DS

    Nearest Common Ancestors: Universal Trees and Improved Labeling Schemes

    Authors: Fabian Kuhn, Konstantinos Panagiotou, Pascal Su

    Abstract: We investigate the nearest common ancestor (NCA) function in rooted trees. As the main conceptual contribution, the paper introduces universal trees for the NCA function: For a given family of rooted trees, an NCA-universal tree $S$ is a rooted tree such that any tree $T$ of the family can be embedded into $S$ such that the embedding of the NCA in $T$ of two nodes of $T$ is equal to the NCA in…

    Submitted 27 July, 2017; originally announced July 2017.

  46. arXiv:1707.06299  [pdf, other]

    cs.CL stat.ML

    Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning

    Authors: Stefan Ultes, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Lina Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gašić, Steve Young

    Abstract: Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-objective…

    Submitted 19 July, 2017; originally announced July 2017.

    Comments: Accepted at SIGDial 2017

  47. arXiv:1707.00130  [pdf, other]

    cs.CL cs.AI cs.LG

    Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management

    Authors: Pei-Hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young

    Abstract: Deep reinforcement learning (RL) methods have significant potential for dialogue policy optimisation. However, they suffer from a poor performance in the early stages of learning. This is especially problematic for on-line learning with real users. Two approaches are introduced to tackle this problem. Firstly, to speed up the learning process, two sample-efficient neural networks algorithms: trust…

    Submitted 5 July, 2017; v1 submitted 1 July, 2017; originally announced July 2017.

    Comments: Accepted as a long paper in SigDial 2017

  48. arXiv:1706.06210  [pdf, other]

    cs.CL cs.AI

    Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning

    Authors: Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić

    Abstract: Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarc…

    Submitted 17 July, 2017; v1 submitted 19 June, 2017; originally announced June 2017.

    Comments: Update of the section 4 and the bibliography

  49. arXiv:1705.04524  [pdf, other]

    cs.LG cs.AI math.DS stat.ML

    Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks

    Authors: Peng Su, Xiao-Rong Ding, Yuan-Ting Zhang, Jing Liu, Fen Miao, Ni Zhao

    Abstract: Existing methods for arterial blood pressure (BP) estimation directly map the input physiological signals to output BP values without explicitly modeling the underlying temporal dependencies in BP dynamics. As a result, these models suffer from accuracy decay over a long time and thus require frequent calibration. In this work, we address this issue by formulating BP estimation as a sequence predi…

    Submitted 14 January, 2018; v1 submitted 12 May, 2017; originally announced May 2017.

    Comments: To appear in IEEE BHI 2018

  50. arXiv:1610.04120  [pdf, other]

    cs.AI cs.CL cs.NE

    Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding

    Authors: Lina M. Rojas Barahona, Milica Gasic, Nikola Mrkšić, Pei-Hao Su, Stefan Ultes, Tsung-Hsien Wen, Steve Young

    Abstract: This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by the Automatic Speech Recognition. Most current models for spoken language understanding assume (i) word-aligned semantic annot…

    Submitted 13 October, 2016; originally announced October 2016.