Showing 1–50 of 75 results for author: Shi, E

  1. arXiv:2406.06918  [pdf, other]

    cs.SE

    Towards more realistic evaluation of LLM-based code generation: an experimental study and beyond

    Authors: Dewu Zheng, Yanlin Wang, Ensheng Shi, Ruikai Zhang, Yuchi Ma, Hongyu Zhang, Zibin Zheng

    Abstract: To evaluate the code generation capabilities of Large Language Models (LLMs) in complex real-world software development scenarios, many evaluation approaches have been developed. They typically leverage contextual code from the latest version of a project to facilitate LLMs in accurately generating the desired function. However, such evaluation approaches fail to consider the dynamic evolution of…

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2405.09200  [pdf, ps, other]

    cs.IT

    Performance Analysis of RIS-aided MISO Systems with EMI and Channel Aging

    Authors: Taoyu Song, Enyu Shi, Yu Lu, Yiyang Zhu, Jiayi Zhang, Bo Ai

    Abstract: In this paper, we investigate a reconfigurable intelligent surface (RIS)-aided multiple-input single-output (MISO) system in the presence of electromagnetic interference (EMI) and channel aging with a Rician fading channel model between the base station (BS) and user equipment (UE). Specifically, we derive the closed-form expression for downlink spectral efficiency (SE) with maximum ratio transmis…

    Submitted 15 May, 2024; originally announced May 2024.

  3. Multi-agent Reinforcement Learning-based Joint Precoding and Phase Shift Optimization for RIS-aided Cell-Free Massive MIMO Systems

    Authors: Yiyang Zhu, Enyu Shi, Ziheng Liu, Jiayi Zhang, Bo Ai

    Abstract: Cell-free (CF) massive multiple-input multiple-output (mMIMO) is a promising technique for achieving high spectral efficiency (SE) using multiple distributed access points (APs). However, harsh propagation environments often lead to significant communication performance degradation due to high penetration loss. To overcome this issue, we introduce the reconfigurable intelligent surface (RIS) into…

    Submitted 22 April, 2024; originally announced April 2024.

  4. arXiv:2404.04898  [pdf, other]

    cs.IT

    Graph Neural Network Meets Multi-Agent Reinforcement Learning: Fundamentals, Applications, and Future Directions

    Authors: Ziheng Liu, Jiayi Zhang, Enyu Shi, Zhilong Liu, Dusit Niyato, Bo Ai, Xuemin Shen

    Abstract: Multi-agent reinforcement learning (MARL) has become a fundamental component of next-generation wireless communication systems. Theoretically, although MARL has the advantages of low computational complexity and fast convergence rate, there exist several challenges including partial observability, non-stationarity, and scalability. In this article, we investigate a novel MARL with graph neural netwo…

    Submitted 7 April, 2024; originally announced April 2024.

  5. arXiv:2403.10116  [pdf, other]

    cs.CR cs.DS

    Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy

    Authors: Wei Dong, Qiyao Luo, Giulia Fanti, Elaine Shi, Ke Yi

    Abstract: Differentially private mechanisms achieving worst-case optimal error bounds (e.g., the classical Laplace mechanism) are well-studied in the literature. However, when typical data are far from the worst case, instance-specific error bounds -- which depend on the largest value in the dataset -- are more meaningful. For example, consider the sum estimation problem, where each user has an integ…

    Submitted 15 March, 2024; originally announced March 2024.

  6. arXiv:2402.09357  [pdf, ps, other]

    cs.GT cs.CG

    Mechanism Design for Automated Market Makers

    Authors: T-H. Hubert Chan, Ke Wu, Elaine Shi

    Abstract: Blockchains have popularized automated market makers (AMMs). An AMM exchange is an application running on a blockchain which maintains a pool of crypto-assets and automatically trades assets with users governed by some pricing function that prices the assets based on their relative demand/supply. AMMs have created an important challenge commonly known as the Miner Extractable Value (MEV). In parti…

    Submitted 21 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 1 title page and 23 pages for the main body

  7. Collusion-Resilience in Transaction Fee Mechanism Design

    Authors: Hao Chung, Tim Roughgarden, Elaine Shi

    Abstract: Users bid in a transaction fee mechanism (TFM) to get their transactions included and confirmed by a blockchain protocol. Roughgarden (EC'21) initiated the formal treatment of TFMs and proposed three requirements: user incentive compatibility (UIC), miner incentive compatibility (MIC), and a form of collusion-resilience called OCA-proofness. Ethereum's EIP-1559 mechanism satisfies all three proper…

    Submitted 19 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  8. arXiv:2401.04334  [pdf, other]

    cs.RO cs.AI

    Large Language Models for Robotics: Opportunities, Challenges, and Perspectives

    Authors: Jiaqi Wang, Zihao Wu, Yiwei Li, Hanqi Jiang, Peng Shu, Enze Shi, Huawen Hu, Chong Ma, Yiheng Liu, Xuhui Wang, Yincheng Yao, Xuan Liu, Huaqin Zhao, Zhengliang Liu, Haixing Dai, Lin Zhao, Bao Ge, Xiang Li, Tianming Liu, Shu Zhang

    Abstract: Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with comp…

    Submitted 8 January, 2024; originally announced January 2024.

  9. arXiv:2401.01572  [pdf, other]

    cs.CL cs.SD eess.AS

    Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models

    Authors: Rita Frieske, Bertram E. Shi

    Abstract: Hallucinations are a type of output error produced by deep neural networks. While they have been studied in natural language processing, they have not previously been researched in automatic speech recognition. Here, we define hallucinations in ASR as transcriptions generated by a model that are semantically unrelated to the source utterance, yet still fluent and coherent. The similarity of halluci…

    Submitted 3 January, 2024; originally announced January 2024.

  10. arXiv:2312.02332  [pdf, ps, other]

    cs.DS

    Connected Components in Linear Work and Near-Optimal Time

    Authors: Alireza Farhadi, S. Cliff Liu, Elaine Shi

    Abstract: Computing the connected components of a graph is a fundamental problem in algorithmic graph theory. A major question in this area is whether we can compute connected components in $o(\log n)$ parallel time. Recent works showed an affirmative answer in the Massively Parallel Computation (MPC) model for a wide class of graphs. Specifically, Behnezhad et al. (FOCS'19) showed that connected components…

    Submitted 20 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  11. arXiv:2310.00263  [pdf, ps, other]

    cs.IT eess.SP

    RIS-Aided Cell-Free Massive MIMO Systems for 6G: Fundamentals, System Design, and Applications

    Authors: Enyu Shi, Jiayi Zhang, Hongyang Du, Bo Ai, Chau Yuen, Dusit Niyato, Khaled B. Letaief, Xuemin Shen

    Abstract: The introduction of intelligent interconnectivity for people and things has posed higher demands and more challenges for sixth-generation (6G) networks, such as high spectral efficiency and energy efficiency, ultra-low latency, and ultra-high reliability. Cell-free (CF) massive multiple-input multiple-output (mMIMO) and reconfigurable intelligent surface (RIS), also called intelligent reflecting su…

    Submitted 22 May, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted by Proceedings of the IEEE, 2024

  12. arXiv:2308.13416  [pdf, other]

    cs.SE cs.AI

    SoTaNa: The Open-Source Software Development Assistant

    Authors: Ensheng Shi, Fengji Zhang, Yanlin Wang, Bei Chen, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    Abstract: Software development plays a crucial role in driving innovation and efficiency across modern societies. To meet the demands of this dynamic field, there is a growing need for an effective software development assistant. However, existing large language models represented by ChatGPT suffer from limited accessibility, including training data and model weights. Although other large open-source models…

    Submitted 25 August, 2023; originally announced August 2023.

  13. arXiv:2307.01187  [pdf, other]

    cs.CV cs.AI

    SAMAug: Point Prompt Augmentation for Segment Anything Model

    Authors: Haixing Dai, Chong Ma, Zhiling Yan, Zhengliang Liu, Enze Shi, Yiwei Li, Peng Shu, Xiaozheng Wei, Lin Zhao, Zihao Wu, Fang Zeng, Dajiang Zhu, Wei Liu, Quanzheng Li, Lichao Sun, Shu Zhang, Tianming Liu, Xiang Li

    Abstract: This paper introduces SAMAug, a novel visual point augmentation method for the Segment Anything Model (SAM) that enhances interactive image segmentation performance. SAMAug generates augmented point prompts to provide more information about the user's intention to SAM. Starting with an initial point prompt, SAM produces an initial mask, which is then fed into our proposed SAMAug to generate augmen…

    Submitted 19 March, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

  14. arXiv:2307.00855  [pdf, other]

    cs.CV cs.AI

    Review of Large Vision Models and Visual Prompt Engineering

    Authors: Jiaqi Wang, Zhengliang Liu, Lin Zhao, Zihao Wu, Chong Ma, Sigang Yu, Haixing Dai, Qiushi Yang, Yiheng Liu, Songyao Zhang, Enze Shi, Yi Pan, Tuo Zhang, Dajiang Zhu, Xiang Li, Xi Jiang, Bao Ge, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

    Abstract: Visual prompt engineering is a fundamental technology in the field of visual and image Artificial General Intelligence, serving as a key component for achieving zero-shot capabilities. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research dire…

    Submitted 3 July, 2023; originally announced July 2023.

  15. arXiv:2306.08278  [pdf, ps, other]

    cs.IT eess.SP

    Uplink Performance of RIS-aided Cell-Free Massive MIMO System with Electromagnetic Interference

    Authors: Enyu Shi, Jiayi Zhang, Derrick Wing Kwan Ng, Bo Ai

    Abstract: Cell-free (CF) massive multiple-input multiple-output (MIMO) and reconfigurable intelligent surface (RIS) are two promising technologies for realizing future beyond-fifth generation (B5G) networks. In this paper, we consider a practical spatially correlated RIS-aided CF massive MIMO system with multi-antenna access points (APs) over spatially correlated fading channels. Different from previous wor…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: to appear in IEEE Journal on Selected Areas in Communications

  16. arXiv:2304.14670  [pdf, other]

    cs.AI

    Prompt Engineering for Healthcare: Methodologies and Applications

    Authors: Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, Chenxi Yue, Haiyang Zhang, Yiheng Liu, Yi Pan, Zhengliang Liu, Lichao Sun, Xiang Li, Bao Ge, Xi Jiang, Dajiang Zhu, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

    Abstract: Prompt engineering is a critical technique in the field of natural language processing that involves designing and optimizing the prompts used to input information into models, aiming to enhance their performance on specific tasks. With the recent advancements in large language models, prompt engineering has shown significant superiority across various domains and has become increasingly important…

    Submitted 23 March, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

  17. arXiv:2304.05216  [pdf, other]

    cs.SE cs.AI cs.CL

    Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond

    Authors: Ensheng Shi, Yanlin Wang, Hongyu Zhang, Lun Du, Shi Han, Dongmei Zhang, Hongbin Sun

    Abstract: Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encode…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: Accepted by ISSTA 2023 (The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis)

  18. arXiv:2302.12895  [pdf, ps, other]

    cs.GT

    Maximizing Miner Revenue in Transaction Fee Mechanism Design

    Authors: Ke Wu, Elaine Shi, Hao Chung

    Abstract: Transaction fee mechanism design is a new decentralized mechanism design problem where users bid for space on the blockchain. Several recent works showed that the transaction fee mechanism design fundamentally departs from classical mechanism design. They then systematically explored the mathematical landscape of this new decentralized mechanism design problem in two settings: in the plain setting…

    Submitted 21 April, 2024; v1 submitted 24 February, 2023; originally announced February 2023.

  19. arXiv:2302.00095  [pdf, ps, other]

    cs.AR cs.CR cs.ET

    XCRYPT: Accelerating Lattice Based Cryptography with Memristor Crossbar Arrays

    Authors: Sarabjeet Singh, Xiong Fan, Ananth Krishna Prasad, Lin Jia, Anirban Nag, Rajeev Balasubramonian, Mahdi Nazm Bojnordi, Elaine Shi

    Abstract: This paper makes a case for accelerating lattice-based post-quantum cryptography (PQC) with memristor-based crossbars, and shows that these inherently error-tolerant algorithms are a good fit for noisy analog MAC operations in crossbars. We compare different NIST round-3 lattice-based candidates for PQC, and identify that SABER is not only a front-runner when executing on traditional systems, but…

    Submitted 31 January, 2023; originally announced February 2023.

  20. arXiv:2212.05176  [pdf, other]

    cs.DB cs.CR

    Adore: Differentially Oblivious Relational Database Operators

    Authors: Lianke Qin, Rajesh Jayaram, Elaine Shi, Zhao Song, Danyang Zhuo, Shumo Chu

    Abstract: There has been a recent effort in applying differential privacy on memory access patterns to enhance data privacy. This is called differential obliviousness. Differential obliviousness is a promising direction because it provides a principled trade-off between performance and desired level of privacy. To date, it is still an open question whether differential obliviousness can speed up database pr…

    Submitted 29 September, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: VLDB 2023

  21. arXiv:2210.10133  [pdf, other]

    cs.CR

    Efficient Privacy-Preserving Machine Learning with Lightweight Trusted Hardware

    Authors: Pengzhi Huang, Thang Hoang, Yueying Li, Elaine Shi, G. Edward Suh

    Abstract: In this paper, we propose a new secure machine learning inference platform assisted by a small dedicated security processor, which will be easier to protect and deploy compared to today's TEEs integrated into high-performance processors. Our platform provides three main advantages over the state-of-the-art: (i) We achieve significant performance improvements compared to state-of-the-art distribu…

    Submitted 17 June, 2024; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: IEEE S&P'24 submitted

  22. arXiv:2209.14645  [pdf, other]

    cs.HC

    Reducing Stress and Anxiety in the Metaverse: A Systematic Review of Meditation, Mindfulness and Virtual Reality

    Authors: Xian Wang, Xiaoyu Mo, Mingming Fan, Lik-Hang Lee, Bertram E. Shi, Pan Hui

    Abstract: Meditation, or mindfulness, is widely used to improve mental health. With the emergence of Virtual Reality technology, many studies have provided evidence that meditation with VR can bring health benefits. However, to our knowledge, there are no guidelines and comprehensive reviews in the literature on how to conduct such research in virtual reality. In order to understand the role of VR technolog…

    Submitted 29 September, 2022; originally announced September 2022.

  23. arXiv:2209.14504  [pdf, ps, other]

    cs.IT

    Decentralized Coordinated Precoding Design in Cell-Free Massive MIMO Systems for URLLC

    Authors: Enyu Shi, Jing Zhang, Jiayi Zhang, Derrick Wing Kwan Ng, Bo Ai

    Abstract: Cell-free massive multiple-input multiple-output (MIMO) is a promising network to offer huge improvement of the achievable rate compared with conventional cellular massive MIMO systems. However, the commonly adopted Shannon-type achievable rate is only valid in the long block length regime that is not applicable to the emerging short-packet communication. To realize ultra-reliable and low-latency…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: 6 pages, 3 figures

  24. arXiv:2209.14462  [pdf, ps, other]

    cs.GT cs.CR

    What Can Cryptography Do For Decentralized Mechanism Design

    Authors: Elaine Shi, Hao Chung, Ke Wu

    Abstract: Recent works of Roughgarden (EC'21) and Chung and Shi (SODA'23) initiate the study of a new decentralized mechanism design problem called transaction fee mechanism design (TFM). Unlike the classical mechanism design literature, in the decentralized environment, even the auctioneer (i.e., the miner) can be a strategic player, and it can even collude with a subset of the users facilitated by binding…

    Submitted 19 February, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

  25. arXiv:2209.13845  [pdf, ps, other]

    cs.IT eess.SP

    Uplink Performance of RIS-aided Cell-Free Massive MIMO System Over Spatially Correlated Channels

    Authors: Enyu Shi, Jiayi Zhang, Zhe Wang, Derrick Wing Kwan Ng, Bo Ai

    Abstract: We consider a practical spatially correlated reconfigurable intelligent surface (RIS)-aided cell-free (CF) massive multiple-input-multiple-output (mMIMO) system with multi-antenna access points (APs) over spatially correlated Rician fading channels. The minimum mean square error (MMSE) channel estimator is adopted to estimate the aggregated RIS channels. Then, we investigate the uplink spectral ef…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: 6 pages, 5 figures

    Journal ref: early access, Globecom 2022

  26. arXiv:2204.03293  [pdf, other]

    cs.SE cs.AI cs.LG

    CoCoSoDa: Effective Contrastive Learning for Code Search

    Authors: Ensheng Shi, Yanlin Wang, Wenchao Gu, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    Abstract: Code search aims to retrieve semantically relevant code snippets for a given natural language query. Recently, many approaches employing contrastive learning have shown promising results on code representation learning and greatly improved the performance of code search. However, there is still a lot of room for improvement in using contrastive learning for code search. In this paper, we propose C…

    Submitted 12 February, 2023; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted by ICSE 2023 (The 45th International Conference on Software Engineering)

  27. arXiv:2203.02700  [pdf, other]

    cs.SE cs.AI cs.LG

    RACE: Retrieval-Augmented Commit Message Generation

    Authors: Ensheng Shi, Yanlin Wang, Wei Tao, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    Abstract: Commit messages are important for software development and maintenance. Many neural network-based approaches have been proposed and shown promising results on automatic commit message generation. However, the generated commit messages could be repetitive or redundant. In this paper, we propose RACE, a new retrieval-augmented neural commit message generation method, which treats the retrieved simil…

    Submitted 22 October, 2022; v1 submitted 5 March, 2022; originally announced March 2022.

    Comments: Accepted by EMNLP 2022 (The 2022 Conference on Empirical Methods in Natural Language Processing)

  28. arXiv:2201.11302  [pdf, ps, other]

    cs.IT

    Wireless Energy Transfer in RIS-Aided Cell-Free Massive MIMO Systems: Opportunities and Challenges

    Authors: Enyu Shi, Jiayi Zhang, Shuaifei Chen, Jiakang Zheng, Yan Zhang, Derrick Wing Kwan Ng, Bo Ai

    Abstract: In future sixth-generation (6G) mobile networks, the Internet-of-Everything (IoE) is expected to provide extremely massive connectivity for small battery-powered devices. Indeed, massive devices with limited energy storage capacity impose persistent energy demand hindering the lifetime of communication networks. As a remedy, wireless energy transfer (WET) is a key technology to address these criti…

    Submitted 28 January, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: to appear in IEEE ComMag

  29. arXiv:2201.09622  [pdf, ps, other]

    cs.IT eess.SP

    Uplink Performance of High-Mobility Cell-Free Massive MIMO-OFDM Systems

    Authors: Jiakang Zheng, Jiayi Zhang, Enyu Shi, Jing Jiang, Bo Ai

    Abstract: High-speed train (HST) communications with orthogonal frequency division multiplexing (OFDM) techniques have received significant attention in recent years. Besides, cell-free (CF) massive multiple-input multiple-output (MIMO) is considered a promising technology to achieve the ultimate performance limit. In this paper, we focus on the performance of CF massive MIMO-OFDM systems with both matched…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: Accepted in IEEE ICC 2022

  30. arXiv:2201.03804  [pdf, other]

    cs.CL cs.AI

    CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition

    Authors: Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

    Abstract: With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component to facilitate driving and provide extra functionalities. In-car smart assistants should be able to process general as well as car-related commands and perform corresponding actions, which eases driving and improves safety. However, there is a data scarcity issue for low resource lan…

    Submitted 14 March, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: 6 pages

  31. arXiv:2201.02419  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

    Authors: Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

    Abstract: Automatic speech recognition (ASR) on low resource languages improves the access of linguistic minorities to technological advantages provided by artificial intelligence (AI). In this paper, we address the problem of data scarcity for the Hong Kong Cantonese language by creating a new Cantonese dataset. Our dataset, Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech…

    Submitted 17 January, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

  32. arXiv:2112.06223  [pdf, other]

    cs.CL

    ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

    Authors: Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

    Abstract: Code-switching is a speech phenomenon occurring when a speaker switches language during a conversation. Despite the spontaneous nature of code-switching in conversational spoken language, most existing works collect code-switching data from read speech instead of spontaneous speech. ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus buil…

    Submitted 3 May, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

    Journal ref: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)

  33. arXiv:2112.03449  [pdf, other]

    cs.CR

    Locally Differentially Private Sparse Vector Aggregation

    Authors: Mingxun Zhou, Tianhao Wang, T-H. Hubert Chan, Giulia Fanti, Elaine Shi

    Abstract: Vector mean estimation is a central primitive in federated analytics. In vector mean estimation, each user $i \in [n]$ holds a real-valued vector $v_i\in [-1, 1]^d$, and a server wants to estimate the mean of all $n$ vectors. Moreover, we would like to protect each individual user's privacy. In this paper, we consider the $k$-sparse version of the vector mean estimation problem, that is, suppos…

    Submitted 26 February, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

  34. arXiv:2111.03151  [pdf, ps, other]

    cs.GT econ.TH

    Foundations of Transaction Fee Mechanism Design

    Authors: Hao Chung, Elaine Shi

    Abstract: In blockchains such as Bitcoin and Ethereum, users compete in a transaction fee auction to get their transactions confirmed in the next block. A line of recent works set forth the desiderata for a "dream" transaction fee mechanism (TFM), and explored whether such a mechanism existed. A dream TFM should satisfy 1) user incentive compatibility (UIC), i.e., truthful bidding should be a user's dominan…

    Submitted 4 November, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Journal ref: SODA 2023

  35. arXiv:2110.03155  [pdf, other]

    cs.LG

    The Benefits of Being Categorical Distributional: Uncertainty-aware Regularized Exploration in Reinforcement Learning

    Authors: Ke Sun, Yingnan Zhao, Enze Shi, Yafei Wang, Xiaodong Yan, Bei Jiang, Linglong Kong

    Abstract: The theoretical advantages of distributional reinforcement learning (RL) over classical RL remain elusive despite its remarkable empirical performance. Starting from Categorical Distributional RL (CDRL), we attribute the potential superiority of distributional RL to a derived distribution-matching regularization by applying a return density function decomposition technique. This unexplored regular…

    Submitted 2 February, 2024; v1 submitted 6 October, 2021; originally announced October 2021.

  36. arXiv:2108.12987  [pdf, other]

    cs.SE

    CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract Syntax Trees

    Authors: Ensheng Shi, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun

    Abstract: Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract syntax trees (ASTs) is conducive to summary generation. However, existing approaches fail to fully capture the rich information in ASTs because of the large size/…

    Submitted 30 November, 2021; v1 submitted 30 August, 2021; originally announced August 2021.

    Comments: Accepted by EMNLP 2021

  37. arXiv:2108.04228  [pdf, other]

    cs.CV cs.LG

    Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition

    Authors: Didan Deng, Liang Wu, Bertram E. Shi

    Abstract: When recognizing emotions, subtle nuances in displays of emotion generate ambiguity or uncertainty in emotion perception. Emotion uncertainty has been previously interpreted as inter-rater disagreement among multiple annotators. In this paper, we consider a more common and challenging scenario: modeling emotion uncertainty when only single emotion labels are available. From a Bayesian perspective,…

    Submitted 17 October, 2021; v1 submitted 21 July, 2021; originally announced August 2021.

    Comments: Accepted as a workshop paper in the ICCV 2021 proceedings

  38. On the Evaluation of Neural Code Summarization

    Authors: Ensheng Shi, Yanlin Wang, Lun Du, Junjie Chen, Shi Han, Hongyu Zhang, Dongmei Zhang, Hongbin Sun

    Abstract: Source code summaries are important for program comprehension and maintenance. However, there are plenty of programs with missing, outdated, or mismatched summaries. Recently, deep learning techniques have been exploited to automatically generate summaries for given code snippets. To achieve a profound understanding of how far we are from solving this problem and provide suggestions to future rese…

    Submitted 11 February, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: Accepted by ICSE 2022 (The 44th International Conference on Software Engineering)

  39. arXiv:2107.05373  [pdf, other]

    cs.SE cs.AI

    On the Evaluation of Commit Message Generation Models: An Experimental Study

    Authors: Wei Tao, Yanlin Wang, Ensheng Shi, Lun Du, Shi Han, Hongyu Zhang, Dongmei Zhang, Wenqiang Zhang

    Abstract: Commit messages are natural language descriptions of code changes, which are important for program understanding and maintenance. However, writing commit messages manually is time-consuming and laborious, especially when the code is updated frequently. Various approaches utilizing generation or retrieval techniques have been proposed to automatically generate commit messages. To achieve a better u…

    Submitted 26 July, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: Accepted to International Conference on Software Maintenance and Evolution (ICSME) 2021

  40. arXiv:2107.04773  [pdf, other]

    cs.SE cs.LG

    Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search

    Authors: Lun Du, Xiaozhou Shi, Yanlin Wang, Ensheng Shi, Shi Han, Dongmei Zhang

    Abstract: Recently, deep learning methods have become mainstream in code search since they do better at capturing semantic correlations between code snippets and search queries and have promising performance. However, code snippets have diverse information from different dimensions, such as business logic, specific algorithm, and hardware communication, so it is hard for a single code representation module…

    Submitted 12 July, 2021; v1 submitted 10 July, 2021; originally announced July 2021.

    Comments: 5 pages

  41. arXiv:2107.01933  [pdf, other]

    cs.SE

    CoCoSum: Contextual Code Summarization with Multi-Relational Graph Neural Network

    Authors: Yanlin Wang, Ensheng Shi, Lun Du, Xiaodi Yang, Yuxuan Hu, Shi Han, Hongyu Zhang, Dongmei Zhang

    Abstract: Source code summaries are short natural language descriptions of code snippets that help developers better understand and maintain source code. There has been a surge of work on automatic code summarization to reduce the burden of writing summaries manually. However, most contemporary approaches mainly leverage the information within the boundary of the method being summarized (i.e., local context…

    Submitted 5 July, 2021; originally announced July 2021.

  42. arXiv:2106.00508  [pdf, other]

    cs.CR cs.DS cs.GT

    Differentially Private Densest Subgraph

    Authors: Alireza Farhadi, MohammadTaghi Hajiaghayi, Elaine Shi

    Abstract: Given a graph, the densest subgraph problem asks for a set of vertices such that the average degree among these vertices is maximized. Densest subgraph has numerous applications in learning, e.g., community detection in social networks, link spam detection, correlation mining, bioinformatics, and so on. Although there are efficient algorithms that output either exact or approximate solutions to th…

    Submitted 14 November, 2022; v1 submitted 1 June, 2021; originally announced June 2021.

  43. arXiv:2103.08007  [pdf, other]

    cs.CR

    Selfish Mining Attacks Exacerbated by Elastic Hash Supply

    Authors: Yoko Shibuya, Go Yamamoto, Fuhito Kojima, Elaine Shi, Shin'ichiro Matsuo, Aron Laszka

    Abstract: Several attacks have been proposed against Proof-of-Work blockchains, which may increase the attacker's share of mining rewards (e.g., selfish mining, block withholding). A further impact of such attacks, which has not been considered in prior work, is that decreasing the profitability of mining for honest nodes incentivizes them to stop mining or to leave the attacked chain for a more profitable…

    Submitted 14 March, 2021; originally announced March 2021.

  44. Learning Hierarchical Integration of Foveal and Peripheral Vision for Vergence Control by Active Efficient Coding

    Authors: Zhetuo Zhao, Jochen Triesch, Bertram E. Shi

    Abstract: The active efficient coding (AEC) framework parsimoniously explains the joint development of visual processing and eye movements, e.g., the emergence of binocular disparity selective neurons and fusional vergence, the disjunctive eye movements that align left and right eye images. Vergence can be driven by information in both the fovea and periphery, which play complementary roles. The high resolu…

    Submitted 29 January, 2021; originally announced March 2021.

  45. arXiv:2102.11489  [pdf, other]

    cs.DS cs.CC

    Optimal Sorting Circuits for Short Keys

    Authors: Wei-Kai Lin, Elaine Shi

    Abstract: A long-standing open question in the algorithms and complexity literature is whether there exist sorting circuits of size $o(n \log n)$. A recent work by Asharov, Lin, and Shi (SODA'21) showed that if the elements to be sorted have short keys whose length $k = o(\log n)$, then one can indeed overcome the $n\log n$ barrier for sorting circuits, by leveraging non-comparison-based techniques. More sp…

    Submitted 6 November, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

  46. arXiv:2101.11391  [pdf, ps, other]

    cs.CV cs.AI

    Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders

    Authors: Charles Wilmot, Bertram E. Shi, Jochen Triesch

    Abstract: We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements. The model follows the principle of Active Efficient Coding (AEC), a recent extension of the classic Efficient Coding Hypothesis to active perception. In contrast to previous AEC models, the present model uses deep autoencoders to…

    Submitted 27 January, 2021; originally announced January 2021.

  47. arXiv:2101.05682  [pdf, other]

    cs.CV cs.RO

    AVGCN: Trajectory Prediction using Graph Convolutional Networks Guided by Human Attention

    Authors: Congcong Liu, Yuying Chen, Ming Liu, Bertram E. Shi

    Abstract: Pedestrian trajectory prediction is a critical yet challenging task, especially for crowded scenes. We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size. In this work, we propose a novel method, AVGCN, for trajectory prediction utilizing graph convolutional networks (GCN) b…

    Submitted 14 January, 2021; originally announced January 2021.

    Comments: 7 pages, 4 figures

  48. arXiv:2010.09884  [pdf, ps, other]

    cs.DS cs.CC

    Sorting Short Keys in Circuits of Size o(n log n)

    Authors: Gilad Asharov, Wei-Kai Lin, Elaine Shi

    Abstract: We consider the classical problem of sorting an input array containing $n$ elements, where each element is described with a $k$-bit comparison-key and a $w$-bit payload. A long-standing open problem is whether there exist $(k + w) \cdot o(n \log n)$-sized boolean circuits for sorting. We show that one can overcome the $n\log n$ barrier when the keys to be sorted are short. Specifically, we prove t…

    Submitted 26 October, 2020; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: SODA'21

  49. arXiv:2009.07140  [pdf, other]

    cs.CV

    HGCN-GJS: Hierarchical Graph Convolutional Network with Groupwise Joint Sampling for Trajectory Prediction

    Authors: Yuying Chen, Congcong Liu, Xiaodong Mei, Bertram E. Shi, Ming Liu

    Abstract: Accurate pedestrian trajectory prediction is of great importance for downstream tasks such as autonomous driving and mobile robot navigation. Fully investigating the social interactions within the crowd is crucial for accurate pedestrian trajectory prediction. However, most existing methods do not capture group level interactions well, focusing only on pairwise interactions and neglecting group-wi…

    Submitted 15 September, 2023; v1 submitted 15 September, 2020; originally announced September 2020.

    Comments: 6 pages, 8 figures, accepted by IROS 2022

  50. arXiv:2008.01765  [pdf, other]

    cs.DS cs.CR

    Bucket Oblivious Sort: An Extremely Simple Oblivious Sort

    Authors: Gilad Asharov, T-H. Hubert Chan, Kartik Nayak, Rafael Pass, Ling Ren, Elaine Shi

    Abstract: We propose conceptually simple oblivious sort and oblivious random permutation algorithms called bucket oblivious sort and bucket oblivious random permutation. Bucket oblivious sort uses $6n\log n$ time (measured by the number of memory accesses) and $2Z$ client storage with an error probability exponentially small in $Z$. The above runtime is only $3\times$ slower than a non-oblivious merge sor…

    Submitted 29 April, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: Appears in SOSA@SODA 2020