Showing 1–28 of 28 results for author: Lin, F X

  1. arXiv:2311.18188  [pdf, other]

    eess.AS cs.LG

    Speech Understanding on Tiny Devices with A Learning Cache

    Authors: Afsara Benazir, Zhiming Xu, Felix Xiaozhu Lin

    Abstract: This paper addresses spoken language understanding (SLU) on microcontroller-like embedded devices, integrating on-device execution with cloud offloading in a novel fashion. We leverage temporal locality in the speech inputs to a device and reuse recent SLU inferences accordingly. Our idea is simple: let the device match incoming inputs against cached results, and only offload inputs not matched to…

    Submitted 8 May, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: accepted at MobiSys'24

  2. arXiv:2311.17065  [pdf, other]

    eess.AS cs.CL cs.LG

    Efficient Deep Speech Understanding at the Edge

    Authors: Rongxiang Wang, Felix Xiaozhu Lin

    Abstract: In contemporary speech understanding (SU), a sophisticated pipeline is employed, encompassing the ingestion of streaming voice input. The pipeline executes beam search iteratively, invoking a deep neural network to generate tentative outputs (referred to as hypotheses) in an autoregressive manner. Periodically, the pipeline assesses attention and Connectionist Temporal Classification (CTC) scores.…

    Submitted 4 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

  3. arXiv:2310.02373  [pdf, other]

    cs.LG cs.CR

    Secure and Effective Data Appraisal for Machine Learning

    Authors: Xu Ouyang, Changhong Yang, Felix Xiaozhu Lin, Yangfeng Ji

    Abstract: Essential for an unfettered data market is the ability to discreetly select and evaluate training data before finalizing a transaction between the data owner and model owner. To safeguard the privacy of both data and model, this process involves scrutinizing the target model through Multi-Party Computation (MPC). While prior research has posited that the MPC-based evaluation of Transformer models…

    Submitted 24 January, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

  4. Federated Few-Shot Learning for Mobile NLP

    Authors: Dongqi Cai, Shangguang Wang, Yaozong Wu, Felix Xiaozhu Lin, Mengwei Xu

    Abstract: Natural language processing (NLP) sees rich mobile applications. To support various language understanding tasks, a foundation NLP model is often fine-tuned in a federated, privacy-preserving setting (FL). This process currently relies on at least hundreds of thousands of labeled training samples from mobile clients; yet mobile users often lack willingness or knowledge to label their data. Such an…

    Submitted 19 August, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: MobiCom 2023

  5. Towards Practical Few-shot Federated NLP

    Authors: Dongqi Cai, Yaozong Wu, Haitao Yuan, Shangguang Wang, Felix Xiaozhu Lin, Mengwei Xu

    Abstract: Transformer-based pre-trained models have emerged as the predominant solution for natural language processing (NLP). Fine-tuning such pre-trained models for downstream tasks often requires a considerable amount of labeled private data. In practice, private data is often distributed across heterogeneous mobile devices and may be prohibited from being uploaded. Moreover, well-curated labeled data is…

    Submitted 19 August, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: EuroSys23 workshop

  6. arXiv:2207.14386  [pdf, other]

    cs.CL

    Efficient NLP Model Finetuning via Multistage Data Filtering

    Authors: Xu Ouyang, Shahina Mohd Azam Ansari, Felix Xiaozhu Lin, Yangfeng Ji

    Abstract: As model finetuning is central to the modern NLP, we set to maximize its efficiency. Motivated by redundancy in training examples and the sheer sizes of pretrained models, we exploit a key opportunity: training only on important data. To this end, we set to filter training examples in a streaming fashion, in tandem with training the target model. Our key techniques are two: (1) automatically deter…

    Submitted 18 May, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

  7. STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining

    Authors: Liwei Guo, Wonkyo Choe, Felix Xiaozhu Lin

    Abstract: Natural Language Processing (NLP) inference is seeing increasing adoption by mobile applications, where on-device inference is desirable for crucially preserving user data privacy and avoiding network roundtrips. Yet, the unprecedented size of an NLP model stresses both latency and memory, creating a tension between the two key resources of a mobile device. To meet a target latency, holding the wh…

    Submitted 31 January, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: ASPLOS'23

  8. arXiv:2205.10963  [pdf, other]

    cs.CR cs.OS

    Protecting File Activities via Deception for ARM TrustZone

    Authors: Liwei Guo, Kaiyang Zhao, Yiying Zhang, Felix Xiaozhu Lin

    Abstract: A TrustZone TEE often invokes an external filesystem. While filedata can be encrypted, the revealed file activities can leak secrets. To hide the file activities from the filesystem and its OS, we propose Enigma, a deception-based defense injecting sybil file activities as the cover of the actual file activities. Enigma contributes three new designs. (1) To make the deception credible, the TEE g…

    Submitted 24 May, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: Under submission

  9. arXiv:2205.10162  [pdf, other]

    cs.LG

    FedAdapter: Efficient Federated Learning for Modern NLP

    Authors: Dongqi Cai, Yaozong Wu, Shangguang Wang, Felix Xiaozhu Lin, Mengwei Xu

    Abstract: Transformer-based pre-trained models have revolutionized NLP for superior performance and generality. Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP). However, our measurements show that FedNLP is prohibitively slow due to the large model sizes and the resultant high network/computation cost. Towa…

    Submitted 8 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted by MobiCom 2023

  10. arXiv:2111.03065  [pdf, other]

    cs.DC cs.CR cs.OS

    Safe and Practical GPU Acceleration in TrustZone

    Authors: Heejin Park, Felix Xiaozhu Lin

    Abstract: We present a holistic design for GPU-accelerated computation in TrustZone TEE. Without pulling the complex GPU software stack into the TEE, we follow a simple approach: record the CPU/GPU interactions ahead of time, and replay the interactions in the TEE at run time. This paper addresses the approach's key missing piece -- the recording environment, which needs both strong security and access to d…

    Submitted 4 November, 2021; originally announced November 2021.

  11. Minimum Viable Device Drivers for ARM TrustZone

    Authors: Liwei Guo, Felix Xiaozhu Lin

    Abstract: While TrustZone can isolate IO hardware, it lacks drivers for modern IO devices. Rather than porting drivers, we propose a novel approach to deriving minimum viable drivers: developers exercise a full driver and record the driver/device interactions; the processed recordings, dubbed driverlets, are replayed in the TEE at run time to access IO devices. Driverlets address two key challenges: corre…

    Submitted 15 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Eurosys 2022

  12. arXiv:2105.05085  [pdf, other]

    cs.DC cs.CR

    GPUReplay: A 50-KB GPU Stack for Client ML

    Authors: Heejin Park, Felix Xiaozhu Lin

    Abstract: GPUReplay (GR) is a novel way for deploying GPU-accelerated computation on mobile and embedded devices. It addresses high complexity of a modern GPU stack for deployment ease and security. The idea is to record GPU executions on the full GPU stack ahead of time and replay the executions on new input at run time. We address key challenges towards making GR feasible, sound, and practical to use. The…

    Submitted 3 April, 2022; v1 submitted 4 May, 2021; originally announced May 2021.

    Comments: in Proc. ASPLOS, Mar. 2022

  13. arXiv:2101.08744  [pdf, other]

    cs.AR cs.OS

    Enabling Large Neural Networks on Tiny Microcontrollers with Swapping

    Authors: Hongyu Miao, Felix Xiaozhu Lin

    Abstract: Running neural networks (NNs) on microcontroller units (MCUs) is becoming increasingly important, but is very difficult due to the tiny SRAM size of MCU. Prior work proposes many algorithm-level techniques to reduce NN memory footprints, but all at the cost of sacrificing accuracy and generality, which disqualifies MCUs for many important use cases. We investigate a system solution for MCUs to exe…

    Submitted 1 September, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

  14. arXiv:2012.09329  [pdf, other]

    cs.DB cs.CV

    Clique: Spatiotemporal Object Re-identification at the City Scale

    Authors: Tiantu Xu, Kaiwen Shen, Yang Fu, Humphrey Shi, Felix Xiaozhu Lin

    Abstract: Object re-identification (ReID) is a key application of city-scale cameras. While classic ReID tasks are often considered as image retrieval, we treat them as spatiotemporal queries for locations and times in which the target object appeared. Spatiotemporal reID is challenged by the accuracy limitation in computer vision algorithms and the colossal videos from city cameras. We present Clique, a pr…

    Submitted 16 December, 2020; originally announced December 2020.

  15. arXiv:1912.11598  [pdf, other]

    cs.CR cs.CY cs.NI

    Grand Challenges in Resilience: Autonomous System Resilience through Design and Runtime Measures

    Authors: Saurabh Bagchi, Vaneet Aggarwal, Somali Chaterji, Fred Douglis, Aly El Gamal, Jiawei Han, Brian J. Henz, Hank Hoffmann, Suman Jana, Milind Kulkarni, Felix Xiaozhu Lin, Karen Marais, Prateek Mittal, Shaoshuai Mou, Xiaokang Qiu, Gesualdo Scutari

    Abstract: A set of about 80 researchers, practitioners, and federal agency program managers participated in the NSF-sponsored Grand Challenges in Resilience Workshop held on Purdue campus on March 19-21, 2019. The workshop was divided into three themes: resilience in cyber, cyber-physical, and socio-technical systems. About 30 attendees in all participated in the discussions of cyber resilience. This articl…

    Submitted 9 May, 2020; v1 submitted 25 December, 2019; originally announced December 2019.

    ACM Class: C.4; D.4.5

    Journal ref: IEEE Open Journal of the Computer Society, 2020

  16. Approximate Query Service on Autonomous IoT Cameras

    Authors: Mengwei Xu, Xiwen Zhang, Yunxin Liu, Gang Huang, Xuanzhe Liu, Felix Xiaozhu Lin

    Abstract: Elf is a runtime for an energy-constrained camera to continuously summarize video scenes as approximate object counts. Elf's novelty centers on planning the camera's count actions under energy constraint. (1) Elf explores the rich action space spanned by the number of sample image frames and the choice of per-frame object counters; it unifies errors from both sources into one single bounded error.…

    Submitted 5 May, 2020; v1 submitted 2 September, 2019; originally announced September 2019.

  17. arXiv:1904.12342  [pdf, other]

    cs.DB cs.CV

    Video Analytics with Zero-streaming Cameras

    Authors: Mengwei Xu, Tiantu Xu, Yunxin Liu, Felix Xiaozhu Lin

    Abstract: Low-cost cameras enable powerful analytics. An unexploited opportunity is that most captured videos remain "cold" without being queried. For efficiency, we advocate for these cameras to be zero streaming: capturing videos to local storage and communicating with the cloud only when analytics is requested. How to query zero-streaming cameras efficiently? Our response is a camera/cloud runtime system…

    Submitted 17 June, 2021; v1 submitted 28 April, 2019; originally announced April 2019.

    Comments: Mengwei Xu and Tiantu Xu contributed equally to the paper

  18. arXiv:1902.06327  [pdf, other]

    cs.CR

    Let the Cloud Watch Over Your IoT File Systems

    Authors: Liwei Guo, Yiying Zhang, Felix Xiaozhu Lin

    Abstract: Smart devices produce security-sensitive data and keep them in on-device storage for persistence. The current storage stack on smart devices, however, offers weak security guarantees: not only because the stack depends on a vulnerable commodity OS, but also because smart device deployment is known weak on security measures. To safeguard such data on smart devices, we present a novel storage stac…

    Submitted 17 February, 2019; originally announced February 2019.

  19. StreamBox-HBM: Stream Analytics on High Bandwidth Hybrid Memory

    Authors: Hongyu Miao, Myeongjae Jeon, Gennady Pekhimenko, Kathryn S. McKinley, Felix Xiaozhu Lin

    Abstract: Stream analytics have an insatiable demand for memory and performance. Emerging hybrid memories combine commodity DDR4 DRAM with 3D-stacked High Bandwidth Memory (HBM) DRAM to meet such demands. However, achieving this promise is challenging because (1) HBM is capacity-limited and (2) HBM boosts performance best for sequential access and high parallelism workloads. At first glance, stream analytic…

    Submitted 28 January, 2019; v1 submitted 4 January, 2019; originally announced January 2019.

  20. arXiv:1812.05448  [pdf, other]

    cs.LG cs.CY

    A First Look at Deep Learning Apps on Smartphones

    Authors: Mengwei Xu, Jiawei Liu, Yuanqiang Liu, Felix Xiaozhu Lin, Yunxin Liu, Xuanzhe Liu

    Abstract: We are in the dawn of deep learning explosion for smartphones. To bridge the gap between research and practice, we present the first empirical study on the 16,500 most popular Android apps, demystifying how smartphone apps exploit deep learning in the wild. To this end, we build a new static tool that dissects apps and analyzes their deep learning functions. Our study answers threefold questions:…

    Submitted 12 January, 2021; v1 submitted 8 November, 2018; originally announced December 2018.

  21. arXiv:1811.05000  [pdf, other]

    cs.OS

    Transkernel: Bridging Monolithic Kernels to Peripheral Cores

    Authors: Liwei Guo, Shuang Zhai, Yi Qiao, Felix Xiaozhu Lin

    Abstract: Smart devices see a large number of ephemeral tasks driven by background activities. In order to execute such a task, the OS kernel wakes up the platform beforehand and puts it back to sleep afterwards. In doing so, the kernel operates various IO devices and orchestrates their power state transitions. Such kernel executions are inefficient as they mismatch typical CPU hardware. They are better off…

    Submitted 5 June, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: The camera-ready version of this paper, will appear at USENIX ATC'19

  22. VStore: A Data Store for Analytics on Large Videos

    Authors: Tiantu Xu, Luis Materon Botelho, Felix Xiaozhu Lin

    Abstract: We present VStore, a data store for supporting fast, resource-efficient analytics over large archival videos. VStore manages video ingestion, storage, retrieval, and consumption. It controls video formats along the video data path. It is challenged by i) the huge combinatorial space of video format knobs; ii) the complex impacts of these knobs and their high profiling cost; iii) optimizing for mul…

    Submitted 17 February, 2019; v1 submitted 3 October, 2018; originally announced October 2018.

  23. arXiv:1808.05078  [pdf, other]

    cs.CR cs.DC cs.OS

    StreamBox-TZ: Secure Stream Analytics at the Edge with TrustZone

    Authors: Heejin Park, Shuang Zhai, Long Lu, Felix Xiaozhu Lin

    Abstract: While it is compelling to process large streams of IoT data on the cloud edge, doing so exposes the data to a sophisticated, vulnerable software stack on the edge and hence security threats. To this end, we advocate isolating the data and its computations in a trusted execution environment (TEE) on the edge, shielding them from the remaining edge software stack which we deem untrusted. This approa…

    Submitted 5 June, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

  24. DeepCache: Principled Cache for Mobile Deep Vision

    Authors: Mengwei Xu, Mengze Zhu, Yunxin Liu, Felix Xiaozhu Lin, Xuanzhe Liu

    Abstract: We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision. DeepCache benefits model execution efficiency by exploiting temporal locality in input video streams. It addresses a key challenge raised by mobile vision: the cache must operate under video scene variation, while trading off among cacheability, overhead, and loss in model accuracy. At the inpu…

    Submitted 30 March, 2020; v1 submitted 1 December, 2017; originally announced December 2017.

    Comments: Accepted for publication in the MobiCom 2018, copyright the ACM, posted with permission

  25. arXiv:1404.1320  [pdf, other]

    cs.HC

    Draining our Glass: An Energy and Heat Characterization of Google Glass

    Authors: Robert LiKamWa, Zhen Wang, Aaron Carroll, Felix Xiaozhu Lin, Lin Zhong

    Abstract: The Google Glass is a mobile device designed to be worn as eyeglasses. This form factor enables new usage possibilities, such as hands-free video chats and instant web search. However, its shape also hampers its potential: (1) battery size, and therefore lifetime, is limited by a need for the device to be lightweight, and (2) high-power processing leads to significant heat, which should be limited…

    Submitted 26 March, 2014; originally announced April 2014.

    Report number: Rice University ECE Technical Report 2014-03-23

  26. arXiv:1212.5170  [pdf, other]

    cs.OH

    Guadalupe: a browser design for heterogeneous hardware

    Authors: Zhen Wang, Felix Xiaozhu Lin, Lin Zhong, Mansoor Chishtie

    Abstract: Mobile systems are embracing heterogeneous architectures by getting more types of cores and more specialized cores, which allows applications to be faster and more efficient. We aim at exploiting the hardware heterogeneity from the browser without requiring any changes to either the OS or the web applications. Our design, Guadalupe, can use hardware processing units with different degrees of capab…

    Submitted 19 December, 2012; originally announced December 2012.

    Report number: Rice University ECE Technical Report 2012-12-19

  27. arXiv:1112.3691  [pdf]

    cs.NI

    How Far Can Client-Only Solutions Go for Mobile Browser Speed?

    Authors: Zhen Wang, Felix Xiaozhu Lin, Lin Zhong, Mansoor Chishtie

    Abstract: Mobile browser is known to be slow because of the bottleneck in resource loading. Client-only solutions to improve resource loading are attractive because they are immediately deployable, scalable, and secure. We present the first publicly known treatment of client-only solutions to understand how much they can improve mobile browser speed without infrastructure support. Leveraging an unprecedente…

    Submitted 15 December, 2011; originally announced December 2011.

    Report number: TR1215-2011, Rice University and Texas Instruments

  28. arXiv:1103.2348  [pdf]

    cs.OS cs.PL

    Transparent Programming of Heterogeneous Smartphones for Sensing

    Authors: Felix Xiaozhu Lin, Zhen Wang, Robert LiKamWa, Lin Zhong

    Abstract: Sensing on smartphones is known to be power-hungry. It has been shown that this problem can be solved by adding an ultra low-power processor to execute simple, frequent sensor data processing. While very effective in saving energy, this resulting heterogeneous, distributed architecture poses a significant challenge to application development. We present Reflex, a suite of runtime and compilation…

    Submitted 11 March, 2011; originally announced March 2011.