Electrical Engineering and Systems Science > Systems and Control

arXiv:2405.19213 (eess)

[Submitted on 29 May 2024]

Title: HawkVision: Low-Latency Modeless Edge AI Serving

Authors: ChonLam Lao, Jiaqi Gao, Ganesh Ananthanarayanan, Aditya Akella, Minlan Yu

Abstract: Modeless ML inference is growing in popularity because it hides the complexity of model inference from users and caters to diverse user and application accuracy requirements. Previous work mostly focuses on modeless inference in data centers. To provide low-latency inference, we promote modeless inference at the edge. The edge environment introduces additional challenges related to low power consumption, limited device memory, and volatile network environments.
To address these challenges, we propose HawkVision, which provides low-latency modeless serving of vision DNNs. HawkVision leverages a two-layer edge-DC architecture that employs confidence scaling to reduce the number of model options while meeting diverse accuracy requirements. It also supports lossy inference under volatile network environments. Our experimental results show that HawkVision outperforms current serving systems by up to 1.6X in P99 latency for providing modeless service. Our FPGA prototype demonstrates similar performance at certain accuracy levels with up to a 3.34X reduction in power consumption.

Subjects:	Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)
Cite as:	arXiv:2405.19213 [eess.SY]
	(or arXiv:2405.19213v1 [eess.SY] for this version)
	https://doi.org/10.48550/arXiv.2405.19213

Submission history

From: ChonLam Lao
[v1] Wed, 29 May 2024 15:56:33 UTC (2,966 KB)
