Accelerate Cooperative Deep Inference via Layer-wise Processing Schedule Optimization

Ning Wang, Yubin Duan, Jie Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Computation offloading has been proposed to remove one obstacle to highly accurate, real-time deep inference on resource-constrained Internet of Things (IoT) devices. Cooperative deep inference was recently proposed to further trade off the communication latency introduced by computation offloading: it partitions a Deep Neural Network (DNN) model into two parts and uses the IoT end device and the server to process the DNN model cooperatively. We observe one important fact that all previous works ignore: DNN computation and communication can be conducted simultaneously in cooperative deep inference. As a result, the DNN layer-wise processing schedule affects inference latency, and finding the optimal schedule is non-trivial for State-Of-The-Art (SOTA) DNNs with Directed Acyclic Graph (DAG) computational architectures. The contributions of this paper are as follows. (1) We formulate Deep Inference Optimization with Layer-wise Schedule (Deep-Inference-L), a unique pipeline-based DAG scheduling problem, which turns out to be NP-hard. (2) We categorize SOTA DNNs into three categories and discuss the corresponding optimal processing schedule in special cases and efficient heuristic schedules in the general case. (3) The proposed solutions are extensively tested via a proof-of-concept prototype. (4) Results indicate that our algorithms can achieve an 8x speedup compared with local inference in the best case.
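
The observation that computation and communication can overlap can be illustrated with a toy two-stage pipeline. The sketch below is not the authors' algorithm: the layer names, the timing values, and the functions `pipeline_latency` and `best_order` are hypothetical, and the model ignores both DAG precedence constraints and server-side computation. It only shows that, once a layer's output can be transmitted while the next layer is being computed, the order in which layers are processed changes the end-to-end latency.

```python
from itertools import permutations


def pipeline_latency(order, compute, transmit):
    """Finish time when the device computes layers in `order` and sends
    each layer's output over a single link as soon as possible."""
    compute_done = 0.0   # time at which the device finishes the current layer
    transmit_done = 0.0  # time at which the link finishes the current transfer
    for layer in order:
        compute_done += compute[layer]
        # A transfer starts once the layer is computed AND the link is free.
        transmit_done = max(compute_done, transmit_done) + transmit[layer]
    return transmit_done


def best_order(compute, transmit):
    """Exhaustive search over all orders; only feasible for tiny examples."""
    return min(
        permutations(compute),
        key=lambda order: pipeline_latency(order, compute, transmit),
    )


# Hypothetical per-layer times (arbitrary units), assumed independent layers.
compute = {"a": 4.0, "b": 1.0, "c": 3.0}
transmit = {"a": 1.0, "b": 4.0, "c": 2.0}

naive = pipeline_latency(("a", "b", "c"), compute, transmit)
order = best_order(compute, transmit)
print(naive, order, pipeline_latency(order, compute, transmit))
```

With these made-up numbers, processing the layers as a, b, c takes 11 time units, while the order b, c, a takes 9, because the short computation with the long transfer is started first and the link stays busy. Exhaustive search is used here only for clarity; the abstract states that the general DAG version of the problem is NP-hard, which is why the paper turns to special-case optima and heuristics.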

Original language: English (US)
Title of host publication: 30th International Conference on Computer Communications and Networks, ICCCN 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9780738113302
State: Published - Jul 2021
Event: 30th International Conference on Computer Communications and Networks, ICCCN 2021 - Virtual, Athens, Greece
Duration: Jul 19 2021 to Jul 22 2021

Publication series

Name: Proceedings - International Conference on Computer Communications and Networks, ICCCN
Volume: 2021-July
ISSN (Print): 1095-2055

Conference

Conference: 30th International Conference on Computer Communications and Networks, ICCCN 2021
Country/Territory: Greece
City: Virtual, Athens
Period: 7/19/21 to 7/22/21

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
