SwitchFlow: Preemptive Multitasking for Deep Learning
Date
2021-12-10
Author
Wu, Xiaofeng
Rao, Jia
Chen, Wei
Huang, Heng
Ding, Chris
Huang, Hang
Abstract
Accelerators, such as GPUs, are a scarce resource in deep learning (DL). Sharing a GPU effectively and efficiently improves hardware utilization as well as the experience of users, who may otherwise wait hours for a long training job to finish before gaining access to the GPU. Spatial and temporal multitasking on GPUs has been studied in the literature, but popular deep learning frameworks, such as TensorFlow and PyTorch, lack support for GPU sharing among multiple DL models, which are typically represented as computation graphs, heavily optimized by the underlying DL libraries, and run on a complex pipeline spanning the CPU and GPU. Our study shows that GPU kernels spawned from computation graphs can barely execute simultaneously on a single GPU, and that time slicing may lead to low GPU utilization.
This paper presents SwitchFlow, a scheduling framework for DL multitasking. It centers on two designs. First, instead of scheduling a computation graph as a whole, SwitchFlow schedules its subgraphs and prevents subgraphs from different models from running simultaneously on a GPU. This reduces interference and eliminates out-of-memory errors. Moreover, subgraphs running on different devices can overlap with each other, leading to a more efficient execution pipeline.
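As a concrete illustration of this first design, consider the minimal Python sketch below. It is not SwitchFlow's actual implementation; the Subgraph class, DEVICE_LOCKS table, and run_subgraph helper are invented here to show how per-device mutual exclusion serializes subgraphs from different models on a GPU while letting subgraphs on different devices overlap.

import threading

class Subgraph:
    # A schedulable unit: one slice of a model's computation graph.
    def __init__(self, model_id, device, run_fn):
        self.model_id = model_id  # owning model
        self.device = device      # placement, e.g. "gpu:0" or "cpu:0"
        self.run_fn = run_fn      # callable that executes the subgraph

# One lock per device: subgraphs from different models are serialized
# on the same GPU, avoiding interference and out-of-memory errors.
DEVICE_LOCKS = {"gpu:0": threading.Lock(), "cpu:0": threading.Lock()}

def run_subgraph(sg):
    with DEVICE_LOCKS[sg.device]:
        sg.run_fn()

# Subgraphs placed on different devices still overlap: model A's CPU
# preprocessing proceeds while model B's subgraph occupies the GPU.
a_pre = Subgraph("A", "cpu:0", lambda: print("A: preprocess on CPU"))
b_fwd = Subgraph("B", "gpu:0", lambda: print("B: forward pass on GPU"))
threads = [threading.Thread(target=run_subgraph, args=(sg,))
           for sg in (a_pre, b_fwd)]
for t in threads:
    t.start()
for t in threads:
    t.join()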
Second, SwitchFlow maintains multiple versions of each subgraph. This allows subgraphs to be migrated across devices at low cost, thereby enabling low-latency preemption.
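The second design can be sketched in the same hypothetical style. The MultiVersionSubgraph class and its build_fn parameter below are assumptions standing in for the framework step that compiles a subgraph for a given device; the point is that keeping a pre-built version per device turns migration into a pointer switch rather than a recompilation.

class MultiVersionSubgraph:
    def __init__(self, build_fn, devices=("gpu:0", "cpu:0")):
        # One pre-built executable per device, so migration never pays
        # a compilation cost on the critical path.
        self.versions = {d: build_fn(d) for d in devices}
        self.active = "gpu:0"

    def run(self, inputs):
        return self.versions[self.active](inputs)

    def preempt_to(self, device):
        # Migration reduces to switching the active version, which is
        # what enables low-latency preemption.
        self.active = device

# Example: a training subgraph yields the GPU to an inference request.
train_step = MultiVersionSubgraph(lambda dev: (lambda x: f"ran on {dev}: {x}"))
print(train_step.run("batch-0"))   # ran on gpu:0: batch-0
train_step.preempt_to("cpu:0")     # a high-priority job takes the GPU
print(train_step.run("batch-1"))   # ran on cpu:0: batch-1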
Results on representative DL models show that SwitchFlow achieves
up to an order of magnitude lower tail latency for inference requests
collocated with a training job.