A Survey on Accelerating Sparse CNN Inference on GPUs

Title: A Survey on Accelerating Sparse CNN Inference on GPUs
Author: Chen, Qilin (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Mohamed, Hasan (mentor); Liu, Shih-Chii (mentor); Tömen, N. (mentor); Zuniga, Marco (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2022-06-24

Abstract: Convolutional neural networks (CNNs) are often pruned to speed up training and inference while also reducing memory usage. Nevertheless, most modern GPUs cannot exploit this sparsity automatically during computation, especially in networks with unstructured sparsity. Many libraries that exploit sparsity have therefore been proposed for accelerating CNN inference on GPUs, yet there is little research systematically comparing them. In this paper, several state-of-the-art libraries for accelerating sparse CNN inference on GPUs are reviewed and benchmarked. Most of these libraries speed up the convolution and/or pooling operations by skipping calculations involving zeros, and are therefore able to perform sparse matrix calculations faster. However, many of them have hardware and software restrictions and are hard to integrate into a new model for end-to-end inference.

Subjects: Convolutional Neural Networks (CNNs); Sparsity; Accelerators; Inference
To reference this document use: http://resolver.tudelft.nl/uuid:615a9965-3685-439e-8599-9c913b9902da
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2022 Qilin Chen
File: A_Survey_on_Accelerating_ ... n_GPUs.pdf (PDF, 2.33 MB)
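The abstract states that most of the surveyed libraries gain speed by skipping calculations involving zeros in pruned weight tensors. As a generic illustration of that idea (a minimal sketch, not code from the thesis or from any of the surveyed libraries), the snippet below stores only the nonzero weights of a pruned layer in a CSR-like format and performs a matrix-vector product over those nonzeros, so multiplications with zero weights never execute:

```python
# Illustrative sketch: skipping zero calculations for a pruned layer
# lowered to a matrix-vector product y = W @ x. Only the nonzero
# entries of W are stored and multiplied.

def to_csr(W):
    """Convert a dense matrix (list of lists) to CSR (row_ptr, cols, vals)."""
    row_ptr, cols, vals = [0], [], []
    for row in W:
        for j, w in enumerate(row):
            if w != 0.0:              # keep nonzero weights only
                cols.append(j)
                vals.append(w)
        row_ptr.append(len(vals))     # end of this row's nonzeros
    return row_ptr, cols, vals

def spmv(row_ptr, cols, vals, x):
    """Sparse matrix-vector product that visits only nonzero weights."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[cols[k]]
        y.append(acc)
    return y

# A 75%-sparse weight matrix, e.g. after unstructured magnitude pruning.
W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.5],
     [3.0, 0.0, 0.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]
print(spmv(*to_csr(W), x))  # 3 multiplications instead of 12
```

A dense kernel would perform all 12 multiplications regardless of the zeros; the sparse version performs only 3. Real GPU libraries apply the same principle with formats and kernels tuned for parallel hardware, which is why unstructured sparsity, whose nonzeros fall in irregular positions, is harder to exploit efficiently than structured sparsity.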