Title: Compressing code generation language models on CPUs: Using Group Lasso pruning and post-training quantization
Author: Sochirca, Dan (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Al-Kaswan, A. (mentor); Izadi, M. (mentor); van Deursen, A. (mentor); Anand, A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-06-28
Abstract: Code generation models have recently become more popular because they help developers write code more productively. While these large models deliver impressive performance, they require significant computational resources and memory, making them difficult to deploy and expensive to train. Additionally, their large carbon footprint raises environmental concerns. To address these challenges, there is a need for techniques that compress these models while maintaining their performance.
In this work, we study the effectiveness of Group Lasso pruning and post-training quantization on CPUs, applied to the code generation model CodeGPT. We evaluate the compressed model using the Exact Match (EM) and Edit Similarity (ES) metrics, and we study its size on disk, memory footprint, and CPU inference speed. Compared with the original CodeGPT model, our solution offers a 48% relative reduction in disk size with only a mild drop in the accuracy metrics: an absolute drop of 8.51% in ES and 5.5% in EM. Using the ONNX runtime on a regular laptop, we are able to deliver a 2x inference speedup at a 32.6% reduction in size. Our code is publicly available at https://github.com/AISE-TUDelft/LLM4CodeCompression/tree/main/CodeGPT-on-Intel.
Subject: Code generation; Transformers; Compression; CodeGPT; Group Lasso Pruning; Post-Training Quantization
To reference this document use: http://resolver.tudelft.nl/uuid:47817baa-9c64-4cca-b206-09544ac5a75b
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Dan Sochirca
Files: PDF Final_paper_Dan.pdf (1.66 MB)
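The abstract reports accuracy in terms of Exact Match (EM) and Edit Similarity (ES). As a rough illustration of what such scores measure, the sketch below computes both for a single predicted line of code. It assumes that ES is a character-level Levenshtein similarity normalized by the longer string and scaled to 0-100; the thesis may use a different normalization or an existing library. The function names `exact_match`, `levenshtein`, and `edit_similarity` are illustrative and are not taken from the thesis code.

```python
def exact_match(prediction: str, reference: str) -> bool:
    """True when both strings are identical after trimming surrounding whitespace."""
    return prediction.strip() == reference.strip()


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute), one DP row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def edit_similarity(prediction: str, reference: str) -> float:
    """Assumed ES definition: 100 * (1 - distance / length of the longer string)."""
    pred, ref = prediction.strip(), reference.strip()
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(pred, ref) / longest)


if __name__ == "__main__":
    # Hypothetical model output and ground-truth line, for illustration only.
    predicted = "return x + 1"
    expected = "return x + 2"
    print(exact_match(predicted, expected))
    print(edit_similarity(predicted, expected))
```

Under this definition, a prediction that differs from the reference by a single character fails EM entirely but still receives a high ES, which is why the two metrics are usually reported together.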