Title: How well does GPT-3.5 perform on course assignments from the TU Delft Computer science and engineering Bachelor?: Finding themes in course assignments GPT-3.5 performs well on and does not perform well on
Author: Segers, Mike (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Aivaloglou, E.A. (mentor); Zhang, Xiaoling (mentor); Viering, T.J. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-06-28
Abstract: Since large language models (LLMs) emerged, they have taken a prominent role in today’s society and have also found their way into education. In this research paper, we therefore looked into assignments and exams from the TU Delft Computer Science and Engineering bachelor and assessed which problems Generative Pre-trained Transformer (GPT) version 3.5, the current version used by ChatGPT, performs well on (i.e. at or above the pass rate) and which problems it performs poorly on (i.e. below the pass rate). We collected assignments by asking professors for consent, to make sure our research was ethically sound. Professors who consented could either send material directly, which allowed a deeper analysis, or permit scraping of their course page on Brightspace (the site where TU Delft courses are hosted). Once all the questions were gathered, we processed them by prompting them into ChatGPT, then gathered the results and categorized each answer as right or wrong. We did all of this with as few modifications to the questions as possible: the only changes were corrections of errors introduced when copying from a PDF, for example a C becoming an e.
From the results, we found that ChatGPT has its limitations, particularly in understanding large pieces of code and in complex mathematical reasoning. However, the model performed well at defining concepts and connecting different ideas. We suggest that GPT lacks a comprehensive understanding of coding principles, which hinders its ability to comprehend code. Future work could explore other LLMs, such as GPT-4, and compare their performance. It could also look at assignments from other universities, possibly in different educational fields, and investigate different prompting techniques to enhance the model’s accuracy and reliability.
To reference this document use: http://resolver.tudelft.nl/uuid:4f33dfab-289d-435c-a47e-c2d069ee0578
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Mike Segers
Files: CSE3000_Final_Paper_Mike.pdf (PDF, 213.82 KB)