Predicting software vulnerabilities with unsupervised learning techniques

Author: Man, K.W. (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Cyber Security)
Contributor: Verwer, S.E. (mentor); Panichella, A. (mentor); Lagendijk, R.L. (graduation committee)
Degree granting institution: Delft University of Technology
Date: 2020-08-20

Abstract: As more software is produced every year, more software is also exploited. This exploitation can lead to huge monetary losses and other damage to companies and users. It can be reduced by automatically detecting the software vulnerabilities that lead to exploitation. Unfortunately, the state-of-the-art methods for this automated detection are not perfect, so more research is needed. This research was partly done at ING, a bank in the Netherlands, in order to find a software vulnerability prediction method that is more efficient than its already deployed static code analysis tool, Fortify Static Code Analyzer. This report proposes a method to predict software vulnerabilities in code using unsupervised learning methods. The data set consists of software metrics of code written by ING developers, together with a label, confirmed by a security expert, stating whether the code was vulnerable or non-vulnerable. Principal component analysis was used to reduce the dimensions of the data set. Next, the unsupervised learning technique k-means was used to build a prediction model, and a distance-based anomaly detection technique was applied to find the software vulnerabilities. This produced poor results. In a final attempt to find better results, k-nearest neighbor was used to build a new prediction model and another distance-based anomaly detection technique was applied. The outcome of this latter method was surprisingly good.
Subject: k-means; unsupervised learning; software fault prediction; software vulnerability detection; k-nearest neighbors; Fortify; anomaly detection; clustering
To reference this document use: http://resolver.tudelft.nl/uuid:80c1b078-b8ca-4c29-b0ba-866fdc5f656b
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 K.W. Man
Files: kwman_MSc_thesis.pdf (PDF, 7.16 MB)
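
The abstract mentions a distance-based anomaly detection step built on k-nearest neighbor but does not say how the distances are turned into a score. The following is a minimal sketch of one common variant, not the thesis's actual method: it assumes each sample is scored by its mean Euclidean distance to its k nearest neighbors and flagged when that score exceeds a fixed threshold. The function names, the sample values, `k`, and `threshold` are all hypothetical and chosen only for illustration.

```python
import math


def knn_anomaly_scores(points, k=3):
    """Score each point by its mean Euclidean distance to its k nearest neighbours.

    Assumption: this scoring rule is illustrative; the source does not specify one.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point (the point itself is excluded).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_anomalies(scores, threshold):
    """Return the indices whose score is strictly above the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical two-dimensional feature vectors, standing in for software
# metrics after dimensionality reduction. The values are invented.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_anomaly_scores(samples, k=2)
outliers = flag_anomalies(scores, threshold=1.0)
```

In this toy data the four clustered points have small scores, while the isolated point lies far from all of its neighbors and is the only index flagged. With real data the threshold would have to be chosen from the score distribution rather than fixed by hand.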