Title: Authorship Identification and Verification of JavaScript Source Code: An Evaluation of Techniques
Author: Wilco, W.C.
Contributor: Zaidman, A.E. (mentor)
Faculty: Electrical Engineering, Mathematics and Computer Science
Department: Software Technology
Programme: Software Engineering
Date: 2014-12-18

Abstract: The growing number of criminals who exploit the speed and anonymity of the Web is of increasing concern. Little effort has been spent on tracing the authors of malicious code. To that end, we investigated authorship identification and verification of JavaScript source code. We evaluated three character-based approaches and propose a new domain-specific approach. The novelty of the domain-specific approach is that it represents code as a parse tree from which structural features are extracted. An evaluation of the techniques on open source code from GitHub showed that the approaches using character n-gram features achieved the best performance. However, n-gram and domain-specific features turned out to be complementary: combining them resulted in higher performance. Techniques that used similarity-based classification were especially successful when only a limited amount of training data was available, while feature-vector-based techniques were mainly successful when a large amount of training data was available and in an authorship verification context. By means of code minification, we evaluated how classification accuracy is affected by removing authorship information from the source code. Code minification was shown to significantly deteriorate the performance of the authorship analysis methods. The compression-based technique in particular is robust against code minification.
Subject: authorship analysis, authorship identification, authorship verification, source code, n-gram, JavaScript, minification
To reference this document use: http://resolver.tudelft.nl/uuid:f6aa2f88-e657-4fef-b684-188a212c71ad
Part of collection: Student theses
Document type: master thesis
Rights: (c) 2014 Wilco, W.C.
Files: PDF, MSc_Thesis_Wilco_Wisse.pdf, 4.37 MB