Title: The Future of Fraud Detection: Detecting Fraudulent Insurance Claims Using Machine Learning Methods
Author: Plaisant van der Wal, Renzo (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Al-Ars, Zaid (mentor); Verwer, Sicco (graduation committee); de Voogd, G.W.H. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Engineering
Date: 2018-08-17

Abstract
Machine learning methods are explored in an attempt to achieve better predictive performance than the legacy rule-based systems currently used to detect fraudulent car insurance claims. Two key principles guide the exploration of machine learning techniques and algorithms in this thesis: applicability to imbalanced data and interpretability of predictions. The dataset used for model training and evaluation contains only 0.3% fraudulent claims against 99.7% non-fraudulent claims and is therefore highly imbalanced. Prediction interpretability is of great importance because fraud experts work directly with the output of the machine learning models. With these principles in mind, this thesis considers four algorithms: Logistic Regression, Random Forest, LightGBM and a Stacking classifier. The algorithms are trained on the imbalanced learning problem using a combination of undersampling (random and Edited Nearest Neighbors), oversampling (SMOTE) and class weighting. In conclusion, each trained model meets the objective, with the Stacking classifier combining the best performance with the lowest variance. Benchmarking the baseline on two different parameters allows the models to be evaluated at two boundary conditions, with performance tunable between them.
Ultimately, the performance of the Stacking classifier is tunable (by moving its classification threshold) to roughly a 70-80% increase in fraud caught or a 75% reduction in effort. Catching extra fraud increases the number of genuinely fraudulent claims that fraud experts get to see, while reducing effort frees up capacity, allowing fraud experts to spend more time on other, more relevant tasks.

Subject: Insurance, Machine Learning, fraud detection, fraud, imbalanced
To reference this document use: http://resolver.tudelft.nl/uuid:935a0d46-2e26-4af5-b308-32b5fe54926b
Embargo date: 2019-08-17
Part of collection: Student theses
Document type: master thesis
Rights: © 2018 Renzo Plaisant van der Wal
Files: PDF 20180813_final_thesis_rv_ ... wal_v1.pdf 2.03 MB
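
The threshold tuning described in the abstract can be illustrated with a minimal sketch. It assumes that the two boundary conditions correspond to (a) flagging no more claims than the baseline while catching more fraud, and (b) catching at least as much fraud as the baseline while flagging fewer claims. That reading is an interpretation of the abstract, not a description of the thesis's actual procedure. All function names, the toy scores and labels, and the baseline figures below are hypothetical and are not taken from the thesis.

```python
from typing import List, Optional, Tuple

# Hypothetical model scores (higher = more suspicious) and true labels (1 = fraud).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical baseline (e.g. a rule-based system): reviews 5 claims, finds 2 frauds.
BASELINE_FLAGGED = 5
BASELINE_CAUGHT = 2


def flagged_and_caught(scores: List[float], labels: List[int],
                       threshold: float) -> Tuple[int, int]:
    """Return (claims flagged for review, fraudulent claims among them)."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return len(flagged), sum(flagged)


def threshold_for_equal_effort(scores: List[float], labels: List[int],
                               baseline_flagged: int) -> Optional[float]:
    """Lowest threshold that flags no more claims than the baseline."""
    for t in sorted(set(scores)):
        n_flagged, _ = flagged_and_caught(scores, labels, t)
        if n_flagged <= baseline_flagged:
            return t
    return None


def threshold_for_equal_catch(scores: List[float], labels: List[int],
                              baseline_caught: int) -> Optional[float]:
    """Highest threshold that still catches at least the baseline's fraud."""
    for t in sorted(set(scores), reverse=True):
        _, n_caught = flagged_and_caught(scores, labels, t)
        if n_caught >= baseline_caught:
            return t
    return None


t_effort = threshold_for_equal_effort(scores, labels, BASELINE_FLAGGED)
t_catch = threshold_for_equal_catch(scores, labels, BASELINE_CAUGHT)
print("equal effort:", t_effort, flagged_and_caught(scores, labels, t_effort))
print("equal catch: ", t_catch, flagged_and_caught(scores, labels, t_catch))
```

On this toy data, the equal-effort threshold flags the same 5 claims' worth of work but finds 3 frauds instead of 2, while the equal-catch threshold finds the same 2 frauds by reviewing only 2 claims. Any threshold between the two trades extra fraud caught against review effort.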