Title: Improving Existing Optimal Decision Trees Algorithms by Redefining Their Binarisation Strategy
Author: Wolska, Ola (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Software Technology)
Contributor: Demirović, E. (mentor); Pouwelse, J.A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2021-07-02

Abstract: Optimal decision trees are not easily improvable in terms of accuracy. However, improving the pre-processing of the underlying dataset can lead to more accurate decision trees. In this paper, multiple methods of binarising datasets are considered and the resulting decision trees are compared. The binarisation is divided into two stages, discretisation and encoding, with various algorithms considered for both stages. Additionally, processing the data while the decision tree is being built, referred to as online processing, rather than beforehand, was considered. It was found that for smaller datasets, unsupervised discretisation was preferred, and extending one-hot encoding to also consider multiple categories at once as a target gave better accuracy for trees of lower depth. For bigger datasets, online processing was shown to be beneficial.

Subject: Optimal Decision Tree; data processing; discretisation techniques
To reference this document use: http://resolver.tudelft.nl/uuid:7e586f38-fa2d-423c-9bcc-d88dc91f7ca9
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2021 Ola Wolska
Files: PDF research_paper_2.pdf (573.22 KB)
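
The two-stage binarisation described in the abstract (discretisation followed by encoding) can be illustrated with a minimal sketch. It is not the thesis's implementation: equal-width binning is assumed here as one example of an unsupervised discretiser, plain one-hot encoding stands in for the encoding stage, and the function names `equal_width_bins` and `one_hot` are hypothetical.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Discretisation stage (assumed variant): map each value to the index
    of one of n_bins equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no split information; use a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(bin_ids: Sequence[int], n_bins: int) -> List[List[int]]:
    """Encoding stage (assumed variant): one binary feature per bin,
    set to 1 only for the bin the value falls into."""
    return [[1 if b == k else 0 for k in range(n_bins)] for b in bin_ids]
```

For example, `equal_width_bins([1.0, 2.0, 5.0, 9.0], 3)` assigns the four values to bins 0, 0, 1 and 2, and `one_hot` then turns each bin index into three binary features, which is the kind of input a binary optimal decision tree algorithm expects.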