Title: Similarity metrics for binary cell clustering: How close can we get to state-of-the-art?
Author: Golik, Bartek (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Reinders, M.J.T. (mentor); Bouland, G.A. (mentor); Gerritsen, B.H.M. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-06-28

Abstract: Analysing single-cell RNA sequencing data is becoming an increasingly tedious task as the size of data sets grows. Recent findings suggest that these data sets can be binarized without losing much information, which in turn should allow for memory- and time-efficient methods of storage and computation. Numerous analysis techniques require cell clustering as a preliminary procedure, so the performance of the binary representation needs to be evaluated in that context. In this work we present a comparison between binary clustering results and the state of the art, with a focus on the choice of similarity metric and its impact on intermediate steps of the procedure (i.e. similarity matrices and kNN graphs). The method was evaluated on single-cell transcriptomic data sets, using a combination of R and C++ as the evaluation framework. We found that some of the similarity metrics operating on continuous input can possibly be reproduced with similarity metrics operating on binary input.

Subject: single-cell RNA sequencing; single-cell analysis; clustering; Similarity Metrics
To reference this document use: http://resolver.tudelft.nl/uuid:744f8be7-37bc-4dd4-a66e-3003f72429ad
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Bartek Golik
Files: CSE3000_Final_Paper.pdf (PDF, 2.15 MB)