Title: Intensity-Aware Rank Estimation for Dimensionality Reduction in Imaging Mass Spectrometry
Author: van Winden, Thijs (TU Delft Mechanical, Maritime and Materials Engineering)
Contributor: van de Plas, R. (mentor)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering | Systems and Control
Date: 2019-04-17
Abstract: Imaging Mass Spectrometry (IMS) is a spectral imaging technique that reveals the spatial distribution of molecules by collecting a mass spectrum for every pixel across a tissue sample. As such, IMS enables the detection of disease-induced anomalies in tissue samples and offers deeper insight into biological processes at the molecular level. IMS data are high-dimensional: every bin (or ion) along the mass spectrum represents a separate image, and each image contains a relatively large number of pixels. Manual analysis suffers from this high dimensionality because visualization becomes increasingly difficult. Furthermore, computational analysis of such large datasets becomes problematic or infeasible in terms of both time and computational resources. Moreover, the dimensionality of current IMS measurements hampers new applications that would capture even more data. Linear dimensionality reduction methods, such as Principal Component Analysis (PCA) and Nonnegative Matrix Factorization (NMF), seek to reduce these datasets to a set of (principal) components. These components span an underlying feature subspace within the original measurement space. Rank estimation determines the number of such components, that is, how many are needed to represent the original dataset in a lower-dimensional space with minimal information loss. In the context of IMS, this task is typically performed without the use of domain-specific knowledge.
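As an illustration of conventional (not intensity-aware) rank estimation, the sketch below picks the smallest number of principal components whose cumulative explained variance reaches a chosen fraction. It is a minimal NumPy example on synthetic data, not the thesis implementation: the function names, the 0.99 target, and the synthetic singular values (10, 5, 2) are assumptions made purely for illustration.

```python
import numpy as np


def explained_variance_rank(X, target=0.99):
    """Smallest number of principal components whose cumulative
    explained variance reaches `target` (a fraction in (0, 1])."""
    Xc = X - X.mean(axis=0)                    # center each column (one column per bin)
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, in descending order
    ratio = s**2 / np.sum(s**2)                # explained variance per component
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(s))                      # guard against rounding at target = 1.0


def make_low_rank_data(n_pixels=200, n_bins=50,
                       singular_values=(10.0, 5.0, 2.0),
                       noise_std=1e-3, seed=0):
    """Hypothetical synthetic data: a rank-3 matrix plus small Gaussian noise."""
    rng = np.random.default_rng(seed)
    r = len(singular_values)
    # Orthonormal left factors, orthogonal to the all-ones vector,
    # so that the noise-free columns already have zero mean.
    basis = np.column_stack([np.ones(n_pixels), rng.normal(size=(n_pixels, r))])
    Q, _ = np.linalg.qr(basis)
    U = Q[:, 1:]
    V, _ = np.linalg.qr(rng.normal(size=(n_bins, r)))
    X = U @ np.diag(singular_values) @ V.T
    return X + noise_std * rng.normal(size=X.shape)


X = make_low_rank_data()
estimated_rank = explained_variance_rank(X, target=0.99)
print(estimated_rank)
```

Note that the estimate depends entirely on the chosen variance target: a lower target would return fewer components for the same data, and nothing in the criterion distinguishes reliable from unreliable intensity variations.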
Intensity-aware rank estimation seeks to utilize domain knowledge, in the form of an ion intensity threshold, to help estimate the rank. This threshold emerges naturally from IMS, owing to prior knowledge of instrument and ionization process inaccuracies in the low ion intensity region. The ion intensity threshold defines a lower bound above which variations in measurements are considered reliable. Establishing an intensity-aware version of rank estimation requires the threshold, defined in the original measurement space, to be linked to the abstract feature subspace, defined by NMF or PCA, where the rank estimation takes place. This connection is nontrivial to make and is, therefore, a central topic of this thesis. Furthermore, intensity-aware rank estimation requires the abstract subspace to represent the majority of the information above the threshold in the first set of components, which is not guaranteed in pure NMF and PCA formulations. In this thesis, we demonstrate threshold-aware rank estimation and residual-fraction rank estimation, two methods that make rank estimation for PCA intensity-aware. Threshold-aware rank estimation applies a histogram transformation to the intensities in the original measurement space to emphasize threshold-exceeding intensities. Subsequently, we estimate the rank based on the percentage of explained variance. Residual-fraction rank estimation uses untransformed measurements and instead estimates the rank based on the ratio of the above- and below-threshold residuals. We demonstrate that both methods are able to find the correct rank in a synthetic dataset. With threshold-aware rank estimation applied to an IMS dataset, we show that applying the transformation before PCA leads to a lower overall rank estimate for a given percentage of explained variance.
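The abstract does not give the exact formula behind residual-fraction rank estimation, so the following is only one possible reading, offered as a sketch. For each candidate rank k, it reconstructs the data from the first k principal components and reports which fraction of the squared residual falls on entries whose measured intensity exceeds the threshold. The function name, the use of a fraction (above divided by total) rather than a direct above/below ratio, the synthetic data, and the threshold value are all assumptions; how the thesis turns such a curve into a rank estimate is not specified here and is not reproduced.

```python
import numpy as np


def residual_fraction_curve(X, threshold, max_rank):
    """For k = 1..max_rank, return the fraction of the squared PCA
    reconstruction residual located on entries with X > threshold.
    Illustrative interpretation only, not the thesis definition."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    above = X > threshold                      # mask of threshold-exceeding entries
    fractions = []
    for k in range(1, max_rank + 1):
        X_k = mean + (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k PCA reconstruction
        sq_res = (X - X_k) ** 2
        total = sq_res.sum()
        fractions.append(sq_res[above].sum() / total if total > 0 else 0.0)
    return np.array(fractions)


# Hypothetical synthetic data: nonnegative rank-2 signal plus low-intensity noise.
rng = np.random.default_rng(0)
W = rng.uniform(0.0, 1.0, size=(300, 2))       # pixel-wise abundances (assumed)
H = rng.uniform(0.0, 10.0, size=(2, 40))       # component spectra (assumed)
X = W @ H + rng.uniform(0.0, 0.5, size=(300, 40))

curve = residual_fraction_curve(X, threshold=1.0, max_rank=6)
for k, f in enumerate(curve, start=1):
    print(f"rank {k}: above-threshold residual fraction = {f:.3f}")
```

Inspecting how this fraction changes with k gives a rough sense of whether the remaining residual is dominated by reliable (above-threshold) or unreliable (below-threshold) measurements.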
With residual-fraction rank estimation applied to an IMS dataset, we show that we can obtain rank estimates, based on the structure of the dataset, that are close to cross-validation rank estimates for the same dataset.
Subject: Imaging Mass Spectrometry; Dimensionality Reduction; Rank Estimation; Principal Component Analysis; Nonnegative Matrix Factorization
To reference this document use: http://resolver.tudelft.nl/uuid:c6ddb2ce-5551-4bd1-807a-a234486cfc9b
Embargo date: 2020-04-17
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Thijs van Winden
Files: thesis_thijsvanwinden_final.pdf (PDF, 8.65 MB)