Title: Holistic Schema Matching at Scale
Author: Psarakis, Kyriakos (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Katsifodimos, A (mentor); Houben, G.J.P.M. (graduation committee); van Deursen, A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science | Software Technology
Date: 2020-12-03
Abstract: Schema matching is a fundamental task in the data integration pipeline and has been studied extensively over the past decades, leading to many novel schema matching methods. However, these methods do not follow a standard evaluation process, which leaves it unclear which one performs best in terms of matching accuracy and runtime, in which schema matching category, and with which hyperparameters. To clear up this confusion, the need for a scalable benchmarking suite that measures the field's progress became apparent, leading to the first contribution of this work: a scalable benchmarking suite for schema matching tasks. In the meantime, we realized that the literature lacked a scalable holistic schema matching system, which led to our second contribution. Building on the knowledge gained from our proposed benchmark, we developed a system that can incorporate any algorithm and data source while running the schema matching jobs in parallel across multiple machines. Furthermore, we decided to give a leading role to the users of such a system. The benchmark made it apparent that no algorithm is perfect in every situation, and mission-critical applications cannot afford mistakes. Thus, users have to approve the proposed matches, and we focused on making this task scalable, fast, and straightforward.
Subject: Schema Matching; Scalability; Data Management
To reference this document use: http://resolver.tudelft.nl/uuid:f4ebeda3-6465-49da-813b-f1e6e0820c60
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 Kyriakos Psarakis
Files: KyriakosPsarakisMasterThesis.pdf (PDF, 6.13 MB)