Title: Detect the watermark through the training model: A watermarking scheme to protect numerical classification datasets
Author: Li, Ruonan (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Isler, Devris (mentor); Kellnhofer, P. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-02-03

Abstract: Datasets play an important role in machine learning. The quality of a machine learning model depends heavily on the quality of its training dataset. Datasets therefore have great economic value and should be viewed as intellectual property. To protect the property rights of machine learning training datasets, we can use watermarking. In this paper, we propose a dataset watermarking method for numerical datasets. Our method adapts the radioactive data method, which was proposed for image datasets. It can detect whether a linear classifier has been trained on the watermarked dataset. The experimental results show that we can detect the watermark with more than 99% confidence when only 1% of the data is modified. The watermarking method is not robust against data normalization, but it is robust against column dropping when the dimension of the dataset is high.

To reference this document use: http://resolver.tudelft.nl/uuid:54eac195-05ea-4227-aebf-ddc1a33a9a4b
Embargo date: 2023-02-03
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Ruonan Li
Files: PDF, thesis_9_1_.pdf (198.59 KB)
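The abstract describes the approach only at a high level. As a rough illustration of the general idea behind radioactive-data-style watermarking, the sketch below shifts a small fraction of the rows of one class along a secret random direction and then tests whether a linear classifier's weight vector is aligned with that direction. It is not the thesis's implementation: the function names, the `strength` parameter, the choice of a single target class, and the normal approximation used for the p-value are all assumptions made for illustration.

```python
import math
import random


def make_carrier(dim, seed):
    """Draw a secret random unit vector (hypothetical watermark direction)."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in u))
    return [v / norm for v in u]


def embed_watermark(rows, labels, carrier, target_class, fraction, strength, seed):
    """Return a copy of `rows` in which a fraction of all rows, drawn from
    `target_class`, is shifted by `strength * carrier`, plus the marked indices."""
    rng = random.Random(seed)
    candidates = [i for i, label in enumerate(labels) if label == target_class]
    n_mark = min(len(candidates), max(1, round(fraction * len(rows))))
    marked = sorted(rng.sample(candidates, n_mark))
    marked_set = set(marked)
    out = [
        [x + strength * c for x, c in zip(row, carrier)] if i in marked_set else list(row)
        for i, row in enumerate(rows)
    ]
    return out, marked


def detection_p_value(weights, carrier):
    """Cosine similarity between a classifier weight vector and the carrier,
    with an approximate one-sided p-value.

    Assumption: under the null hypothesis (the carrier is a random direction
    independent of the weights) and for a large dimension d, cos * sqrt(d) is
    approximately standard normal. An exact test would use the distribution of
    the cosine with a uniformly random direction instead.
    """
    dim = len(weights)
    dot = sum(w * c for w, c in zip(weights, carrier))
    norm_w = math.sqrt(sum(w * w for w in weights))
    norm_c = math.sqrt(sum(c * c for c in carrier))
    cos = dot / (norm_w * norm_c)
    p_value = 0.5 * math.erfc(cos * math.sqrt(dim / 2.0))
    return cos, p_value
```

In use, `weights` would be the weight vector that a trained linear classifier assigns to the watermarked class. A small p-value suggests that the model saw the marked rows. Which threshold counts as a detection is a design choice that the abstract does not specify.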