Detect the watermark through the training model: A watermarking scheme to protect numerical classification datasets

Li, Ruonan

Title

Detect the watermark through the training model: A watermarking scheme to protect numerical classification datasets

Author

Li, Ruonan (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Isler, Devris (mentor)
Kellnhofer, P. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science and Engineering

Project

CSE3000 Research Project

Date

2023-02-03

Abstract

Datasets play an important role in machine learning: the quality of a model depends heavily on the quality of its training dataset. Datasets therefore carry considerable economic value and should be treated as intellectual property. Watermarking can be used to protect the property rights of machine learning training datasets. In this paper, we propose a watermarking method for numerical datasets. Our method is adapted from the radioactive data method, which was proposed for image datasets. It can detect whether a linear classifier has been trained on the watermarked dataset. The experimental results show that the watermark can be detected with more than 99% confidence when only 1% of the data is modified. The watermarking method is not robust against data normalization, but it is robust against column dropping when the dataset is high-dimensional.
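
The idea in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: it assumes a synthetic two-class dataset, a plain logistic regression as the linear classifier, and a detection score equal to the cosine similarity between the learned weight vector and a secret carrier direction, which is one common reading of the radioactive data idea. All function names, the marked fraction (50% of one class) and the shift strength are hypothetical choices made so that the effect is visible on a tiny example; they do not reproduce the 1% setting or the confidence figures reported above.

```python
import math
import random


def unit_vector(dim, rng):
    """Draw a random direction (the secret carrier) on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def make_dataset(n, dim, rng):
    """Synthetic numerical data (assumption): only feature 0 separates the classes."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += 1.0 if y == 1 else -1.0
        rows.append(x)
        labels.append(y)
    return rows, labels


def embed_watermark(rows, labels, carrier, target=1, fraction=0.5, strength=1.0):
    """Return a copy in which a fraction of the target-class rows is shifted along the carrier."""
    target_idx = [i for i, y in enumerate(labels) if y == target]
    marked_idx = set(target_idx[: int(fraction * len(target_idx))])
    return [
        [x + strength * c for x, c in zip(row, carrier)] if i in marked_idx else list(row)
        for i, row in enumerate(rows)
    ]


def train_linear_classifier(rows, labels, epochs=60, lr=0.5):
    """Logistic regression fitted by full-batch gradient descent; returns the weight vector."""
    dim, n = len(rows[0]), len(rows)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            # Numerically stable sigmoid.
            p = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
            err = p - y
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    rows, labels = make_dataset(n=600, dim=10, rng=rng)
    carrier = unit_vector(10, rng)
    marked_rows = embed_watermark(rows, labels, carrier)
    clean_w = train_linear_classifier(rows, labels)
    marked_w = train_linear_classifier(marked_rows, labels)
    # A model trained on marked data should align more with the carrier.
    print("clean model score: ", round(cosine(clean_w, carrier), 3))
    print("marked model score:", round(cosine(marked_w, carrier), 3))
```

In this sketch the owner keeps the carrier secret and later compares a suspect model's weights against it. A score well above what an unrelated random direction would produce suggests the model saw the marked rows. Turning the score into a confidence level requires a null distribution for the cosine, which the sketch leaves out.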

To reference this document use:

http://resolver.tudelft.nl/uuid:54eac195-05ea-4227-aebf-ddc1a33a9a4b

Embargo date

2023-02-03

Part of collection

Student theses

Document type

bachelor thesis

Files

PDF

thesis_9_1_.pdf

198.59 KB
