Title: Generalization and Data Transformation Invariance of Visual Attention Models
Author: de Kruijff, Pepijn (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Bohmer, Wendelin (mentor); Poulsen, C.B. (mentor)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2022-06-24

Abstract: This paper compares the generalizing capability of multi-head attention (MHA) models with that of convolutional neural networks (CNNs) by comparing their performance on out-of-distribution data. The dataset used to train both models is created by coupling digits from the MNIST dataset with a fixed set of background images from the CIFAR-10 dataset; an out-of-distribution sample is generated by using a background not seen during training. The paper compares the accuracy of both models on such out-of-distribution samples as an indicator of their generalizability. Furthermore, the invariance of MHA models to certain affine data transformations is compared with that of CNNs. The results indicate that MHA models may be slightly better at generalizing to unseen data, but that CNNs generalize better to the data transformations performed in this paper's experiments.

To reference this document use: http://resolver.tudelft.nl/uuid:7768060e-984d-4b77-9b76-2f2f2079bcac
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2022 Pepijn de Kruijff
Files: Generalization_and_Data_T ... Models.pdf (PDF, 1.36 MB)
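The abstract describes building training samples by placing MNIST digits on CIFAR-10 backgrounds. The thesis itself is not reproduced here, so the sketch below is only a plausible reconstruction of that compositing step, assuming a 28x28 grayscale digit is centred on a 32x32 RGB background with the digit's bright pixels painted over the background; the actual thesis may combine the images differently, and the arrays used here are synthetic stand-ins rather than real MNIST/CIFAR-10 data.

```python
import numpy as np

def compose_sample(digit, background, threshold=0.5):
    """Overlay a 28x28 digit (values in [0, 1]) onto a 32x32x3 background.

    Hypothetical reconstruction of the dataset construction described in
    the abstract: digit foreground pixels (above `threshold`) are painted
    white onto a centred window of the background image.
    """
    out = background.astype(np.float32).copy()
    top = (background.shape[0] - digit.shape[0]) // 2   # 2-pixel border for 28-in-32
    left = (background.shape[1] - digit.shape[1]) // 2
    mask = digit > threshold                            # boolean digit foreground
    region = out[top:top + digit.shape[0], left:left + digit.shape[1]]
    region[mask] = 1.0                                  # broadcast white over all 3 channels
    return out

# Synthetic stand-ins for a real MNIST digit and CIFAR-10 background.
rng = np.random.default_rng(0)
digit = np.zeros((28, 28), dtype=np.float32)
digit[4:24, 12:16] = 1.0                                # crude vertical "1" stroke
background = rng.random((32, 32, 3), dtype=np.float32)

sample = compose_sample(digit, background)
print(sample.shape)  # (32, 32, 3)
```

An out-of-distribution sample, in the sense the abstract uses, would then be produced by calling `compose_sample` with a background image drawn from outside the fixed set used during training.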