Title: Positive Class Localization Map: A framework for weakly supervised object localization
Author: van Rosmalen, N.C.
Contributor: Loog, M. (mentor); Jonker, P.P. (mentor)
Faculty: Mechanical, Maritime and Materials Engineering
Department: Biomechanical Engineering
Programme: Mechanical Engineering - Biomechanical Design - Biorobotics
Date: 2016-11-04

Abstract

A computer algorithm that can perform object localization with only weak supervision is valuable for numerous reasons. Often a given localization task (e.g. bird localization) simply has no properly annotated training data available. In this thesis a novel approach called the Positive Class Localization Map is proposed for Weakly Supervised Object Localization (WSOL) in visual data. Currently, a Deep Learning algorithm called a Convolutional Neural Network (CNN) is used for most tasks on visual data. A traditional CNN consists of two parts, of which the first has been shown to learn features for the task it is applied to. In this thesis we obtain the multidimensional representation stored in this part of the network and filter the information related to the target class into the namesake of this thesis, a Positive Class Localization Map (PCLM). We show that the network is still able to classify the input image from this PCLM. We also show that the PCLM holds the location of the target objects. Additionally, we study how padding and pooling can be used to optimize the information in the PCLM and to process input images of any size. Based on this study we propose the optimal PCLM implementation, which includes a novel type of pooling we call Spatial Pyramid of Average and Max (SPAM) pooling. We validate the PCLM on the PASCAL Visual Object Classes Challenge (VOC) dataset against the state of the art for WSOL.
On PASCAL VOC 2007 we achieve a mean Average Precision (mAP) of 65.7 when the algorithm is allowed to return the x,y-location of an object. When it is tasked to return a bounding rectangle we achieve 27.5 mAP. On the 2012 dataset we obtain 64.5 and 25.4 mAP respectively. The results with a bounding rectangle as response in particular are close to the state of the art (30.9) in terms of mAP. In terms of processing time we significantly outperform all other frameworks. We also evaluated performance on a real-world bird localization problem. On large 2560x1600 images containing tiny birds we achieve an mAP of 97.3 when the algorithm is allowed to return the x,y-location and 86.8 mAP for bounding boxes. These results show the promise of our PCLM framework.

Subject: weakly supervised; object localization; birds; deep learning; neural networks; convolutional; PCLM; Positive Class Localization Map
To reference this document use: http://resolver.tudelft.nl/uuid:23a5bdb8-2610-4e01-9586-a6faf34a6cbd
Part of collection: Student theses
Document type: master thesis
Rights: (c) 2016 van Rosmalen, N.C.
Files: mscThesis_1510355_NCvanRosmalen.pdf (PDF, 13.42 MB)
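
The abstract reports results for two kinds of response: an x,y-location and a bounding rectangle. It does not say how either is read off the PCLM, so the following is only a minimal sketch under stated assumptions. It assumes the map is a 2-D grid of scores for one class, takes the highest-scoring cell as the x,y-location, and takes the tightest rectangle around all cells scoring at least a fraction of the peak as the bounding rectangle. The names `peak_location`, `bounding_box`, `rel_threshold` and `toy_map` are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple

# Hypothetical stand-in for a 2-D localization map: rows of per-cell scores.
Map = List[List[float]]


def peak_location(score_map: Map) -> Tuple[int, int]:
    """Return (x, y), i.e. (column, row), of the highest-scoring cell."""
    best_x, best_y = 0, 0
    best = score_map[0][0]
    for y, row in enumerate(score_map):
        for x, value in enumerate(row):
            if value > best:
                best, best_x, best_y = value, x, y
    return best_x, best_y


def bounding_box(score_map: Map, rel_threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) around every cell whose score is
    at least rel_threshold times the peak score.

    Assumption: the peak score is non-negative, so the peak cell itself
    always passes the cutoff. All passing cells are enclosed in one box,
    so this sketch only makes sense for a single object per map.
    """
    px, py = peak_location(score_map)
    cutoff = rel_threshold * score_map[py][px]
    xs = [x for row in score_map for x, v in enumerate(row) if v >= cutoff]
    ys = [y for y, row in enumerate(score_map) for v in row if v >= cutoff]
    return min(xs), min(ys), max(xs), max(ys)


# Toy 4x4 map with one bright blob in the middle.
toy_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.6, 0.9, 0.0],
    [0.0, 0.5, 0.7, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]

print(peak_location(toy_map))  # (2, 1)
print(bounding_box(toy_map))   # (1, 1, 2, 2)
```

The coordinates are in map cells, not image pixels; mapping them back to the input image would depend on the network's stride and padding, which the abstract does not specify.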