Title: Enhancing Real-Time Twitter Filtering and Classification using a Semi-Automatic Dynamic Machine Learning setup approach
Author: De Jong, N.
Contributor: Houben, G.J.P.M. (mentor); Hauff, C. (mentor); Stronkman, R.J.P. (mentor)
Faculty: Electrical Engineering, Mathematics and Computer Science
Department: Software Technology
Programme: Web Information Systems
Date: 2015-08-31
Abstract: Twitter contains massive amounts of user-generated content, much of which is valuable to various interested parties. Twitcident has been developed to process and filter this information in real time for those parties by monitoring a set of predefined topics, exploiting humans as sensors. By analyzing the relevant information, an operator can estimate severity and act accordingly. However, among the relevant and useful content that is extracted, a lot of irrelevant noise is also present. Our goal is to improve the filter so that the majority of the information presented by Twitcident is relevant. To this end we designed an artifact consisting of several components, developed within a dynamic framework. Its major components include a machine learning classifier operating on dynamic features, a semi-automatic setup approach and a training approach. Our prototype operates on Dutch content, but it can be adapted to operate on any language. With a partially implemented prototype of our designed artifact we achieve F2-scores of 0.7 to 0.9 on our Dutch test sets using 10-fold cross-validation, which is on average a 30% improvement over the existing Twitcident filtering architecture. The artifact is robustly designed, allowing for many forms of future improvements and extensions.
We also make some side-contributions, like an approximate matching algorithm for variable length strings.
Subject: real-time filtering, social filtering, machine learning, classification, twitter, twitcident
To reference this document use: http://resolver.tudelft.nl/uuid:1bfec308-2c16-4cd3-baf6-35062a885ad7
Embargo date: 2017-08-31
Part of collection: Student theses
Document type: master thesis
Rights: (c) 2015 De Jong, N.
Files: PDF 2015-08-11_Final_Thesis_N ... 308130.pdf 7.51 MB