Title: Proximity of Terms, Texts and Semantic Vectors in Information Retrieval
Author: Vuurens, J.B.P. (TU Delft Multimedia Computing)
Contributor: de Vries, A.P. (promotor)
Degree granting institution: Delft University of Technology
Date: 2017-04-26

Abstract

Information Retrieval (IR) is the task of finding content of an unstructured nature that is relevant to an information need. A retrieval system typically uses a retrieval model to rank the available content by its estimated relevance to an information need. For decades, state-of-the-art retrieval models have used the assumption that terms appear independently in text documents. Chapter 1 of this thesis describes how the relevance likelihood of a document changes with the observed distance between co-occurring query terms in its text.

Nowadays, news is abundantly available online, allowing users to discover and follow news events. However, online news is often very redundant; most sources base their stories on previously published work and add only limited new information. Thus, a user often ends up spending a significant amount of effort re-reading the same parts of a story before finding relevant and novel information. In Chapter 2 and Chapter 3, we present a novel approach to construct an online news summary for a given topic. Salient sentences are identified by clustering the sentences in the news stream based on the relative proximity of the sentences and the temporal proximity of their publication times. To improve the coherence of a long summary that describes a news topic, we propose to automatically cluster sentences by subtopics in Chapter 4. In Chapter 5, we show how new topics can be detected in the news stream using the same clustering technique.

In real-life decision making, people are often faced with an overload of choices. A recommender system aids the user by reducing the available choices to a shortlist of items that are of interest to the user. In Chapter 6, we learn high-dimensional representations for movies that make it possible to effectively recommend movies based on a user’s most recently rated movies.

Subject: Information retrieval; retrieval algorithms; clustering; recommender systems
To reference this document use: https://doi.org/10.4233/uuid:2dcad546-6cbd-45ca-abe7-ffcf613b1376
ISBN: 978-94-6186-803-9
Bibliographical note: SIKS Dissertation Series No. 2017-19. The research reported in this thesis has been carried out under the auspices of SIKS, the Dutch Research School for Information and Knowledge Systems.
Part of collection: Institutional Repository
Document type: doctoral thesis
Rights: © 2017 J.B.P. Vuurens
Files: dissertation_JVuurens_web.pdf (PDF, 1.37 MB)