Title: Toward Fine-grained Causality Reasoning and Question Answering
Author: Wang, Zhen (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Houben, G.J.P.M. (mentor); Yang, J. (mentor); Zhang, X. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science
Date: 2022-06-29

Abstract:
This thesis studies causality in natural language processing (NLP). Understanding causality is key to the success of NLP applications, especially in high-stakes domains. Causality takes several fine-grained forms, such as enable and prevent, which, despite their importance, have been largely ignored in the literature. Given the lack of a dataset suitable for such causality research, we first build FineCR, a first-of-its-kind fine-grained causal reasoning dataset covering relations such as enable and prevent, constructed with the help of human annotators. The dataset contains human annotations for 25K cause-effect event pairs and 24K question-answer pairs over multi-sentence samples, each of which can contain multiple causal relationships. To assess how well current NLP models handle such data and to identify their remaining weaknesses, we define a series of NLP tasks on FineCR: causality detection, causal event extraction, and causality question answering. Experiments with state-of-the-art deep learning models show that there is still much room for improvement on these causal reasoning tasks, and that the models exhibit different shortcomings on different tasks: for causality detection, current classification models are easily misled by surface keywords, while for causal event extraction, models cannot accurately extract the event spans.
For causality question answering, models often fail to find the correct answer because they do not capture the semantics well. These findings highlight the need for better solutions to event causality research. In conclusion, our novel dataset and tasks provide a challenging benchmark for evaluating models' causal reasoning ability, and the experimental results shed light on future directions for improving neural language models.

Subject: Causality; Dataset; Natural Language Processing; Question Answering
To reference this document use: http://resolver.tudelft.nl/uuid:3a7d839d-5766-4419-b9a6-7eb829a9db58
Part of collection: Student theses
Document type: master thesis
Rights: © 2022 Zhen Wang
Files: msc_thesis_zhen_wang.pdf (PDF, 13 MB)