Log Differencing using State Machines for Anomaly Detection
Author: Tsoni, Sofia (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Verwer, Sicco (mentor); van Deursen, Arie (graduation committee); Finavaro Aniche, Mauricio (graduation committee); Wieman, Rick (mentor)
Degree granting institution: Delft University of Technology
Date: 2019-08-16
Abstract: Software generates huge amounts of log data every day. These data contain valuable information about the behavior and health of the system, but it is rarely exploited because of the data's volume and unstructured nature. Manually going through log files is a time-consuming and labor-intensive procedure for developers. Nonetheless, logging information can expose a problematic execution of the software even when the final outcome seems normal. Automatic analysis of log files is therefore crucial for detecting problems and, above all, for understanding how the software behaves, which helps prevent failures and improve the software itself. To that end, this project aims to identify unexpected executions of the software and to determine the root cause behind them. In more detail, the expected behavior of the software is approximated using model inference techniques, and newly observed data are analyzed to verify whether they conform to that expected behavior. The conformance checking method used is called replay: incoming traces are replayed on the graph, and at the point where a trace is not validated, an alignment algorithm takes over. The sequence alignment is performed in three different ways. Two of the methods look for the best alignment within a specific radius around the problematic node. In addition, a global alignment technique is implemented, based on the well-known algorithm by Needleman and Wunsch for DNA sequences. Our goal required modifying this algorithm to align not only two sequences, but also a sequence with a tree-structured model. Finally, the implemented tool visualizes the differences in a way that makes it intuitive for developers to understand what went wrong. Additional information is also provided to make the investigation of the "anomaly" easier.
Subject: log analysis; log differencing; anomaly detection; state machines; software engineering; sequence alignment; model checkers; log comparison
To reference this document use: http://resolver.tudelft.nl/uuid:b0b39832-c921-412c-b6f8-9ac4c52b57f6
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Sofia Tsoni
Files: PDF LogDifferencingUsingState ... _tsoni.pdf 4.78 MB
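The abstract's global alignment step builds on the Needleman-Wunsch algorithm. As a rough illustration only, the sketch below shows the classic two-sequence version applied to lists of log event names. It is not the thesis's modified sequence-to-tree variant, and the function name `needleman_wunsch`, the scoring values (match 1, mismatch -1, gap -1), and the event names are all assumptions chosen for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical alignment pair: None marks an event missing on one side.
Pair = Tuple[Optional[str], Optional[str]]


def needleman_wunsch(
    expected: Sequence[str],
    observed: Sequence[str],
    match: int = 1,       # assumed scoring values, not taken from the thesis
    mismatch: int = -1,
    gap: int = -1,
) -> Tuple[int, List[Pair]]:
    """Classic global alignment of two event sequences."""
    n, m = len(expected), len(observed)

    # score[i][j] = best score aligning expected[:i] with observed[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Substitution score for the events ending prefixes i and j.
        return match if expected[i - 1] == observed[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align both events
                score[i - 1][j] + gap,            # event missing in observed
                score[i][j - 1] + gap,            # extra event in observed
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            aligned.append((expected[i - 1], observed[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned.append((expected[i - 1], None))
            i -= 1
        else:
            aligned.append((None, observed[j - 1]))
            j -= 1
    aligned.reverse()
    return score[n][m], aligned


if __name__ == "__main__":
    # Made-up traces: the observed run skips the "read" event.
    total, pairs = needleman_wunsch(
        ["open", "read", "write", "close"],
        ["open", "write", "close"],
    )
    for exp, obs in pairs:
        print(f"{exp or '-':>6} | {obs or '-'}")
```

A pair with `None` on the observed side marks an event the expected sequence contains but the observed trace skipped, which is the kind of difference a log differencing tool would want to surface. Aligning a trace against a tree-structured model, as the abstract describes, would need a different recurrence than this two-sequence one.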