Title: Automated data exfiltration detection using netflow metadata
Author: Etta Tabe, Takang Kajikaw (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Verwer, S.E. (mentor); Cooper, Peter (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science | Cyber Security
Date: 2019-09-19

Abstract
The volume and sophistication of data exfiltration attacks over networks have increased significantly in the last decade. This has created a need for defense mechanisms that effectively detect both known and unknown data exfiltration scenarios over the network. Although DPI (Deep Packet Inspection) is commonly used to detect data exfiltration, it requires a thorough inspection of every payload or packet leaving the network. This makes it unsuitable for some environments, as it is resource intensive and can have severe data privacy implications. In our work, we use lightweight, non-privacy-invasive netflows to detect data exfiltration at connection-level granularity. The key intuition behind our proposed solution is that connections involved in data exfiltration tend to differ from normal network connections in certain feature values. Our results show that features extracted from netflows, namely the duration of a netflow, the source bytes, the source bytes sent per second, the source bytes sent per packet and the producer-consumer ratio, can be used to effectively detect data exfiltration. Connections are then grouped using k-means, and the robust Z-score of each connection's distance from its cluster centroid is used as a statistical, distance-based technique to detect connections involved in data exfiltration.
While this method detects some data exfiltration scenarios, it produces a significant number of false positives. Combining it with the results from the LOF (local outlier factor) and the LoOP (local outlier probability), which are density-based techniques, leads to a more robust model that significantly reduces the number of false positives and false negatives. We also show that analyzing only the smallest clusters formed by k-means yields detection results similar to those obtained on the entire datasets, with a significant reduction in computation time.

Subject: Data exfiltration detection, netflows, local outlier probability, local outlier factor, anomaly detection, network traffic analysis
To reference this document use: http://resolver.tudelft.nl/uuid:19aa873d-b38d-4133-bcf8-7c6c625af739
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Takang Kajikaw Etta Tabe
Files: report.pdf (PDF, 3.15 MB)
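
The distance-based step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and several details are assumptions, since the abstract does not specify them: the producer-consumer ratio is taken as (source bytes - destination bytes) / (source bytes + destination bytes), the robust Z-score is computed from the median and the median absolute deviation with the conventional 0.6745 scaling constant, the 3.5 cut-off is a common rule of thumb, and the cluster centroid is assumed to come from a k-means run done elsewhere. All function names are hypothetical.

```python
import math
import statistics


def flow_features(duration_s, src_bytes, dst_bytes, src_pkts):
    """Derive the five features named in the abstract from one flow record.

    The PCR formula used here is an assumption, not taken from the thesis.
    """
    total = src_bytes + dst_bytes
    return {
        "duration": duration_s,
        "src_bytes": src_bytes,
        "src_bytes_per_sec": src_bytes / duration_s if duration_s > 0 else 0.0,
        "src_bytes_per_pkt": src_bytes / src_pkts if src_pkts > 0 else 0.0,
        "pcr": (src_bytes - dst_bytes) / total if total > 0 else 0.0,
    }


def robust_z_scores(values):
    """Robust Z-score: 0.6745 * (x - median) / MAD (assumed definition)."""
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations collapse to zero; no score can be computed.
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(points, centroid, threshold=3.5):
    """Return indices of points unusually far from their cluster centroid.

    `centroid` is assumed to be produced by k-means elsewhere. Only large
    positive scores are flagged, i.e. points far away from the centroid.
    """
    distances = [math.dist(p, centroid) for p in points]
    scores = robust_z_scores(distances)
    return [i for i, z in enumerate(scores) if z > threshold]


if __name__ == "__main__":
    # Toy one-dimensional cluster: the last point lies far from the centroid.
    cluster = [(1.0,), (1.2,), (0.9,), (1.1,), (1.0,), (9.0,)]
    print(flag_outliers(cluster, centroid=(0.0,)))
```

In practice the feature vectors would be scaled before clustering, and the flagged connections would be cross-checked against density-based scores such as LOF or LoOP, as the abstract describes.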