Dataset and analyses for the paper "Behavior-Aware Network Segmentation using IP Flows"
Network segmentation is a powerful tool for network defense. In contemporary complex, dynamic, and multilayer networks, network segmentation suffers from a lack of visibility into processes in the network, which results in less strict segment definitions and looser network security. Moreover, the dynamics of the networks make the manual identification of network segments nearly impossible. In this paper, we inspect the possibilities of behavior-aware network segmentation using IP flows and machine learning approaches that would make it possible to identify segments automatically even in a complex network. We evaluate the suitability of clustering algorithms for the identification of behavior-consistent segments in a network. We show that the clustering algorithms can identify relevant behavior-consistent clusters that overlap with those identified manually by experts. Apart from segment identification, we investigate the other essential task of the network segmentation process: the assignment of an unknown host to an existing segment. We evaluate the performance of four different classification mechanisms on a real-world dataset. We show that it is possible to assign an unknown host to an appropriate network segment with up to 92% precision. Moreover, we release the whole dataset and the experiment steps for public use.
- Jupyter notebook (http://jupyter.org/)
- Python 3 (https://www.python.org/)
- Numpy (http://www.numpy.org/)
- Pandas (https://pandas.pydata.org/)
- Matplotlib (https://matplotlib.org/)
- Seaborn (https://seaborn.pydata.org/)
- Scikit-learn (http://scikit-learn.org/)
- imbalanced-learn (https://pypi.org/project/imbalanced-learn/)
- Cython (http://docs.cython.org/en/latest/index.html)
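Before starting the notebooks, it may help to confirm that the packages listed above are installed. The following is a minimal sketch, not part of the repository: the function name `missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are the usual ones for these packages (for example `sklearn` for Scikit-learn and `imblearn` for imbalanced-learn).

```python
import importlib.util

# Import names assumed for the packages in the requirements list
# (the list itself gives project names, not import names).
REQUIRED_MODULES = [
    "notebook",    # Jupyter notebook
    "numpy",
    "pandas",
    "matplotlib",
    "seaborn",
    "sklearn",     # Scikit-learn
    "imblearn",    # imbalanced-learn
    "Cython",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: print(missing_modules(REQUIRED_MODULES))
```

An empty list means every module was found; any name that is returned still needs to be installed.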
- Go to the dataset/ directory:
  $ cd dataset/
- Download the dataset from the Zenodo webpage into the dataset/ folder:
  $ wget https://zenodo.org/record/2669079/files/host-network-traffic-time-series-2019-01-annon.csv
- Go to the analyses/ directory:
  $ cd analyses/
- Run Jupyter notebook:
  $ jupyter notebook
- Run Dataset_preprocessing.ipynb to preprocess the dataset.
- Run Balancing_dataset.ipynb to prepare the balanced dataset.
- Go to the DTW_LB_Keogh/ directory:
  $ cd DTW_LB_Keogh/
- Run DTW_module.ipynb. Rename the generated .so file to DTW.so and copy it into the analyses/KNN_LB_Keogh/ and analyses/K-Means/ directories.
- Run any of the analyses present in the folders.
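The step that renames the generated .so file and copies it into the two analysis directories can also be scripted. The sketch below is an illustration only: the helper `install_dtw_module` is hypothetical, and it assumes the compiled file is the only .so file in the directory passed as `build_dir` (where the notebook actually writes it may differ on your system).

```python
import glob
import os
import shutil


def install_dtw_module(build_dir, target_dirs, name="DTW.so"):
    """Copy the single .so file found in build_dir into each target directory as `name`."""
    candidates = sorted(glob.glob(os.path.join(build_dir, "*.so")))
    if len(candidates) != 1:
        # Refuse to guess when zero or several compiled files are present.
        raise RuntimeError(
            f"expected exactly one .so file in {build_dir}, found {len(candidates)}"
        )
    copied = []
    for target in target_dirs:
        destination = os.path.join(target, name)
        shutil.copy(candidates[0], destination)  # copy under the new name
        copied.append(destination)
    return copied


# Example, run from the analyses/ directory (paths assumed from the steps above):
# install_dtw_module("DTW_LB_Keogh", ["KNN_LB_Keogh", "K-Means"])
```

The function returns the paths of the copies it created, so the result can be checked before running the KNN_LB_Keogh or K-Means analyses.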
Juraj Smeriga and Tomas Jirsik. 2019. Behavior-Aware Network Segmentation using IP Flows. In Proceedings of the 14th International Conference on Availability, Reliability and Security (ARES '19). ACM, New York, NY, USA, Article 5, 9 pages. DOI: https://doi.org/10.1145/3339252.3339265
BibTeX
@inproceedings{Smeriga:2019:BNS:3339252.3339265,
author = {Smeriga, Juraj and Jirsik, Tomas},
title = {Behavior-Aware Network Segmentation Using IP Flows},
booktitle = {Proceedings of the 14th International Conference on Availability, Reliability and Security},
series = {ARES '19},
year = {2019},
isbn = {978-1-4503-7164-3},
location = {Canterbury, CA, United Kingdom},
pages = {5:1--5:9},
articleno = {5},
numpages = {9},
url = {http://doi.acm.org/10.1145/3339252.3339265},
doi = {10.1145/3339252.3339265},
acmid = {3339265},
publisher = {ACM},
address = {New York, NY, USA},
}
This research was supported by ERDF "CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence" (No. CZ.02.1.01/0.0/0.0/16_019/0000822).