HPC is an open dataset of logs collected from System 20 of the high performance computing cluster at Los Alamos National Laboratory. The link to the original data (http://institutes.lanl.gov/data/fdata/) is no longer in service. The log has been used to benchmark automated log parsers in the following papers, which give more details on how the dataset is used:
- Pinjia He, Jieming Zhu, Shilin He, Jian Li, Michael R. Lyu. An Evaluation Study on Log Parsing and Its Use in Log Mining, in Proc. of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2016.
- Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios. Clustering Event Logs Using Iterative Partitioning, in Proc. of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2009.

Note that HPC_2k.log is a sample log. The raw logs can be requested from Zenodo: https://doi.org/10.5281/zenodo.1144100