Titan RUR Dataset

Titan was the flagship supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). It was deployed in late 2012, became the fastest supercomputer in the world at the time, and was retired on August 2, 2019.

During its production lifetime, Titan provided more than 26 billion core hours of computing time to scientists. Throughout this period, an extensive operational dataset, the Resource Utilization Report (RUR), was collected from the Titan system. On the one hand, the RUR data is extremely coarse-grained: to avoid any noticeable disturbance to production jobs, only a small amount of data could be collected and stored. On the other hand, it is extremely comprehensive, providing a log record for every job submitted to Titan from April 2015 to July 2019. These records offer a unique window into operational resource usage at extreme scale and over the long term.

Overview

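| Year | Raw Data Size | # of Job Submissions | # of Failed Jobs |
|------|---------------|----------------------|------------------|
| 2015 | 1.5 GB        | 1,529,972            | 292,699          |
| 2016 | 4.3 GB        | 4,745,305            | 691,094          |
| 2017 | 2.8 GB        | 2,814,838            | 328,187          |
| 2018 | 2.2 GB        | 2,370,860            | 250,773          |
| 2019 | 1.3 GB        | 1,520,971            | 104,173          |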
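
The per-year counts above can be turned into failure rates with a few lines of Python. This is only a minimal sketch: the figures are copied from the overview table, while the names `OVERVIEW`, `failure_rates`, and `totals` are hypothetical helpers introduced for illustration and are not part of the dataset or any accompanying tooling. It also assumes "failure rate" simply means failed jobs divided by job submissions; this README does not define how a failed job is identified.

```python
# Figures copied from the overview table: (year, job submissions, failed jobs).
# The structure and names here are illustrative assumptions, not dataset tooling.
OVERVIEW = [
    (2015, 1_529_972, 292_699),
    (2016, 4_745_305, 691_094),
    (2017, 2_814_838, 328_187),
    (2018, 2_370_860, 250_773),
    (2019, 1_520_971, 104_173),
]


def failure_rates(rows):
    """Return {year: failed jobs as a percentage of job submissions}."""
    return {year: 100.0 * failed / submitted for year, submitted, failed in rows}


def totals(rows):
    """Return (total job submissions, total failed jobs) across all rows."""
    total_submitted = sum(submitted for _, submitted, _ in rows)
    total_failed = sum(failed for _, _, failed in rows)
    return total_submitted, total_failed


if __name__ == "__main__":
    for year, rate in failure_rates(OVERVIEW).items():
        print(f"{year}: {rate:.2f}% of submitted jobs failed")

    submitted, failed = totals(OVERVIEW)
    print(f"overall: {failed:,} of {submitted:,} jobs "
          f"({100.0 * failed / submitted:.2f}%)")
```

By this measure the share of failed jobs falls each year, from roughly 19% in 2015 to about 7% in 2019. Note that 2015 and 2019 are partial years (records run from April 2015 to July 2019), so their absolute counts are not directly comparable with those of the full years.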

References

@INPROCEEDINGS{fwang2:2019b,
    author={F. Wang and S. Oral and S. Sen and N. Imam},
    booktitle={2019 IEEE International Conference on Cluster Computing (CLUSTER)},
    title={Learning from Five-year Resource-Utilization Data of Titan System},
    year={2019},
    volume={},
    number={},
    pages={1-6},
    doi={10.1109/CLUSTER.2019.8891001},
    ISSN={1552-5244},
    month={9},
}