SNESet comprises a total of 9 million clean records of QoS and QoE telemetry metrics from 8 video streaming applications (VSAs) over four months, covering end-users across 798 edge sites, 30 cities, and 3 ISPs in one country.
We provide the artifact for our SIGMOD'24 paper, including:
- Setup
- Datasets (SNESet and all datasets for comparison)
- Characterization & Comparison
- Benchmark (Experiments for regression)
Required software dependencies are listed below:
catboost==1.1.1
matplotlib==3.5.0
numpy==1.19.2
pandas==1.1.3
seaborn==0.11.0
scikit-learn==1.1.3
scipy==1.5.2
statsmodels==0.12.2
python==3.8.5
pytorch==1.8.1
xgboost==1.7.1
For the GPU version of LightGBM, please refer to this link for installation details.
Dependencies can be installed using the following command:
pip install -r requirements.txt
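As a quick sanity check, the short sketch below (not part of the artifact) verifies that the pinned versions above are installed; note that the pytorch pin corresponds to the torch distribution name, which is assumed here.

```python
# Sanity-check sketch: compare installed package versions against the pins above.
# "pytorch" is distributed as "torch", which is assumed here.
from importlib.metadata import version, PackageNotFoundError

PINNED = {
    "catboost": "1.1.1", "matplotlib": "3.5.0", "numpy": "1.19.2",
    "pandas": "1.1.3", "seaborn": "0.11.0", "scikit-learn": "1.1.3",
    "scipy": "1.5.2", "statsmodels": "0.12.2", "torch": "1.8.1",
    "xgboost": "1.7.1",
}

for pkg, want in PINNED.items():
    try:
        got = version(pkg)
        status = "OK" if got == want else f"mismatch (found {got}, expected {want})"
    except PackageNotFoundError:
        status = "missing"
    print(f"{pkg:>12}: {status}")
```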
SNESet and all datasets for comparison are available under the path 'datasets/'. The datasets for comparison consist of:
- Huawei Dataset. The original dataset is available at http://jeremie.leguay.free.fr/qoe/index.html. Our cleaned version is <repo>/datasets/ICC_cleaned.csv.
- Alibaba cluster-trace-v2018. The original dataset is available at this repository. Our aggregated results are available at <repo>/datasets/ecs/.
- Edge Dataset. The original dataset is available at this repository. Our aggregated results are available at <repo>/datasets/ens/.
- SNESet. The raw data of our dataset SNESet is <repo>/datasets/training_2nd_dataset.csv.
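The following minimal sketch (not part of the artifact) shows how the CSV files listed above can be loaded with pandas for a first look; the column layout is printed rather than assumed.

```python
# Minimal loading sketch: inspect the shape and columns of the CSV datasets above.
import pandas as pd

datasets = {
    "SNESet (raw)": "datasets/training_2nd_dataset.csv",
    "Huawei (cleaned)": "datasets/ICC_cleaned.csv",
}

for name, path in datasets.items():
    df = pd.read_csv(path)
    print(f"{name}: {df.shape[0]:,} rows x {df.shape[1]} columns")
    print(f"  columns: {list(df.columns)}")
```

The aggregated Alibaba and Edge results under datasets/ecs/ and datasets/ens/ can be loaded file by file in the same way.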
The overall architecture of data collection and analysis is shown below. Please refer to our paper for more details about the data collection system.
We characterize and compare the QoS and QoE metrics in SNESet with existing publicly available datasets and qualitatively investigate the impact of QoS on QoE using Kendall correlation and relative information gain.
Please refer to <repo>/characterization/README.md for details.
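As an illustration, the sketch below computes the two statistics for one QoS/QoE pair; the column names tcp_conn_time and stall_rate are hypothetical placeholders, and the actual analysis scripts live under <repo>/characterization/.

```python
# Illustration of the two characterization statistics on a single QoS/QoE pair.
# The column names below are hypothetical placeholders, not guaranteed to exist.
import numpy as np
import pandas as pd
from scipy.stats import kendalltau
from sklearn.metrics import mutual_info_score

df = pd.read_csv("datasets/training_2nd_dataset.csv")
pair = df[["tcp_conn_time", "stall_rate"]].dropna()    # hypothetical QoS / QoE columns
qos, qoe = pair["tcp_conn_time"], pair["stall_rate"]

# Kendall rank correlation between the QoS metric and the QoE metric.
tau, p_value = kendalltau(qos, qoe)
print(f"Kendall tau = {tau:.3f} (p = {p_value:.2e})")

# Relative information gain: I(X; Y) / H(Y), computed on decile-binned values.
qos_bins = pd.qcut(qos, q=10, duplicates="drop", labels=False)
qoe_bins = pd.qcut(qoe, q=10, duplicates="drop", labels=False)
mi = mutual_info_score(qos_bins, qoe_bins)                       # I(X; Y) in nats
p_y = pd.Series(qoe_bins).value_counts(normalize=True).to_numpy()
h_y = -np.sum(p_y * np.log(p_y))                                 # H(Y) in nats
print(f"Relative information gain = {mi / h_y:.3f}")
```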
We quantitatively measure the impact of different QoS metrics on QoE utilizing seven mainstream regression methods.
Considering the timeliness requirements of real-world deployment, we compare prediction accuracy and time efficiency in both the domain-general scenario (for all applications) and domain-specific scenarios (for specific applications).
Please refer to <repo>/benchmark/README.md for details.
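For orientation, the sketch below shows what a single run with one of the regression methods (XGBoost here) could look like, recording both prediction accuracy and training/inference time; the feature and target column names are hypothetical placeholders, and the full pipeline lives under <repo>/benchmark/.

```python
# Single-run sketch: fit one regressor on hypothetical QoS features and a QoE target,
# then record accuracy (MAE) and training/inference time.
import time
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

df = pd.read_csv("datasets/training_2nd_dataset.csv")
qos_features = ["tcp_conn_time", "dns_time", "first_frame_time"]  # hypothetical names
qoe_target = "stall_rate"                                         # hypothetical name

data = df[qos_features + [qoe_target]].dropna()
X_train, X_test, y_train, y_test = train_test_split(
    data[qos_features], data[qoe_target], test_size=0.2, random_state=42)

model = XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)

start = time.time()
model.fit(X_train, y_train)
train_time = time.time() - start

start = time.time()
pred = model.predict(X_test)
infer_time = time.time() - start

print(f"MAE: {mean_absolute_error(y_test, pred):.4f}")
print(f"train: {train_time:.2f} s, inference: {infer_time:.2f} s")
```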
If you find this repo useful, please cite our paper.
@article{li2023demystifying,
title={Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet},
author={Li, Yanan and Deng, Guangqing and Bai, Changming and Yang, Jingyu and Wang, Gang and Zhang, Hao and Bai, Jin and Yuan, Haitao and Xu, Mengwei and Wang, Shangguang},
journal={Proceedings of the ACM on Management of Data},
volume={1},
number={4},
pages={1--29},
year={2023},
publisher={ACM New York, NY, USA}
}