Commit

fix reference trace and docs, sampler docs, loader docs, gitignore and numpy collision

Signed-off-by: JooYoung Park <[email protected]>

add finer-step reference trace

Signed-off-by: JooYoung Park <[email protected]>

added number of function suggestion to cores

Signed-off-by: JooYoung Park <[email protected]>

fix wordlist

Signed-off-by: JooYoung Park <[email protected]>

fixed docs

Signed-off-by: JooYoung Park <[email protected]>

add word to wordlist

Signed-off-by: JooYoung Park <[email protected]>

fix doc

Signed-off-by: JooYoung Park <[email protected]>

Update reference traces docs

Signed-off-by: Leonid Kondrashov <[email protected]>

Update docs/loader.md

Co-authored-by: Dmitrii Ustiugov <[email protected]>

resampled the reference traces that were sampled wrongly

Signed-off-by: JooYoung Park <[email protected]>
JooyoungPark73 committed Dec 5, 2023
1 parent b5eac1a commit 16d4712
Showing 9 changed files with 36 additions and 21 deletions.
4 changes: 3 additions & 1 deletion .github/configs/wordlist.txt
@@ -748,4 +748,6 @@ Lazar
Cvetkovic
cvetkovic
ethz
lazar
lazar
xvzf
untar
2 changes: 1 addition & 1 deletion .github/workflows/integration_tests.yaml
@@ -43,7 +43,7 @@ jobs:
- name: Drawing samples
run: |
tar -xzvf $tpath/inputs/preprocessed.tar.gz -C $tpath/inputs/
python -m sampler sample --source_trace $tpath/inputs/preprocessed --output $tpath/sampled --min-size 10 --step-size=10 --max-size=50
python -m sampler sample --source_trace $tpath/inputs/preprocessed --original_trace $tpath/inputs/preprocessed --output $tpath/sampled --min-size 10 --step-size=10 --max-size=50
# - name: Plotting results
# run: |
3 changes: 0 additions & 3 deletions .gitignore
@@ -5,10 +5,7 @@ analysis
tmp
data/out
data/azure
data/traces/*
!data/traces/example/
data/traces/reference/*/*.csv
!data/traces/reference/
pkg/generator/*.png
pkg/generator/*.txt
pkg/driver/*.csv
2 changes: 2 additions & 0 deletions data/traces/reference/.gitignore
@@ -0,0 +1,2 @@
*.csv
statistics
4 changes: 2 additions & 2 deletions data/traces/reference/sampled_150.tar.gz
Git LFS file not shown
4 changes: 4 additions & 0 deletions docs/loader.md
@@ -124,6 +124,10 @@ For to configure the workload for load generator, please refer to `docs/configur
There are a couple of constants that should not be exposed to the users. They can be examined and changed
in `pkg/common/constants.go`.

Sample sizes appropriate for performance evaluation vary depending on the platform.
As a starting point for fine-tuning, we suggest at most 5 functions per core with SMT disabled.
For example, 80 functions for a 16-core node. With larger sample sizes, trace replaying may lead to failures in function invocations.
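
The rule of thumb above reduces to a one-line calculation. The following is a minimal sketch, assuming a hypothetical helper named `suggested_max_functions` (it is not part of the loader) and the starting point of 5 functions per core quoted in the text:

```python
def suggested_max_functions(cores: int, functions_per_core: int = 5) -> int:
    """Return a starting upper bound on the sample size for one node.

    Hypothetical helper for illustration only; it is not part of the loader.
    The default of 5 functions per core is the starting point suggested in
    the documentation and assumes SMT is disabled.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return cores * functions_per_core


# A 16-core node with SMT disabled gives a starting point of 80 functions.
print(suggested_max_functions(16))
```

The result is only a starting point for fine-tuning; the appropriate value still depends on the platform.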

## Build the image for a synthetic function

The reason for existence of Firecracker and container version is because of different ports for gRPC server. Firecracker
33 changes: 21 additions & 12 deletions docs/sampler.md
@@ -12,7 +12,7 @@ git lfs install
cd sampler
git lfs fetch
git lfs checkout
pip install -r requirements.txt
pip install -r ../requirements.txt
```

## Pre-processing the original trace (mandatory)
@@ -91,9 +91,9 @@ monotonic load increase (in terms of resource usage) when sweeping the sample si
```console
python3 -m sampler sample -h

usage: sample [-h] -t path -o path [-min integer] [-st integer] [-max integer] [-tr integer]
usage: sample [-h] -t path -orig path -o path [-min integer] [-st integer] [-max integer] [-tr integer]

optional arguments:
options:
-h, --help show this help message and exit
-t path, --source_trace path
Path to trace to draw samples from
@@ -113,22 +113,31 @@ optional arguments:

## Reference traces

The reference traces are stored in `data/traces/reference` folder of this repository, as `preprocessed.tar.gz` and
`sampled.tar.gz` files stored in Git LFS.
The reference traces are stored in `data/traces/reference` folder of this repository, as `preprocessed_150.tar.gz` and
`sampled_150.tar.gz` files stored in Git LFS.

`preprocessed_150.tar.gz` contains the preprocessed traces for the original Azure trace for day 1, 09:00:00-11:30:00 (150
minutes total). 150 minutes trace captures approximately half of all functions from original Azure trace, but makes it
more suitable to run in shorter experiments (10 minutes - 2 hours).

`sampled_150.tar.gz` contains the sampled traces for preprocessed trace from `preprocessed_150.tar.gz`. Sample sizes are
10-200 functions with step 10, 200-3k with step 50, and 3k-24k with step 1k.

`preprocessed.tar.gz` contains the preprocessed traces for the original Azure trace for day 1, 09:00:00-11:30:00 (150
minutes total).
You can untar the tarballs with the following commands:

`sampled.tar.gz` contains the sampled traces for preprocessed trace from `preprocessed.tar.gz`. Sample sizes are 50-3k
functions with step 50 and 3k-24k with step 1k.
```console
tar -xvzf sampled_150.tar.gz
tar -xvzf preprocessed_150.tar.gz
```

The reference traces were obtained by running the following commands:

```console
python3 -m preprocess -t data/azure/ -o data/reference/preprocessed_150 -s 00:09:00 -dur 150
python3 -m sampler preprocess -t data/azure/ -o data/traces/reference/preprocessed_150 -s 00:09:00 -dur 150

python3 -m sample -t data/reference/preprocessed_150 -o data/reference/sampled_150 -min 3000 -st 1000 -max 24000 -tr 16
python3 -m sample -t data/reference/sampled_150/samples/3000 -o data/reference/sampled_150 -min 50 -st 50 -max 3000 -tr 16
python3 -m sampler sample -t data/traces/reference/preprocessed_150 -orig data/traces/reference/preprocessed_150 -o data/traces/reference/sampled_150 -min 3000 -st 1000 -max 24000 -tr 16
python3 -m sampler sample -t data/traces/reference/sampled_150/samples/3000 -orig data/traces/reference/preprocessed_150 -o data/traces/reference/sampled_150 -min 200 -st 50 -max 3000 -tr 16
python3 -m sampler sample -t data/traces/reference/sampled_150/samples/200 -orig data/traces/reference/preprocessed_150 -o data/traces/reference/sampled_150 -min 10 -st 10 -max 200 -tr 16
```
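
The three `sample` invocations above correspond to the three step ranges described for `sampled_150.tar.gz`. As a rough sketch, the resulting set of sample sizes can be enumerated as follows. The function name `reference_sample_sizes` is hypothetical, and the sketch assumes that both endpoints of each range are included and that sizes shared by two ranges (200 and 3000) appear once:

```python
def reference_sample_sizes() -> list[int]:
    """Enumerate the sample sizes described for the reference traces.

    Hypothetical helper for illustration; assumes inclusive range endpoints.
    """
    # (min, max, step) triples taken from the documented ranges.
    ranges = [(10, 200, 10), (200, 3000, 50), (3000, 24000, 1000)]
    sizes = set()
    for lo, hi, step in ranges:
        sizes.update(range(lo, hi + 1, step))
    return sorted(sizes)


sizes = reference_sample_sizes()
print(sizes[:5], "...", sizes[-3:])
```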

## Tools
4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,7 +1,7 @@
matplotlib==3.7.2
numpy==1.26.1
numpy==1.24.4
pandas==1.3.5
scipy==1.11.2
scipy==1.10.1
pytest==7.4.0
cloudpickle==2.2.1
seaborn==0.13.0
1 change: 1 addition & 0 deletions sampler/__main__.py
@@ -108,6 +108,7 @@ def main():
sample_parser.add_argument(
'-orig',
'--original_trace',
required=True,
metavar='path',
default=None,
help='Path to the Azure (or other original) trace files, required to maximize the derived sample\'s representativity (WD from the original trace)'
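
The effect of adding `required=True` can be seen in isolation with a small `argparse` sketch. This is not the sampler's actual parser: `build_parser` and the shortened help text are assumptions made for illustration, and only the `-orig`/`--original_trace` option is reproduced. Once an option is required, `argparse` exits with an error when it is missing, so a `default` value for it is never used.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stripped-down parser with one required option.

    Hypothetical stand-in for the sampler's `sample` sub-parser.
    """
    parser = argparse.ArgumentParser(prog="sample")
    parser.add_argument(
        "-orig",
        "--original_trace",
        required=True,  # argparse exits with status 2 if the option is absent
        metavar="path",
        help="Path to the original trace files",
    )
    return parser


# Supplying the option works as usual.
args = build_parser().parse_args(["-orig", "data/azure"])
print(args.original_trace)
```

Calling `build_parser().parse_args([])` instead prints a usage message and raises `SystemExit`, which is why the integration test and the documented commands in this commit now pass `--original_trace` (or `-orig`) explicitly.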
