added new configs

huggingface · Nov 24, 2023 · 5668117
1 parent 612fcbe
Showing 66 changed files with 32 additions and 4,000 deletions.
diff --git a/.gitignore b/.gitignore
@@ -168,6 +168,4 @@ data/
 version.txt
 
 actions-runner/
-experiments/
-examples/
-results/
+experiments/
diff --git a/examples/running-llamas/README.md b/examples/running-llamas/README.md
@@ -7,16 +7,16 @@ A set of benchmarks on Meta's LLaMA2's inference.
 You will need to install these quantization packages:
 
 ```bash
-pip install auto-gptq # or install it from source
+pip install auto-gptq 
 ```
 
 ## Running
 
 Then run these commands from this directory:
 
 ```bash
-optimum-benchmark --config-dir configs/ --config-name _base_ --multirun
-optimum-benchmark --config-dir configs/ --config-name gptq --multirun
+optimum-benchmark --config-dir configs/ --config-name fp16 --multirun
+optimum-benchmark --config-dir configs/ --config-name bnb-4bit --multirun
 ```
 
 This will create a folder called `experiments` with the results of the benchmarks with an inference `batch_size` ranging from 1 to 16 and an input `sequence_length` (prompt size) of 256.
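
Reviewer sketch (not part of the commit): the two commands added to the README differ only in `--config-name`, so they could be driven from one loop. The config names `fp16` and `bnb-4bit` and every flag come from the diff above; the `run_benchmarks` function and the `DRY_RUN` switch are hypothetical helpers introduced here so the loop can be previewed without `optimum-benchmark` installed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN is a hypothetical switch (not an optimum-benchmark option):
# "1" only prints the commands, "0" actually executes them.
DRY_RUN="${DRY_RUN:-1}"

run_benchmarks() {
  local config
  # Config names as listed in the updated README.
  for config in fp16 bnb-4bit; do
    local cmd=(optimum-benchmark --config-dir configs/ --config-name "$config" --multirun)
    if [ "$DRY_RUN" = "1" ]; then
      echo "${cmd[*]}"   # preview only
    else
      "${cmd[@]}"        # run the benchmark sweep for this config
    fi
  done
}

run_benchmarks
```

Running it with `DRY_RUN=0` from `examples/running-llamas/` would execute both sweeps in sequence; with the default it only echoes the two command lines.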

diff --git a/examples/running-llamas/artifacts/A100-80GB/forward_latency_plot.png b/examples/running-llamas/artifacts/A100-80GB/forward_latency_plot.png
diff --git a/examples/running-llamas/artifacts/A100-80GB/forward_memory_plot.png b/examples/running-llamas/artifacts/A100-80GB/forward_memory_plot.png
diff --git a/examples/running-llamas/artifacts/A100-80GB/full_report.csv b/examples/running-llamas/artifacts/A100-80GB/full_report.csv
diff --git a/examples/running-llamas/artifacts/A100-80GB/generate_memory_plot.png b/examples/running-llamas/artifacts/A100-80GB/generate_memory_plot.png
diff --git a/examples/running-llamas/artifacts/A100-80GB/generate_throughput_plot.png b/examples/running-llamas/artifacts/A100-80GB/generate_throughput_plot.png