This project focuses on optimizing quantization and covers uniform, non-uniform, and easy quantization (`easyquant`) methods. Each method aims to minimize the L2 norm of the difference between the original and quantized vectors while handling outliers effectively.
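To make the objective concrete, here is a minimal sketch of uniform quantization and the L2 error it is scored by. It is not the project's implementation: the helper names `uniform_quantize` and `l2_error`, the 4-bit setting, and the test vector with a single outlier are all assumptions chosen for illustration, and it uses only the standard library.

```python
import math
import random


def uniform_quantize(values, num_bits, lo=None, hi=None):
    """Map each value to the nearest of 2**num_bits evenly spaced levels in [lo, hi].

    Hypothetical helper for illustration; lo/hi default to the data range.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    num_levels = 2 ** num_bits
    if hi == lo or num_levels < 2:
        return [lo for _ in values]
    step = (hi - lo) / (num_levels - 1)
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)         # values outside the range saturate
        index = round((clipped - lo) / step)  # index of the nearest level
        quantized.append(lo + index * step)
    return quantized


def l2_error(original, quantized):
    """L2 norm of the element-wise difference between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, quantized)))


random.seed(0)
# Assumed test data: Gaussian values plus one large outlier.
vector = [random.gauss(0.0, 1.0) for _ in range(1000)] + [25.0]

full_range = uniform_quantize(vector, num_bits=4)
narrow_range = uniform_quantize(vector, num_bits=4, lo=-3.0, hi=3.0)
print("full range  :", l2_error(vector, full_range))
print("narrow range:", l2_error(vector, narrow_range))
```

Narrowing the range shrinks the step size for typical values but saturates the outlier, so its individual error grows. That trade-off is what outlier handling has to resolve.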
```
CUDA >= 11.0 (optional)
Python >= 3.6
Docker (optional)
```
git clone https://github.com/milad1378yz/quantization-optimization-exercise
cd quantization-optimization-exercise
Navigate to the `docker` directory and run the Docker container:
cd docker
docker-compose up
Attach to the container, and you’re ready to run the main script directly, with no need to install the dependencies from `requirements.txt`.
Once inside the Docker container, run the main script to start the quantization optimization process:
python src/main.py
The main script accepts various arguments to control vector size, quantization approach, device, and more. Here are some of the primary options:
- `--vector_size`: Size of the vector (`small` or `large`). Default is `small`.
- `--approach`: Quantization approach to use (`uniform`, `non-uniform`, `easyquant`, or `all`).
- `--num_bits`: Number of bits for quantization levels.
- `--device`: Device for optimization (`cpu` or `cuda`).
- `--use_multiprocessing`: Enable multiprocessing for large vectors.
For instance, to run the non-uniform quantization with a large vector, use:
python src/main.py --vector_size large --approach non-uniform --device cuda
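For readers who want to see how such options fit together, the following is a sketch of an `argparse` parser that mirrors the flags listed above. It is an illustration, not the contents of `src/main.py`: only the `small` default for `--vector_size` comes from this README, while the other defaults, and the choice to model `--use_multiprocessing` as a boolean switch, are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(
        description="Quantization optimization (illustrative sketch)"
    )
    parser.add_argument("--vector_size", choices=["small", "large"],
                        default="small", help="Size of the vector")
    parser.add_argument("--approach",
                        choices=["uniform", "non-uniform", "easyquant", "all"],
                        default="all",  # assumed default
                        help="Quantization approach to use")
    parser.add_argument("--num_bits", type=int,
                        default=4,  # assumed default
                        help="Number of bits for quantization levels")
    parser.add_argument("--device", choices=["cpu", "cuda"],
                        default="cpu",  # assumed default
                        help="Device for optimization")
    parser.add_argument("--use_multiprocessing", action="store_true",
                        help="Enable multiprocessing for large vectors")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--vector_size", "large", "--approach", "non-uniform", "--device", "cuda"]
)
print(args.vector_size, args.approach, args.device, args.use_multiprocessing)
```

Restricting each flag with `choices` makes a typo such as `--approach nonuniform` fail at parse time instead of partway through a run.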
To generate all of the results, make the helper script executable and run it:
chmod +x run_all.sh
./run_all.sh
After each run, the script saves results, including L2 norms and latencies, in the specified `save_dir`. Plots are generated to visualize quantization levels, distributions, and outlier handling.
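As a rough sketch of how the two reported metrics could be measured for a single run, the snippet below times a quantizer and computes the L2 norm of its error. The function `measure_run`, the stand-in quantizer `round_to_halves`, and the dictionary keys are hypothetical. The project's actual measurement code and the layout of its results file are not shown here.

```python
import math
import time


def measure_run(quantize_fn, vector):
    """Return the L2 norm of the quantization error and the wall-clock latency."""
    start = time.perf_counter()
    quantized = quantize_fn(vector)
    latency_s = time.perf_counter() - start
    l2_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, quantized)))
    return {"l2_norm": l2_norm, "latency_s": latency_s}


def round_to_halves(vector):
    # Stand-in quantizer for illustration only: snap each value to the nearest 0.5.
    return [round(v * 2) / 2 for v in vector]


result = measure_run(round_to_halves, [0.1, 0.6, 1.2, -0.4])
print(result)
```

`time.perf_counter()` is used because it is a monotonic, high-resolution clock intended for measuring short intervals.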
The `src/results_processor.py` script processes the generated `results.txt` and creates a summary image of quantization performance. It accepts two options:
- `--input_file`: Path to the `results.txt` file. Default is `results/results.txt`.
- `--output_image`: Path to save the output image. Default is `results/quantization_performance_comparison.png`.
To run the analysis:
python src/results_processor.py --input_file results/results.txt --output_image results/quantization_performance_comparison.png
The following image shows a sample output generated by the analysis script: