How to build

Build on ICX & CLX

First edit CMakeLists.txt and enable the definitions you need:

# Enable AVX512_FP16 optimization
add_definitions(-DAVX512_FP32_WEIGHT_ONLY_FP16=true)
# add_definitions(-DAVX512_FP16_WEIGHT_ONLY_FP16=true)
# add_definitions(-DAVX512_BF16_WEIGHT_ONLY_BF16=true)
add_definitions(-DAVX512_FP32_WEIGHT_ONLY_INT8=true)
# add_definitions(-DAVX512_FP16_WEIGHT_ONLY_INT8=true)
# add_definitions(-DDEBUG=true)
# add_definitions(-DSTEP_BY_STEP_ATTN=true)
# add_definitions(-DUSE_MKLML=true)
# add_definitions(-DTIMELINE=true)
# add_definitions(-DUSE_SHM=true)

$ mkdir build && cd build
$ cmake -DBUILD_WITH_SHARED_LIBS=ON ..
$ make -j
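
Which of the definitions in CMakeLists.txt are usable depends on the instruction sets the build machine supports. The following is a minimal sketch, not part of the project, that reads the CPU flags from /proc/cpuinfo on Linux and prints the matching add_definitions lines. The mapping from each macro prefix to a cpuinfo flag (avx512f, avx512_fp16, avx512_bf16) is an assumption inferred from the macro names, and the helper names parse_flags and usable_definitions are hypothetical.

```python
# Assumed mapping from macro prefix to the Linux /proc/cpuinfo flag it needs.
REQUIRED_FLAG = {
    "AVX512_FP32": "avx512f",
    "AVX512_FP16": "avx512_fp16",
    "AVX512_BF16": "avx512_bf16",
}

# Definition names as they appear in CMakeLists.txt.
DEFINITIONS = [
    "AVX512_FP32_WEIGHT_ONLY_FP16",
    "AVX512_FP16_WEIGHT_ONLY_FP16",
    "AVX512_BF16_WEIGHT_ONLY_BF16",
    "AVX512_FP32_WEIGHT_ONLY_INT8",
    "AVX512_FP16_WEIGHT_ONLY_INT8",
]


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def usable_definitions(flags):
    """Return the definitions whose assumed required flag is present."""
    usable = []
    for name in DEFINITIONS:
        prefix = "_".join(name.split("_")[:2])  # e.g. "AVX512_FP32"
        if REQUIRED_FLAG[prefix] in flags:
            usable.append(name)
    return usable


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = parse_flags(f.read())
    except OSError:
        flags = set()  # not Linux, or /proc is unavailable
    for name in usable_definitions(flags):
        print(f"add_definitions(-D{name}=true)")
```

The output is only a suggestion: which definitions to enable is still a choice made in CMakeLists.txt.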

Build Options

※ Every time you rebuild the project, use the following sequence to make sure the build is correct: make clean && cmake .. && make -j

(1) To build the unit tests, pass -DXFT_BUILD_TESTS=ON, e.g. cmake .. -DXFT_BUILD_TESTS=ON.
(2) To build in debug mode, pass -DCMAKE_BUILD_TYPE=Debug, e.g. cmake .. -DCMAKE_BUILD_TYPE=Debug. This compiles with -O0 instead of -O2 and enables -g.
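
The clean rebuild sequence and the two options can be combined in one place. This is a minimal sketch that assembles the commands and, optionally, runs them from inside an existing build directory. The function names rebuild_commands and run_rebuild are hypothetical. It assumes the build directory has already been configured once, since make clean needs an existing Makefile.

```python
import subprocess


def rebuild_commands(build_tests=False, debug=False):
    """Return the clean-rebuild command sequence as a list of argument lists."""
    cmake = ["cmake", ".."]
    if build_tests:
        cmake.append("-DXFT_BUILD_TESTS=ON")      # build the unit tests
    if debug:
        cmake.append("-DCMAKE_BUILD_TYPE=Debug")  # -O0 and -g
    return [["make", "clean"], cmake, ["make", "-j"]]


def run_rebuild(build_dir="build", **options):
    """Run each command inside build_dir, stopping at the first failure."""
    for cmd in rebuild_commands(**options):
        subprocess.run(cmd, cwd=build_dir, check=True)
```

For example, run_rebuild("build", build_tests=True, debug=True) would perform a debug rebuild that also builds the unit tests.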

Create python whl package

Prepare env

torch
python

Build xfastertransformer.so

# cd <root_directory>
mkdir build && cd build
cmake ..
make -j

Create whl package

# cd <root_directory>
python3 setup.py bdist_wheel --verbose 

# add tag
python setup.py egg_info --tag-build="avx512+fp32" bdist_wheel --verbose

Convert models

python -c 'import xfastertransformer as xft; xft.LlamaConvert().convert("/data/llama-2-7b-chat", "/data/llama-2-7b-chat-xft", "fp16")'
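
The same conversion can be written as a small script, which avoids shell quoting problems and allows a check on the input path. This is a minimal sketch: the wrapper name convert_llama is hypothetical, and it assumes the converter method is named convert and takes the input directory, output directory and data type in that order, so verify this against the installed version.

```python
import os


def convert_llama(hf_dir, out_dir, dtype="fp16"):
    """Convert a Hugging Face Llama checkpoint directory to the xFT format."""
    if not os.path.isdir(hf_dir):
        raise FileNotFoundError(f"model directory not found: {hf_dir}")
    # Imported here so the path check runs even if the wheel is not installed yet.
    import xfastertransformer as xft

    # Assumed call: convert(input_dir, output_dir, dtype).
    xft.LlamaConvert().convert(hf_dir, out_dir, dtype)


# Example call with the paths used above:
# convert_llama("/data/llama-2-7b-chat", "/data/llama-2-7b-chat-xft", "fp16")
```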
