ggml : rewrite silu and softmax for cpu · Nexesenex/croco.cpp@d7359a3

Commit

ggml : rewrite silu and softmax for cpu

This change upstreams llamafile's vectorized expf() functions. This lets
us compute softmax and silu more accurately than the short[65536] lookup
table that GGML previously used to speed up these operations. aarch64
and SSE2+ are supported, with a worst-case rounding error of 2 ULP. As
measured by

    make -j8 tests && ./tests/test-backend-ops -o SOFT_MAX -b CPU perf

softmax runs 1.5x faster on SSE2+FMA, 1.9x faster on AVX2+FMA, and 2.1x
faster on AVX512.


jart committed May 10, 2024

1 parent f98eb31 commit d7359a3
