Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LoopVectorization.jl causing segfaults on 1.11 #525

Closed
maleadt opened this issue Jan 10, 2024 · 5 comments
Closed

LoopVectorization.jl causing segfaults on 1.11 #525

maleadt opened this issue Jan 10, 2024 · 5 comments

Comments

@maleadt
Copy link

maleadt commented Jan 10, 2024

LoopVectorization.jl's generated IR seems to cause segfaults on 1.11, as observed on PkgEval with at least 6 packages (MCPhylo,jl, LocalPoly.jl, VectorizedReduction.jl, NaNStatistics.jl, TimeSeriesClassification.jl, PlmDCA.jl). See this report for details: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/2cbecf4_vs_18b4f3f/report.html

@chriselrod I'm opening a new issue because #518 was closed, and to list all issues in case somebody wants to tackle this.


Some of the errors that I've encountered:

An LLVM assertion, as seen with MCPhylo.jl (requires assertions build of Julia):

julia: /workspace/srcdir/llvm-project/llvm/lib/IR/Instructions.cpp:2561: void llvm::InsertValueInst::init(llvm::Value*, llvm::Value*, llvm::ArrayRef<unsigned int>, const llvm::Twine&): Assertion `ExtractValueInst::getIndexedType(Agg->getType(), Idxs) == Val->getType() && "Inserted value must match indexed type!"' failed.

[177] signal 6 (-6): Aborted
in expression starting at /home/pkgeval/.julia/packages/MCPhylo/KWPlY/test/distributions/phylodist.jl:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fbeac0f040e)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZN4llvm15InsertValueInst4initEPNS_5ValueES2_NS_8ArrayRefIjEERKNS_5TwineE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
InsertValueInst at /source/usr/include/llvm/IR/Instructions.h:2640 [inlined]
Create at /source/usr/include/llvm/IR/Instructions.h:2565 [inlined]
CreateInsertValue at /source/usr/include/llvm/IR/IRBuilder.h:2343
emit_new_struct at /source/src/cgutils.cpp:3870
emit_new_struct at /source/src/julia.h:1704
emit_expr at /source/src/codegen.cpp:5945
emit_ssaval_assign at /source/src/codegen.cpp:5367
emit_stmtpos at /source/src/codegen.cpp:5642 [inlined]
emit_function at /source/src/codegen.cpp:8810
jl_emit_code at /source/src/codegen.cpp:9144
jl_emit_codeinst at /source/src/codegen.cpp:9227
_jl_compile_codeinst at /source/src/jitlayers.cpp:220
jl_generate_fptr_impl at /source/src/jitlayers.cpp:525
jl_compile_method_internal at /source/src/gf.c:2509 [inlined]
jl_compile_method_internal at /source/src/gf.c:2397
_jl_invoke at /source/src/gf.c:2912 [inlined]
ijl_apply_generic at /source/src/gf.c:3097
logpdf at /home/pkgeval/.julia/packages/MCPhylo/KWPlY/src/distributions/Phylodist.jl:118

A segfault during vload, as seen with NaNStatistics.jl and PlmDCA.jl:

[60] signal 11 (2): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/test/testArrayStats.jl:80
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/llvm_intrin/memory_addr.jl:987 [inlined]
__vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/llvm_intrin/memory_addr.jl:987 [inlined]
_vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/strided_pointers/stridedpointers.jl:95 [inlined]
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:60 [inlined]
_vload_unroll at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:535 [inlined]
_vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:771 [inlined]
macro expansion at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_turbo_! at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:344
__nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:308 [inlined]
#nanmean#5 at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:307 [inlined]
nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:307

A segfault during vadd_fast as seen with VectorizedReductions.jl:

[12] signal 11 (2): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/test/reduce.jl:4
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/llvm_intrin/binary_ops.jl:31 [inlined]
vadd_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/llvm_intrin/binary_ops.jl:31 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
vadd_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:111 [inlined]
add_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/base_defs.jl:91 [inlined]
macro expansion at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_turbo_! at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
macro expansion at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:236 [inlined]
vvmapreduce at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:231
vvreduce at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:147

The source of bad IR hasn't been fully determined yet, but it seems to be the Expr(:new) that's generated to pass structs by value instead of by reference: JuliaLang/julia#52702 (comment).


Deprecating LoopVectorization.jl isn't possible, because:

So the only solution forwards seems fixing LoopVectorization.jl. I've taken a first attempt at it in #523, but just removing the Expr(:new) optimization isn't sufficient, and there's other issues (see above).

@chriselrod
Copy link
Member

chriselrod commented Jan 10, 2024

A few people expressed disappointment at deprecating this library.
If one of them can fix these issues, they're welcome to take over maintenance.

Otherwise, we should proceed on the deprecation plan.
Note that both RecursiveFactorization and TriangularSolve.jl depend heavily on LV; @inbounds @fastmath performs many times worse to the point of making these libraries go from being faster than BLAS below matrices that are 200x200 or so, to many times slower.
Thus they could be deprecated with it, or use explicit kernels.

Unfortunately, RFLU is still much faster than MKL below the 200x200 point on many people's computers
SciML/LinearSolve.jl#357
But we could replace the @turbo uses with manual kernels. There are only three things these libraries do with it:

  1. matrix multiply
  2. SIMD argmax
  3. faster tail handling for some simple loops

This stack is a major contributor to compile times, and does not precompile well: it is a frequent source of bugs where precompiled code allocates, but recompiling it in a running REPL with Revise (through making meaningless changes) make those allocations go away.
At JuliaCon, I was told that Clima people actually sited LoopVectorization.jl as why they're not using the SciML ecosystem, and was asked if we could get rid of it. I'm not affiliated with them in any way that I know of, so I don't see any reason to appease them, just pointing out that there are different perspectives on these tradeoffs.

@brenhinkeller
Copy link
Contributor

brenhinkeller commented Jan 17, 2024

I'm slightly tempted to try to take over maintenance, but I fear that if the maintenance burden has become unsustainable for you it'll just be that much more unsustainable for me.

What was the Clima folks' apparent objection, out of curiosity?

@chriselrod
Copy link
Member

I'm slightly tempted to try to take over maintenance, but I fear that if the maintenance burden has become unsustainable for you it'll just be that much more unsustainable for me.

What was the Clima folks' apparent objection, out of curiosity?

It really shouldn't take much effort.
My suggestion, if the problem is limited to the Array case because of the Memory change, is to add special casing for Array where they get turned into PtrArrays. You'd then also need to make sure they get GC.@preserved.

@schrimpf
Copy link

schrimpf commented May 1, 2024

I believe the segfaults in the tests from VectorizedReductions.jl and NaNStatistics.jl are caused by reducing over empty collections which leads to code along the lines of

A = Float64[]
out = zero(eltype(A))
@turbo for i in eachindex(A)
    out += A[i]
end

LoopVectorization warns users not to do this, so the problem lies with VectorizedReductions and NaNStatistics (here it's the weighted versions of each statistic that has problems, unweighted stats correctly add the check_empty=true flag).

I have not verified it, but I suspect PlmDCA has a similar problem. A quick search of the code reveals that it checks for empty collections in some places, but not others.

@chriselrod
Copy link
Member

check_empty=true/false didn't change between early Julia versions and 1.10.
We could make check_empty=true the default.

Closing this because tests pass on 1.11, while check_empty=false should've also caused segfaults in older Julia versions. https://github.com/JuliaSIMD/LoopVectorization.jl/actions/runs/8946228352

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants