Broadcasting on arrays larger than typemax(Int32) yields truncation error #2658

Closed
b-fg opened this issue Feb 14, 2025 · 2 comments · Fixed by #2666
Labels
bug Something isn't working

Comments


b-fg commented Feb 14, 2025

Describe the bug

When broadcasting on arrays with more than typemax(Int32) elements, a truncation error is thrown:

Error

ERROR: InexactError: trunc(Int32, 2254857829)
Stacktrace:
  [1] throw_inexacterror(::Symbol, ::Vararg{Any})
    @ Core ./boot.jl:750
  [2] checked_trunc_sint
    @ ./boot.jl:764 [inlined]
  [3] toInt32
    @ ./boot.jl:801 [inlined]
  [4] Int32
    @ ./boot.jl:891 [inlined]
  [5] convert
    @ ./number.jl:7 [inlined]
  [6] cconvert
    @ ./essentials.jl:687 [inlined]
  [7] macro expansion
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/utils/call.jl:222 [inlined]
  [8] macro expansion
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:5139 [inlined]
  [9] #735
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/utils/call.jl:35 [inlined]
 [10] check
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:35 [inlined]
 [11] cuOccupancyMaxPotentialBlockSize
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/utils/call.jl:34 [inlined]
 [12] launch_configuration(fun::CuFunction; shmem::Int64, max_threads::Int64)
    @ CUDA ~/.julia-acc/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:61
 [13] launch_configuration
    @ ~/.julia-acc/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:56 [inlined]
 [14] (::KernelAbstractions.Kernel{…})(::CuArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
    @ CUDA.CUDAKernels ~/.julia-acc/packages/CUDA/1kIOw/src/CUDAKernels.jl:107
 [15] _copyto!
    @ ~/.julia-acc/packages/GPUArrays/uiVyU/src/host/broadcast.jl:71 [inlined]
 [16] materialize!
    @ ~/.julia-acc/packages/GPUArrays/uiVyU/src/host/broadcast.jl:38 [inlined]
 [17] materialize!(dest::CuArray{Float32, 1, CUDA.DeviceMemory}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(identity), Tuple{Float32}})
    @ Base.Broadcast ./broadcast.jl:875
 [18] top-level scope
    @ REPL[3]:1
Some type information was truncated. Use `show(err)` to see complete types.
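
For reference, the value being truncated, 2254857829, is exactly the length of the array in the reproducer below (floor(Int, typemax(Int32) * 1.05)); it does not fit in an Int32, so the conversion performed while preparing the cuOccupancyMaxPotentialBlockSize ccall has to fail. A minimal check (plain Julia, no GPU needed):

julia> n = floor(Int, typemax(Int32) * 1.05)
2254857829

julia> n > typemax(Int32)   # typemax(Int32) == 2147483647
true

julia> Int32(n)             # the conversion the ccall wrapper performs (frames [4]-[6] above)
ERROR: InexactError: trunc(Int32, 2254857829)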

To reproduce

using CUDA
a = CUDA.rand(Float32, floor(Int, typemax(Int32)*1.05));
a .= zero(Float32);
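
For reference, the buffer requested above is already larger than 8 GiB (a quick size check, no GPU needed):

n = floor(Int, typemax(Int32) * 1.05)    # 2254857829 elements
n * sizeof(Float32) / 2^30               # ≈ 8.4 GiB for the Float32 buffer alone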

Note that a GPU with more than 8 GB of memory is required to reproduce this. I am using CUDA.jl v5.6.1:

(CUDAtest) pkg> st
Project CUDAtest v0.1.0
Status `/gpfs/home/delf/delf428444/CUDAtest/Project.toml`
  [052768ef] CUDA v5.6.1

Expected behavior

No truncation error should appear when broadcasting, even for arrays with more than typemax(Int32) elements.
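
Until this is fixed, a possible workaround is to keep each individual broadcast below typemax(Int32) elements by assigning through contiguous views, so each launch configuration only ever sees an element count that fits in an Int32. This is an untested sketch; chunked_fill! and the 2^30 chunk size are made up for illustration and are not CUDA.jl API:

using CUDA

# Sketch of a workaround: broadcast over chunks whose length fits in an Int32.
function chunked_fill!(a::CuArray, val; chunk::Int = 2^30)
    for start in 1:chunk:length(a)
        stop = min(start + chunk - 1, length(a))
        view(a, start:stop) .= val   # each launch covers at most `chunk` elements
    end
    return a
end

a = CUDA.rand(Float32, floor(Int, typemax(Int32) * 1.05));
chunked_fill!(a, zero(Float32));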

Version info
Julia

Julia Version 1.11.2
Commit 5e9a32e7af2 (2024-12-01 20:02 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 160 × Intel(R) Xeon(R) Platinum 8460Y+
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, sapphirerapids)
Threads: 160 default, 0 interactive, 80 GC (on 160 virtual cores)
Environment:
  JULIA_MPI_BINARY = system
  JULIA_DEPOT_PATH = /home/delf/delf428444/.julia-acc
  LD_LIBRARY_PATH =
  JULIA_NUM_THREADS = auto

CUDA.jl

CUDA runtime 12.6, artifact installation
CUDA driver 12.6
NVIDIA driver 535.86.10

CUDA libraries:
- CUBLAS: 12.6.4
- CURAND: 10.3.7
- CUFFT: 11.3.0
- CUSOLVER: 11.7.1
- CUSPARSE: 12.5.4
- CUPTI: 2024.3.2 (API 24.0.0)
- NVML: 12.0.0+535.86.10

Julia packages:
- CUDA: 5.6.1
- CUDA_Driver_jll: 0.10.4+0
- CUDA_Runtime_jll: 0.15.5+0

Toolchain:
- Julia: 1.11.2
- LLVM: 16.0.6

4 devices:
  0: NVIDIA H100 (sm_90, 54.390 GiB / 63.718 GiB available)
  1: NVIDIA H100 (sm_90, 63.424 GiB / 63.718 GiB available)
  2: NVIDIA H100 (sm_90, 63.424 GiB / 63.718 GiB available)
  3: NVIDIA H100 (sm_90, 63.422 GiB / 63.718 GiB available)
b-fg added the bug label on Feb 14, 2025

b-fg commented Feb 14, 2025

Probably related: #1968


b-fg commented Feb 19, 2025

Thanks!
