0.15.5

athas released this 23 Apr 16:16

Added

  • reduce_by_index with f32-addition is now approximately 2x
    faster in the CUDA backend.
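
For readers unfamiliar with `reduce_by_index`, the sketch below models its documented semantics sequentially: each value is combined into the destination slot named by its index, and out-of-bounds indices are ignored. This is an illustrative Python reference only, not Futhark code and not how the CUDA backend implements it. The name `reduce_by_index_ref` and the sample data are hypothetical.

```python
from operator import add


def reduce_by_index_ref(dest, f, ne, indices, values):
    """Sequential reference model of a generalized histogram.

    dest    -- initial contents of the output buckets
    f       -- combining operator (assumed associative and commutative)
    ne      -- neutral element of f; a parallel implementation needs it to
               initialise partial results, this sequential model does not
    indices -- bucket index for each value
    values  -- values to combine into the buckets
    """
    out = list(dest)  # copy so the caller's list is left untouched
    for i, v in zip(indices, values):
        if 0 <= i < len(out):  # out-of-bounds indices are skipped
            out[i] = f(out[i], v)
    return out


# Floating-point addition into 4 buckets, the case this release speeds up.
hist = reduce_by_index_ref(
    [0.0] * 4, add, 0.0,
    [0, 2, 2, 5, 1],            # index 5 is out of bounds and ignored
    [1.0, 2.0, 3.0, 4.0, 0.5],
)
print(hist)  # [1.0, 0.5, 5.0, 0.0]
```

Because several values can target the same bucket, a parallel backend has to resolve those conflicting updates, which is why the choice of operator and element type (here f32 addition) affects performance.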

Fixed

  • Fixed kernel extractor bug in if-interchange (#921).

  • Fixed some cases of malformed kernel code generation (#922).

  • Fixed rare memory corruption bug involving branches returning
    arrays (#923).

  • Fixed spurious warning about entry points involving opaque return
    types, where the type annotations are put on a higher-order return
    type.

  • Fixed incorrect size type checking for sum types in negative
    position with unknown constructors (#927).

  • Fixed loop interchange for permuted sequential loops with more
    than one outer parallel loop (#928).

  • Fixed a type checking bug for branches returning incomplete sum
    types (#931).