Releases: diku-dk/futhark

0.12.3

09 Nov 18:37

Added

  • Character literals can now be any integer type.

  • The integer modules now have popc and clz functions.

  • Tweaked inlining so that larger programs may now compile faster
    (observed about 20%).

  • Pattern-matching on large sum-typed values taken from arrays may
    be a bit faster.
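As a rough illustration of what the new popc and clz functions compute, here is a reference model in Python (not Futhark code). The 32-bit default width and the result for a zero argument are assumptions of this sketch; the release notes do not specify them.

```python
def popc(x: int, bits: int = 32) -> int:
    """Population count: number of set bits in the low `bits` bits of x."""
    mask = (1 << bits) - 1
    return bin(x & mask).count("1")


def clz(x: int, bits: int = 32) -> int:
    """Count leading zeros of x viewed as a `bits`-wide word.

    In this model a zero argument yields `bits`; that choice is an assumption.
    """
    mask = (1 << bits) - 1
    return bits - (x & mask).bit_length()


print(popc(0b1011))  # 3
print(clz(1))        # 31
```

Masking first means negative inputs are treated as their two's complement bit pattern, so popc(-1) counts all 32 bits in this model.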

Fixed

  • Various small fixes to type errors.

  • All internal functions used in generated C code are now properly
    declared static.

  • Fixed bugs when handling dimensions and aliases in type ascriptions.

0.12.2

18 Oct 17:48

Added

  • New tool: futhark autotune, for tuning the threshold parameters
    used by incremental flattening. Based on work by Svend Lund
    Breddam, Simon Rotendahl, and Carl Mathias Graae Larsen.

  • New tool: futhark dataget, for extracting test input data. Most
    will probably never use this.

  • Programs compiled with the cuda backend now take options
    --default-group-size, --default-num-groups, and
    --default-tile-size.

  • Segmented reduce_by_index is now substantially faster for small
    histograms.

  • New functions: f32.lerp and f64.lerp, for linear interpolation.
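For readers unfamiliar with linear interpolation, a minimal Python sketch of the usual formula follows. The argument order (start value, end value, interpolation factor) is an assumption of this sketch, not something the release notes state.

```python
def lerp(v0: float, v1: float, t: float) -> float:
    """Linear interpolation: returns v0 at t = 0 and v1 at t = 1."""
    return v0 + (v1 - v0) * t


print(lerp(0.0, 10.0, 0.25))  # 2.5
```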

Fixed

  • Fixes to aliasing of record updates.

  • Fixed unnecessary array duplicates after coalescing optimisations.

  • reduce_by_index nested in maps will no longer sometimes
    require huge amounts of memory.

  • Source location now correct for unknown infix operators.

  • Function parameters are no longer in scope of themselves (#798).

  • Fixed a nasty out-of-bounds error in handling of irregular allocations.

  • The floor/ceil functions in f32/f64 now handle infinities
    correctly (and are also faster).

  • Using % on floats now computes fmod instead of crashing the compiler.
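To illustrate the last item: assuming fmod here means the C-style function, the result takes the sign of the dividend. Python's math.fmod follows that convention, whereas Python's own % operator does not, which makes the difference easy to see.

```python
import math

# C-style fmod: the result has the same sign as the first argument.
print(math.fmod(7.5, 2.0))   # 1.5
print(math.fmod(-7.5, 2.0))  # -1.5

# Python's % follows the sign of the divisor instead, shown for contrast.
print(-7.5 % 2.0)            # 0.5
```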

0.12.1

21 Aug 12:42

Added

  • The internal representation of parallel constructs has been
    overhauled and many optimisations rewritten. The overall
    performance impact should be neutral in aggregate, but there may
    be changes for some programs (please report if so).

  • Futhark now supports structurally typed sum types and pattern
    matching! This work was done by Robert Schenck. There remain
    some problems with arrays of sum types that themselves contain
    arrays.

  • Significant reduction in compile time for some large programs.

  • Manually specified type parameters need no longer be exhaustive.

  • Mapped rotate is now simplified better. This can be
    particularly helpful for stencils with wraparound.

Removed

  • The ~ prefix operator has been removed. ! has been extended
    to perform bitwise negation when applied to integers.

Changed

  • The --futhark option for futhark bench and futhark test now
    defaults to the binary being used for the subcommands themselves.

  • The legacy futhark -t option (which did the same as futhark check) has been removed.

  • Lambdas now bind less tightly than type ascription.

  • stream_map is now map_stream and stream_red is now
    reduce_stream.

Fixed

  • futhark test now understands --no-tuning as it was always
    supposed to.

  • futhark bench and futhark test now interpret --exclude in
    the same way.

  • The Python and C# backends can now properly read binary boolean
    input.

0.11.2

28 Jun 08:17

Fixed

  • Entry points whose types are opaque due to module ascription, yet
    whose representation is simple (scalars or arrays of scalars) were
    mistakenly made non-opaque when compiled with --library. This
    has been fixed.

  • The CUDA backend now supports default sizes in .tuning files.

  • Loop interchange across multiple dimensions was broken in some cases (#767).

  • The sequential C# backend now generates code that compiles (#772).

  • The sequential Python backend now generates code that runs (#765).

0.11.1

08 Jun 15:38

Added

  • Segmented scans are a good bit faster.

  • reduce_by_index has received a new implementation that uses
    local memory, and is now often a good bit faster when the target
    array is not too large.

  • The f32 and f64 modules now contain gamma and lgamma
    functions. At present these do not work in the C# backend.

  • Some instances of reduce with vectorised operators (e.g. map2 (+)) are orders of magnitude faster than before.

  • Memory usage is now lower on some programs (specifically the ones
    that have large maps with internal intermediate arrays).
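Since reduce_by_index appears in several of these notes, here is a sequential Python model of a generalised histogram as a rough guide to what it computes. The parameter order and the rule that out-of-bounds indices are ignored are assumptions of this sketch, and it says nothing about how the parallel implementation works.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")


def reduce_by_index(dest: Sequence[A],
                    op: Callable[[A, A], A],
                    ne: A,
                    indices: Sequence[int],
                    values: Sequence[A]) -> List[A]:
    """Combine each value into the bucket named by its index using `op`.

    `ne` is the neutral element of `op`. This sequential model never needs
    it; a parallel implementation would use it to initialise partial results.
    """
    out = list(dest)
    for i, v in zip(indices, values):
        if 0 <= i < len(out):  # assumption: out-of-bounds indices are skipped
            out[i] = op(out[i], v)
    return out


# Histogram of bucket indices; the index 5 falls outside the 3 buckets.
print(reduce_by_index([0, 0, 0], lambda a, b: a + b, 0,
                      [0, 2, 2, 5], [1, 1, 1, 1]))  # [1, 0, 2]
```

The result is only well defined in parallel when the operator is associative and commutative, which is why addition is used in the example.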

Removed

  • Size parameters (not annotations) are no longer permitted
    directly in let and loop bindings, nor in lambdas. You are
    likely not affected (except for the stream constructs; see
    below). Few people used this.

Changed

  • The array creation functions exported by generated C code now take
    int64_t arguments for the shape, rather than int. This is in
    line with what the shape functions return.

  • The types for stream_map, stream_map_per, stream_red, and
    stream_red_per have been changed, such that the chunk function
    now takes the chunk size as the first argument.

Fixed

  • Fixes to reading values under Python 3.

  • The type of a variable can now be deduced from its use as a size
    annotation.

  • The code generated by the C-based backends is now also compilable
    as C++.

  • Fix memory corruption bug that would occur on very large segmented
    reductions (large segments, and many of them).

0.10.2

10 Apr 13:36

Added

  • reduce_by_index is now a good bit faster on operators whose
    arguments are two 32-bit values.

  • The type checker warns on size annotations for function parameters
    and return types that will not be visible from the outside,
    because they refer to names nested inside tuples or records. For
    example, the function

    let f (n: i32, m: i32): [n][m]i32 = ...
    

    will cause such a warning. It should instead be written

    let f (n: i32) (m: i32): [n][m]i32 = ...
    
  • A new library function
    futhark_context_config_select_device_interactively() has been
    added.

Fixed

  • Fix reading and writing of binary files for C-compiled executables
    on Windows.

  • Fixed a couple of overly strict internal sanity checks related to
    in-place updates (#735, #736).

  • Fixed a couple of convoluted defunctorisation bugs (#739).

0.10.1

26 Mar 10:37

Added

  • Using definitions from the intrinsic module outside the prelude
    now results in a warning.

  • reduce_by_index with vectorised operators (e.g. map2 (+)) is
    orders of magnitude faster than before.

  • Executables generated with the pyopencl backend now support the
    options --default-tile-size, --default-group-size,
    --default-num-groups, --default-threshold, and --size.

  • Executables generated with c and opencl now print a help text
    if run with invalid options. The py and pyopencl backends
    already did this.

  • Generated executables now support a --tuning flag for passing
    many tuned sizes in a file.

  • Executables generated with the cuda backend now take an
    --nvrtc-option option.

  • Executables generated with the opencl backend now take a
    --build-option option.

Removed

  • The old futhark-* executables have been removed.

Changed

  • If an array is passed for a function parameter of a polymorphic
    type, all arrays passed for parameters of that type must have the
    same shape. For example, given a function

    let pair 't (x: t) (y: t) = (x, y)
    

    the application pair [1] [2,3] will now fail at run-time.

  • futhark test now numbers un-named data sets from 1 rather than
    0. This only affects the text output and the generated JSON
    files, and fits the tuple element ordering in Futhark.

  • String literals are now of type []u8 and contain UTF-8 encoded
    bytes.
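To make the string literal change concrete, this Python sketch shows the byte values a literal would correspond to under UTF-8 encoding. The helper name string_literal_bytes is hypothetical and exists only for illustration.

```python
from typing import List


def string_literal_bytes(s: str) -> List[int]:
    """Return the UTF-8 bytes of s as a list of values in the range 0..255."""
    return list(s.encode("utf-8"))


print(string_literal_bytes("abc"))  # [97, 98, 99]
print(string_literal_bytes("æ"))    # [195, 166]
```

Note that a non-ASCII character occupies more than one array element, so the length of the array can exceed the number of characters.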

Fixed

  • A significant problematic interaction between empty arrays and
    inner size declarations has been closed (#714). This follows a
    range of lesser empty-array fixes from 0.9.1.

  • futhark datacmp now prints to stdout, not stderr.

  • Fixed a major potential out-of-bounds access when sequentialising
    reduce_by_index (in most cases the bug was hidden by subsequent
    C compiler optimisations).

  • The result of an anonymous function is now also forbidden from
    aliasing a global variable, just as with named functions.

  • Parallel scans now work correctly when using a CPU OpenCL
    implementation.

  • reduce_by_index was broken on newer NVIDIA GPUs when using fancy
    operators. This has been fixed.

0.9.1

08 Feb 16:07

Added

  • futhark cuda: a new CUDA backend by Jakob Stokholm Bertelsen.

  • New command for comparing data files: futhark datacmp.

  • An :mtype command for futhark repl that shows the type of a
    module expression.

  • futhark run takes a -w option for disabling warnings.

Changed

  • Major command reorganisation: all Futhark programs have been
    combined into a single all-powerful futhark program. Instead of
    e.g. futhark-foo, use futhark foo. Wrappers will be kept
    around under the old names for a little while. futharki has
    been split into two commands: futhark repl and futhark run.
    Also, py has become python and cs has become csharp, but
    pyopencl and csopencl have remained as they were.

  • The result of a function is now forbidden from aliasing a global
    variable. Surprisingly little code is affected by this.

  • A global definition may not be ascribed a unique type. This never
    had any effect in the first place, but now the compiler will
    explicitly complain.

  • Source spans are now printed in a slightly different format, with
    the ending line number omitted when it is the same as the start
    line number.

Fixed

  • futharki now reports source locations of trace expressions
    properly.

  • The type checker now properly complains if you try to define a
    type abbreviation that has unused size parameters.

0.8.1

25 Dec 11:02

Added

  • Now warns when /futlib/... files are redundantly imported.

  • futharki now prints warnings for files that are ":load"ed.

  • The compiler now warns when entry points are declared with types
    that will become unnamed and opaque, and thus impossible to
    provide from the outside.

  • Type variables invented by the type checker will now have a
    unicode subscript to distinguish them from type parameters
    originating in the source code.

  • futhark-test and futhark-bench now support generating random
    test data.

  • The library backends now generate proper names for arrays of
    opaque values.

  • The parser now permits empty programs.

  • Most transpositions are now a good bit faster, especially on
    NVIDIA GPUs.

Removed

  • The <- symbol can no longer be used for in-place updates and
    record updates (deprecated in 0.7.3).

Changed

  • Entry points that accept a single tuple-typed parameter are no
    longer silently rewritten to accept multiple parameters.

Fixed

  • The :type command in futharki can now handle polymorphic
    expressions (#669).

  • Fixed serious bug related to chaining record updates.

  • Fixed type inference of record fields (#677).

  • futharki no longer goes into an infinite loop if a for loop
    contains a negative upper bound.

  • Overloaded number types can no longer carry aliases (#682).

0.7.4

31 Oct 07:32

Added

  • Support type parameters for operator specs defined with val.

Fixed

  • Fixed nasty defunctionalisation bug (#661).

  • cabal/stack sdist works now.