Releases: diku-dk/futhark
0.12.3
Added
- Character literals can now be any integer type.
- The integer modules now have `popc` and `clz` functions.
- Tweaked inlining so that larger programs may now compile faster
  (observed about 20%).
- Pattern-matching on large sum-typed values taken from arrays may
  be a bit faster.
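As a rough model of what `popc` (population count) and `clz` (count leading zeros) compute, here is a minimal sketch in Python rather than Futhark. The `bits` parameter and the result for a zero input are assumptions of this sketch, not a statement about the Futhark prelude.

```python
def popc(x: int, bits: int = 32) -> int:
    """Count the set bits in the low `bits` bits of x."""
    mask = (1 << bits) - 1
    return bin(x & mask).count("1")


def clz(x: int, bits: int = 32) -> int:
    """Count the leading zero bits of x viewed as a `bits`-wide word.

    In this sketch an all-zero word yields `bits` (an assumption).
    """
    mask = (1 << bits) - 1
    return bits - (x & mask).bit_length()
```

Negative inputs are treated as their two's-complement bit pattern, which is how a fixed-width integer type would see them.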
Fixed
- Various small fixes to type errors.
- All internal functions used in generated C code are now properly
  declared `static`.
- Fixed bugs when handling dimensions and aliases in type ascriptions.
0.12.2
Added
- New tool: `futhark autotune`, for tuning the threshold parameters
  used by incremental flattening. Based on work by Svend Lund
  Breddam, Simon Rotendahl, and Carl Mathias Graae Larsen.
- New tool: `futhark dataget`, for extracting test input data. Most
  users will probably never need this.
- Programs compiled with the `cuda` backend now take options
  `--default-group-size`, `--default-num-groups`, and
  `--default-tile-size`.
- Segmented `reduce_by_index` is now substantially faster for small
  histograms.
- New functions: `f32.lerp` and `f64.lerp`, for linear interpolation.
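For readers unfamiliar with the term, linear interpolation can be sketched as follows. This is a Python model, not the Futhark definition; the argument order (both endpoints first, then the weight) and the exact formula used are assumptions here.

```python
def lerp(v0: float, v1: float, t: float) -> float:
    """Linearly interpolate between v0 (at t = 0) and v1 (at t = 1)."""
    return v0 + (v1 - v0) * t
```

With `t` halfway between 0 and 1 the result lies halfway between the two endpoints; values of `t` outside that range extrapolate along the same line.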
Fixed
- Fixes to aliasing of record updates.
- Fixed unnecessary array duplicates after coalescing optimisations.
- `reduce_by_index` nested in `map`s will no longer sometimes
  require huge amounts of memory.
- Source location now correct for unknown infix operators.
- Function parameters are no longer in scope of themselves (#798).
- Fixed a nasty out-of-bounds error in handling of irregular allocations.
- The `floor`/`ceil` functions in `f32`/`f64` now handle infinities
  correctly (and are also faster).
- Using `%` on floats now computes fmod instead of crashing the compiler.
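Assuming "fmod" in the last entry refers to the C library function of that name, the result takes the sign of the dividend. The Python sketch below contrasts that with a floored remainder; the helper name is hypothetical and exists only for illustration.

```python
import math


def fmod_vs_floored(x: float, y: float):
    """Return (C-style fmod, floored remainder) for the same operands."""
    return math.fmod(x, y), x % y
```

The two agree when both operands are positive and differ when the dividend is negative, which is worth keeping in mind when porting code that relies on one convention or the other.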
0.12.1
Added
- The internal representation of parallel constructs has been
  overhauled and many optimisations rewritten. The overall
  performance impact should be neutral on aggregate, but there may
  be changes for some programs (please report if so).
- Futhark now supports structurally typed sum types and pattern
  matching! This work was done by Robert Schenck. There remain
  some problems with arrays of sum types that themselves contain
  arrays.
- Significant reduction in compile time for some large programs.
- Manually specified type parameters need no longer be exhaustive.
- Mapped `rotate` is now simplified better. This can be
  particularly helpful for stencils with wraparound.
Removed
- The `~` prefix operator has been removed. `!` has been extended
  to perform bitwise negation when applied to integers.
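Bitwise negation on a fixed-width integer flips every bit of the word. A minimal Python model follows; the function name and the `bits` and `signed` parameters are assumptions of this sketch, since Python integers are unbounded and the width has to be stated explicitly.

```python
def bitwise_not(x: int, bits: int = 32, signed: bool = True) -> int:
    """Flip all bits of x within a `bits`-wide word."""
    mask = (1 << bits) - 1
    flipped = ~x & mask
    # Reinterpret the top bit as a sign bit for signed types.
    if signed and flipped >= 1 << (bits - 1):
        flipped -= 1 << bits
    return flipped
```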
Changed
- The `--futhark` option for `futhark bench` and `futhark test` now
  defaults to the binary being used for the subcommands themselves.
- The legacy `futhark -t` option (which did the same as
  `futhark check`) has been removed.
- Lambdas now bind less tightly than type ascription.
- `stream_map` is now `map_stream` and `stream_red` is now
  `reduce_stream`.
Fixed
- `futhark test` now understands `--no-tuning` as it was always
  supposed to.
- `futhark bench` and `futhark test` now interpret `--exclude` in
  the same way.
- The Python and C# backends can now properly read binary boolean
  input.
0.11.2
Fixed
- Entry points whose types are opaque due to module ascription, yet
  whose representation is simple (scalars or arrays of scalars) were
  mistakenly made non-opaque when compiled with `--library`. This
  has been fixed.
- The CUDA backend now supports default sizes in `.tuning` files.
- Loop interchange across multiple dimensions was broken in some cases (#767).
- The sequential C# backend now generates code that compiles (#772).
- The sequential Python backend now generates code that runs (#765).
0.11.1
Added
- Segmented scans are a good bit faster.
- `reduce_by_index` has received a new implementation that uses
  local memory, and is now often a good bit faster when the target
  array is not too large.
- The `f32` and `f64` modules now contain `gamma` and `lgamma`
  functions. At present these do not work in the C# backend.
- Some instances of `reduce` with vectorised operators (e.g.
  `map2 (+)`) are orders of magnitude faster than before.
- Memory usage is now lower on some programs (specifically the ones
  that have large `map`s with internal intermediate arrays).
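Since `reduce_by_index` appears in several entries, a sequential reference model may help clarify what is being optimised. The sketch below is Python, not Futhark, and says nothing about how the compiler implements the construct; the parameter names are illustrative, and the neutral element is accepted only to mirror the shape of the Futhark function.

```python
def reduce_by_index(dest, op, ne, indices, values):
    """Sequential model of a generalised histogram.

    For each (index, value) pair, combine the value into dest[index]
    using op. Out-of-bounds indices are ignored. `ne` (the neutral
    element of op) is unused in this sequential model; a parallel
    implementation would need it, and would also need op to be
    associative and commutative.
    """
    out = list(dest)
    for i, v in zip(indices, values):
        if 0 <= i < len(out):
            out[i] = op(out[i], v)
    return out


def vector_add(xs, ys):
    """Elementwise addition, standing in for a vectorised operator."""
    return [x + y for x, y in zip(xs, ys)]
```

Calling it with scalar addition gives an ordinary histogram-style sum per bucket, while passing `vector_add` with array-valued buckets corresponds to the "vectorised operator" case mentioned above.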
Removed
- Size parameters (not annotations) are no longer permitted
  directly in `let` and `loop` bindings, nor in lambdas. You are
  likely not affected (except for the `stream` constructs; see
  below). Few people used this.
Changed
- The array creation functions exported by generated C code now take
  `int64_t` arguments for the shape, rather than `int`. This is in
  line with what the shape functions return.
- The types for `stream_map`, `stream_map_per`, `stream_red`, and
  `stream_red_per` have been changed, such that the chunk function
  now takes the chunk size as the first argument.
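The new calling convention for chunk functions can be illustrated with a small Python model of a chunked map. This is only a sketch: the explicit `chunk_size` argument is a knob invented for the example (in Futhark the chunking is not chosen by the caller), and `increment_chunk` is a hypothetical chunk function.

```python
def stream_map(chunk_fn, xs, chunk_size):
    """Apply chunk_fn to consecutive chunks of xs and concatenate the results.

    chunk_fn receives the size of the chunk as its first argument and
    the chunk itself as its second argument.
    """
    out = []
    for start in range(0, len(xs), chunk_size):
        chunk = xs[start:start + chunk_size]
        out.extend(chunk_fn(len(chunk), chunk))
    return out


def increment_chunk(n, chunk):
    """Example chunk function: n is the length of this particular chunk."""
    assert n == len(chunk)
    return [x + 1 for x in chunk]
```

A well-behaved chunk function produces the same overall result however the input is split, which is why the chunk size is passed in rather than assumed.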
Fixed
- Fixes to reading values under Python 3.
- The type of a variable can now be deduced from its use as a size
  annotation.
- The code generated by the C-based backends is now also compilable
  as C++.
- Fix memory corruption bug that would occur on very large segmented
  reductions (large segments, and many of them).
0.10.2
Added
- `reduce_by_index` is now a good bit faster on operators whose
  arguments are two 32-bit values.
- The type checker warns on size annotations for function parameters
  and return types that will not be visible from the outside,
  because they refer to names nested inside tuples or records. For
  example, the function `let f (n: i32, m: i32): [n][m]i32 = ...`
  will cause such a warning. It should instead be written
  `let f (n: i32) (m: i32): [n][m]i32 = ...`
- A new library function
  `futhark_context_config_select_device_interactively()` has been
  added.
Fixed
0.10.1
Added
- Using definitions from the `intrinsic` module outside the prelude
  now results in a warning.
- `reduce_by_index` with vectorised operators (e.g. `map2 (+)`) is
  orders of magnitude faster than before.
- Executables generated with the `pyopencl` backend now support the
  options `--default-tile-size`, `--default-group-size`,
  `--default-num-groups`, `--default-threshold`, and `--size`.
- Executables generated with `c` and `opencl` now print a help text
  if run with invalid options. The `py` and `pyopencl` backends
  already did this.
- Generated executables now support a `--tuning` flag for passing
  many tuned sizes in a file.
- Executables generated with the `cuda` backend now take an
  `--nvrtc-option` option.
- Executables generated with the `opencl` backend now take a
  `--build-option` option.
Removed
- The old `futhark-*` executables have been removed.
Changed
- If an array is passed for a function parameter of a polymorphic
  type, all arrays passed for parameters of that type must have the
  same shape. For example, given a function
  `let pair 't (x: t) (y: t) = (x, y)`, the application
  `pair [1] [2,3]` will now fail at run-time.
- `futhark test` now numbers un-named data sets from 1 rather than
  0. This only affects the text output and the generated JSON
  files, and fits the tuple element ordering in Futhark.
- String literals are now of type `[]u8` and contain UTF-8 encoded
  bytes.
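One consequence of the string literal change is that the length of a literal is its number of UTF-8 bytes, not its number of characters. The Python sketch below models this by encoding a string to UTF-8 and listing the byte values; the helper name is hypothetical.

```python
def string_literal_bytes(s: str) -> list:
    """Return the UTF-8 bytes of s as a list of unsigned 8-bit values."""
    return list(s.encode("utf-8"))
```

ASCII text maps to one byte per character, while a character such as `æ` occupies two bytes, so the resulting array is longer than the visible text.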
Fixed
- A significant problematic interaction between empty arrays and
  inner size declarations has been closed (#714). This follows a
  range of lesser empty-array fixes from 0.9.1.
- `futhark datacmp` now prints to stdout, not stderr.
- Fixed a major potential out-of-bounds access when sequentialising
  `reduce_by_index` (in most cases the bug was hidden by subsequent
  C compiler optimisations).
- The result of an anonymous function is now also forbidden from
  aliasing a global variable, just as with named functions.
- Parallel scans now work correctly when using a CPU OpenCL
  implementation.
- `reduce_by_index` was broken on newer NVIDIA GPUs when using fancy
  operators. This has been fixed.
0.9.1
Added
- `futhark cuda`: a new CUDA backend by Jakob Stokholm Bertelsen.
- New command for comparing data files: `futhark datacmp`.
- An `:mtype` command for `futhark repl` that shows the type of a
  module expression.
- `futhark run` takes a `-w` option for disabling warnings.
Changed
- Major command reorganisation: all Futhark programs have been
  combined into a single all-powerful `futhark` program. Instead of
  e.g. `futhark-foo`, use `futhark foo`. Wrappers will be kept
  around under the old names for a little while. `futharki` has
  been split into two commands: `futhark repl` and `futhark run`.
  Also, `py` has become `python` and `cs` has become `csharp`, but
  `pyopencl` and `csopencl` have remained as they were.
- The result of a function is now forbidden from aliasing a global
  variable. Surprisingly little code is affected by this.
- A global definition may not be ascribed a unique type. This never
  had any effect in the first place, but now the compiler will
  explicitly complain.
- Source spans are now printed in a slightly different format, with
  the ending line number omitted when it is the same as the start
  line number.
Fixed
- `futharki` now reports source locations of `trace` expressions
  properly.
- The type checker now properly complains if you try to define a
  type abbreviation that has unused size parameters.
0.8.1
Added
- Now warns when `/futlib/...` files are redundantly imported.
- `futharki` now prints warnings for files that are ":load"ed.
- The compiler now warns when entry points are declared with types
  that will become unnamed and opaque, and thus impossible to
  provide from the outside.
- Type variables invented by the type checker will now have a
  unicode subscript to distinguish them from type parameters
  originating in the source code.
- `futhark-test` and `futhark-bench` now support generating random
  test data.
- The library backends now generate proper names for arrays of
  opaque values.
- The parser now permits empty programs.
- Most transpositions are now a good bit faster, especially on
  NVIDIA GPUs.
Removed
- The `<-` symbol can no longer be used for in-place updates and
  record updates (deprecated in 0.7.3).
Changed
- Entry points that accept a single tuple-typed parameter are no
longer silently rewritten to accept multiple parameters.
Fixed
- The `:type` command in `futharki` can now handle polymorphic
  expressions (#669).
- Fixed serious bug related to chaining record updates.
- Fixed type inference of record fields (#677).
- `futharki` no longer goes in an infinite loop if a `for` loop
  contains a negative upper bound.
- Overloaded number types can no longer carry aliases (#682).