Ranges: Generate is weird

Because we align the iteration, some algorithms behave in surprising ways.

In regular C++ I can write std::iota in the following way:

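#include <algorithm>
#include <iterator>

// std::generate needs forward iterators, and std::output_iterator
// cannot be used as a constraint without a value type argument.
template< std::forward_iterator I, class T >
void iota( I f, I l, T v )
{
  std::generate(f, l, [&v] { return v++; });
}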

A naive SIMD translation could look like this:

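template< std::forward_iterator I, std::integral T >
void iota( I f, I l, T v )
{
  eve::wide<T> wv{v};                                // v broadcast to every lane
  eve::wide<T> step([](int i, int) { return i; });   // lane indices: 0, 1, 2, ...
  // `generate` stands for the SIMD generate discussed here: one call per register.
  generate(f, l, [&] {
    auto res = wv + step;            // v, v + 1, v + 2, ...
    wv += T(eve::wide<T>::size());   // advance by one full register
    return res;
  });
}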

NOTE: the extra integral restriction is to make the step easier to do, nothing more.

However, in SIMD this is either incorrect or inefficient (and it had better be efficient :P).

The problem is that if my iterator is not aligned, the algorithm should try to align it: it starts at the previous aligned address and ignores the lanes that fall before the first element. The callback still produces a full register for that first step, so the offset gets shifted and I might get something like:

[3, 4, 5, 6, ...] even if the initial value is 0. Obviously we can fix this for iota, but for a general-purpose generate it might still bite users, especially if they only test on a std::vector, whose allocation just happens to be aligned enough.
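
The shift can be reproduced without any SIMD at all. What follows is a minimal scalar sketch, not the library's implementation: `lanes`, `chunk`, `aligned_generate` and `naive_iota` are names made up for illustration, a register is modelled as a 4-element array, and the aligning strategy (round the start down, drop the leading lanes) is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD register: 4 lanes of int.
constexpr std::size_t lanes = 4;
using chunk = std::array<int, lanes>;

// Scalar model of a generate that aligns its stores. Index 0 of `data`
// is treated as an aligned address; [first, last) is the range to fill.
template <typename Gen>
void aligned_generate(std::vector<int>& data, std::size_t first,
                      std::size_t last, Gen gen)
{
  assert(first <= last && last <= data.size());
  // Round the start down to the previous chunk boundary.
  for (std::size_t base = first - first % lanes; base < last; base += lanes) {
    chunk c = gen();  // one callback invocation per chunk
    for (std::size_t i = 0; i < lanes; ++i) {
      std::size_t pos = base + i;
      // Masked store: lanes outside [first, last) are dropped.
      if (pos >= first && pos < last) data[pos] = c[i];
    }
  }
}

// The naive iota: the callback assumes its first lane is the first element.
std::vector<int> naive_iota(std::size_t offset, std::size_t count, int v)
{
  std::vector<int> data(offset + count, -1);
  int next = v;
  aligned_generate(data, offset, offset + count, [&next] {
    chunk c{};
    for (std::size_t i = 0; i < lanes; ++i) c[i] = next + static_cast<int>(i);
    next += static_cast<int>(lanes);
    return c;
  });
  // Return only the part that was supposed to be filled.
  return std::vector<int>(data.begin() + static_cast<std::ptrdiff_t>(offset),
                          data.end());
}
```

With these definitions, `naive_iota(0, 6, 0)` yields 0, 1, 2, 3, 4, 5, while `naive_iota(3, 6, 0)` yields 3, 4, 5, 6, 7, 8: the same callback gives different results depending only on where the range starts.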

Possible solutions:

1. Documentation. We just accept that this is the behaviour.
2. Force precise iteration for generate. I mention it only for completeness: I really don't want to sacrifice performance, and people often keep state in other algorithms too (why not transform plus some dynamic offset, or something similar?).
3. A do_not_partially_align trait (bikeshed pending). We should have it regardless of this issue, and with it we can guarantee that the code example works. Obviously not the most performant option.
4. Users can write their own algorithm for this case. It is just quite a bit of work.
5. I also considered a trait that passes more information to the callback, but I couldn't figure out how to even write iota with it.
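
As a rough illustration of what a do_not_partially_align style of iteration could mean, here is a scalar sketch under stated assumptions. It is not the trait itself or the library's code: `lanes`, `chunk`, `unaligned_generate` and `iota_without_alignment` are hypothetical names, and a register is modelled as a 4-element array. The loop starts exactly at the first element and only the tail is partial, so the first lane of the first callback result always lands on the first element.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD register: 4 lanes of int.
constexpr std::size_t lanes = 4;
using chunk = std::array<int, lanes>;

// Scalar model of a generate that never aligns the start:
// full chunks from `first`, then one partial chunk for the tail.
template <typename Gen>
void unaligned_generate(std::vector<int>& data, std::size_t first,
                        std::size_t last, Gen gen)
{
  assert(first <= last && last <= data.size());
  for (std::size_t base = first; base < last; base += lanes) {
    chunk c = gen();  // one callback invocation per chunk
    // Only the tail can be cut short; leading lanes are never dropped.
    for (std::size_t i = 0; i < lanes && base + i < last; ++i)
      data[base + i] = c[i];
  }
}

// The same naive iota callback, now independent of the starting offset.
std::vector<int> iota_without_alignment(std::size_t offset, std::size_t count,
                                        int v)
{
  std::vector<int> data(offset + count, -1);
  int next = v;
  unaligned_generate(data, offset, offset + count, [&next] {
    chunk c{};
    for (std::size_t i = 0; i < lanes; ++i) c[i] = next + static_cast<int>(i);
    next += static_cast<int>(lanes);
    return c;
  });
  // Return only the part that was supposed to be filled.
  return std::vector<int>(data.begin() + static_cast<std::ptrdiff_t>(offset),
                          data.end());
}
```

Here `iota_without_alignment(3, 6, 0)` yields 0, 1, 2, 3, 4, 5 regardless of the offset. The price in real SIMD code is that every store is potentially unaligned, which is why this is not the most performant option.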
