Ranges: Generate is weird

Denis Yaroshevskiy edited this page May 18, 2021 · 1 revision

One consequence of us aligning iterators is that some algorithms behave in weird ways.

In regular C++ I can write std::iota in the following way:

template< std::forward_iterator I, class T >
void iota( I f, I l, T v )
{
  // v is captured by reference, so each call yields the next value
  std::generate(f, l, [&v] { return v++; });
}

A naive SIMD translation of this could look like:

template< std::forward_iterator I, std::integral T >
void iota( I f, I l, T v )
{
  eve::wide<T> wv{v};                                          // [v, v, v, ...]
  eve::wide<T> lanes([](int i, int) { return i; });            // [0, 1, 2, ...]
  eve::wide<T> step{static_cast<T>(eve::wide<T>::size())};     // advance by one register
  generate(f, l, [&] {
    auto res = wv + lanes;
    wv += step;
    return res;
  });
}

NOTE: the extra std::integral constraint is only there to make the step easier to write, nothing more.

However in SIMD this is either not correct or not efficient! (and it better be efficient :P).

The problem is that if my iterator is not aligned, the algorithm will try to align it: it starts from the previous aligned address and ignores the leading lanes of the first register. The generator is still called once per register, so the values in the ignored lanes are lost and the output is shifted. I might get something like:

[3, 4, 5, 6, ...] even if the initial value is 0. Obviously we can fix this for iota, but for a general purpose generate it might still bite users, especially if they only test on a std::vector whose allocation just happens to be aligned enough.
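The shift can be reproduced without any SIMD at all. The following is a minimal scalar model, not EVE code: `wide`, `aligned_generate` and `naive_iota` are hypothetical names made up for this illustration, the register is assumed to hold 4 ints, and `offset` stands for how many elements past an aligned boundary the range starts.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD register of 4 ints (not eve::wide).
constexpr std::size_t lanes = 4;
using wide = std::array<int, lanes>;

// Scalar model of a generate that aligns its first store down to a
// multiple of `lanes`. The generator is called once per aligned block,
// and lanes that fall outside [offset, offset + n) are simply dropped.
template <typename Gen>
std::vector<int> aligned_generate(std::size_t offset, std::size_t n, Gen gen)
{
  assert(offset < lanes);
  std::vector<int> out;
  out.reserve(n);
  for (std::size_t block = 0; block < offset + n; block += lanes) {
    wide w = gen();
    for (std::size_t i = 0; i < lanes; ++i) {
      std::size_t p = block + i;  // position relative to the aligned boundary
      if (p >= offset && p < offset + n) out.push_back(w[i]);
    }
  }
  return out;
}

// The naive wide iota: the generator keeps its own running state.
inline std::vector<int> naive_iota(std::size_t offset, std::size_t n, int v)
{
  int next = v;
  return aligned_generate(offset, n, [&next] {
    wide w{};
    for (std::size_t i = 0; i < lanes; ++i) w[i] = next + static_cast<int>(i);
    next += static_cast<int>(lanes);
    return w;
  });
}
```

In this model `naive_iota(0, 6, 0)` gives 0, 1, 2, 3, 4, 5, while `naive_iota(3, 6, 0)` gives 3, 4, 5, 6, 7, 8: the three ignored leading lanes have already consumed the values 0, 1 and 2.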

Possible solutions:

  • Documentation. We just accept that this is the behaviour.
  • Force precise iteration for generate. I mention this only for completeness: I really don't want to sacrifice perf. Besides, people often keep state in other algorithms too, for example transform with some dynamic offset.
  • A do_not_partially_align trait (bikeshed pending). We should have it regardless of this issue, and with it we can guarantee that the code example above will work. Obviously not the most performant option.
  • Users can write their own algorithm for this case. It is just quite a bit of work.
  • I was also thinking of some trait that passes more information to the callback, but I couldn't figure out how to even write iota with it.
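To make the do_not_partially_align option more concrete, here is a rough scalar model of what it would guarantee. This is only a sketch under assumptions: `reg`, `unaligned_generate` and `iota_without_alignment` are hypothetical names, the register is assumed to hold 4 ints, and nothing here is the actual EVE interface. The point is that the first register always starts at the first element, so only the tail can be partial.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD register of 4 ints (not eve::wide).
constexpr std::size_t width = 4;
using reg = std::array<int, width>;

// Scalar model of a generate that never aligns the start of the range:
// lane 0 of each register always maps to the next unwritten element,
// and only the last register may be partially used.
template <typename Gen>
std::vector<int> unaligned_generate(std::size_t n, Gen gen)
{
  std::vector<int> out;
  out.reserve(n);
  for (std::size_t done = 0; done < n; done += width) {
    reg r = gen();
    for (std::size_t i = 0; i < width && done + i < n; ++i) out.push_back(r[i]);
  }
  assert(out.size() == n);
  return out;
}

// The same stateful generator as in the naive iota, now giving the expected result.
inline std::vector<int> iota_without_alignment(std::size_t n, int v)
{
  int next = v;
  return unaligned_generate(n, [&next] {
    reg r{};
    for (std::size_t i = 0; i < width; ++i) r[i] = next + static_cast<int>(i);
    next += static_cast<int>(width);
    return r;
  });
}
```

With this scheme the result no longer depends on where the buffer happens to sit in memory, at the cost of unaligned stores in the real SIMD version.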