Does fftw plan reuse make sense? #57

tdd11235813 · 2016-11-23T10:09:39Z

FFTW_MEASURE means that fftw overwrites the input and output buffers during the planning stage.
After planning, the buffers can be filled with data (memcpy).
Plan reuse means having only one plan at a time.
For non-FFTW_MEASURE plans (estimate, wisdom), I think it is NOT worthwhile to reuse fftw plans, as they do not allocate temporary buffers (are we sure?).
But: it might be worthwhile in terms of memcpy. We could skip the memcpy as long as padding is not required. The input data coming from BenchmarkExecutor is aligned, but not padded for the FFT, so the memcpy would only be required when padding is needed (in-place Real2Complex).
I have to look at the results w.r.t. upload and download times ...
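
For reference, a minimal sketch of the padding case mentioned above, assuming row-major double-precision data. FFTW's in-place real-to-complex layout pads the last dimension n to 2*(n/2+1) real elements. The helper names `r2c_inplace_padded_len` and `copy_rows_padded` are hypothetical and not taken from gearshifft.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of real elements per row that an in-place real-to-complex
 * transform expects: the last dimension n is padded to 2*(n/2+1).
 * (Hypothetical helper name, for illustration only.) */
size_t r2c_inplace_padded_len(size_t n)
{
    assert(n > 0);
    return 2 * (n / 2 + 1);
}

/* Copy `rows` densely packed rows of length n from src into dst,
 * whose rows use the padded stride. Padding elements in dst are
 * left untouched. (Hypothetical helper name.) */
void copy_rows_padded(double *dst, const double *src, size_t rows, size_t n)
{
    size_t stride = r2c_inplace_padded_len(n);
    for (size_t r = 0; r < rows; ++r)
        memcpy(dst + r * stride, src + r * n, n * sizeof(double));
}
```

Because the source rows are not padded, a row-wise copy like this cannot be skipped for in-place Real2Complex. In the other cases the unpadded input already has the layout the transform expects.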

psteinb · 2016-11-24T20:07:34Z

Any news on this? I wonder whether this idea is relevant for using FFTW or for interpreting the results of gearshifft.

tdd11235813 · 2017-06-15T11:44:42Z

To finally give an answer on this: I plotted the upload time against the total time to get the ratio.

Upload refers to the memcpy operation, and the timer measured a ~40% contribution to the total solution time in the worst case. But does this really come from memcpy?

Download is the same memcpy operation, just in the other direction. It is smooth and fast, with no significant times here. So the long upload time might come from a cache warm-up.
The memcpy can be avoided when the transform is not real in-place and when fftw is not run with FFTW_MEASURE. That would reduce the total runtime by up to ~40% in the worst case, if we assume the cache warm-up is the only factor responsible for the upload time.

The rshiny tool is going to get an update to examine such statistics. At the moment I do not plan to change fftw in gearshifft to avoid the memcpy calls in the aforementioned cases.
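
The cases named above can be written down as a small predicate. This is only a sketch of the reasoning in this thread, not code from gearshifft: the enum `plan_rigor` and the function `upload_copy_required` are hypothetical names, and treating estimate and wisdom plans as not overwriting the buffers is the assumption made in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum for the planner rigor discussed in this thread. */
typedef enum { RIGOR_ESTIMATE, RIGOR_WISDOM, RIGOR_MEASURE } plan_rigor;

/* Returns true when the upload memcpy cannot be skipped:
 *  - FFTW_MEASURE planning overwrites the input and output buffers,
 *    so the data has to be copied in after planning;
 *  - an in-place real transform needs a padded buffer, and the
 *    incoming data is not padded.
 * Assumption from the thread: estimate and wisdom plans leave the
 * buffers untouched. */
bool upload_copy_required(plan_rigor rigor, bool is_real, bool is_inplace)
{
    assert(rigor >= RIGOR_ESTIMATE && rigor <= RIGOR_MEASURE);
    if (rigor == RIGOR_MEASURE)
        return true;
    if (is_real && is_inplace)
        return true;
    return false;
}
```

With this, an out-of-place real transform planned with estimate gives `false`, while any FFTW_MEASURE plan gives `true`.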

psteinb · 2017-06-15T11:57:57Z

thanks for the update. Interesting findings I believe.

Are these results from multi-threaded or single-threaded runs? I am asking because it need not be warm-up only; in a multi-threaded scenario it could also be cache-line thrashing.

tdd11235813 · 2017-06-15T12:00:46Z

true. this is multi-threaded. the single-threaded benchmark is still running on taurus. let's see what we will have there.
