Currently, there are multiple kernels that require the number of spiking neurons on the host (or could be optimized by having it), and each of these kernels calls its own `cudaMemcpy` or would have to.
Instead, implement a single asynchronous `cudaMemcpy` from device to host after all thresholders (for any spike or event condition) have run.
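A minimal sketch of the idea, with hypothetical kernel and buffer names (the actual generated code in brian2cuda differs): queue all thresholders on a stream, then issue one `cudaMemcpyAsync` into pinned host memory instead of one blocking copy per consumer, and only synchronize when the counts are actually needed.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

#define NUM_EVENTSPACES 2  // illustrative: e.g. a spike and a custom event condition

// Placeholder thresholder: a real one would test its condition per neuron and
// atomically increment its slot in the shared counter array (see next sketch).
__global__ void kernel_thresholder(int* d_event_counters, int esp_idx)
{
    if (threadIdx.x == 0 && blockIdx.x == 0)
        atomicAdd(&d_event_counters[esp_idx], 1);
}

int main()
{
    int* d_event_counters;
    int* h_event_counters;
    cudaMalloc((void**)&d_event_counters, NUM_EVENTSPACES * sizeof(int));
    cudaMemset(d_event_counters, 0, NUM_EVENTSPACES * sizeof(int));
    // Pinned host memory is needed for the copy to be truly asynchronous.
    cudaHostAlloc((void**)&h_event_counters, NUM_EVENTSPACES * sizeof(int),
                  cudaHostAllocDefault);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // 1. Queue all thresholders for this time step.
    kernel_thresholder<<<1, 32, 0, stream>>>(d_event_counters, 0);
    kernel_thresholder<<<1, 32, 0, stream>>>(d_event_counters, 1);

    // 2. One asynchronous copy for all counters instead of one blocking
    //    cudaMemcpy per kernel that needs the counts on the host.
    cudaMemcpyAsync(h_event_counters, d_event_counters,
                    NUM_EVENTSPACES * sizeof(int),
                    cudaMemcpyDeviceToHost, stream);

    // 3. Block only when the counts are first needed (here: at the end).
    cudaStreamSynchronize(stream);
    printf("events: %d, %d\n", h_event_counters[0], h_event_counters[1]);

    cudaStreamDestroy(stream);
    cudaFree(d_event_counters);
    cudaFreeHost(h_event_counters);
    return 0;
}
```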
Move the event counters for all eventspaces into a single contiguous array, so that one copy transfers the counters for all eventspaces at once. This requires changing how the counter variable is accessed in some kernels, but should be easy to do.
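One possible layout (a sketch only; brian2cuda's actual eventspace representation differs in detail) is a single device array indexed by eventspace, so each thresholder increments its own slot via an offset instead of a dedicated counter variable:

```cuda
#include <cuda_runtime.h>

#define NUM_EVENTSPACES 2  // illustrative

// All event counters live in one contiguous device array, so a single
// cudaMemcpyAsync can transfer every counter at once.
__device__ int d_event_counters[NUM_EVENTSPACES];

// Hypothetical thresholder for eventspace `esp_idx`: kernels that previously
// used a per-eventspace counter variable now index the shared array.
__global__ void kernel_thresholder(int esp_idx, const double* v, double v_thresh,
                                   int* eventspace, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N)
        return;
    if (v[i] > v_thresh) {
        int k = atomicAdd(&d_event_counters[esp_idx], 1);  // count into the shared slot
        eventspace[k] = i;                                  // record the event source
    }
}
```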
Detect from the Python side which kernel will be the first to require the counter variables. That kernel should then synchronize with the asynchronous copy. Also detect whether no kernel requires the counters at all and skip the copy in that case.
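On the host side, the generated code could then synchronize lazily. A sketch under the assumption that an event was recorded right after the async copy and that a `counters_needed` flag is determined at code-generation time (all names here are illustrative, not the actual generated symbols):

```cuda
#include <cuda_runtime.h>

// Sketch: `counters_needed` would be computed on the Python side during code
// generation (e.g. a monitor or synaptic pathway reads the spike counts).
void run_time_step(bool counters_needed,
                   cudaEvent_t copy_done,        // recorded after cudaMemcpyAsync
                   const int* h_event_counters)  // pinned buffer filled by that copy
{
    // ... thresholder kernels and, only if counters_needed, the single async
    //     copy followed by cudaEventRecord(copy_done, stream) were issued above ...

    if (!counters_needed)
        return;  // no kernel reads the counts this step: no copy, no synchronization

    // Block only right before the first kernel (or host code path) that reads
    // the counts, not immediately after the copy was enqueued.
    cudaEventSynchronize(copy_done);

    int num_spikes = h_event_counters[0];
    // e.g. size the grid of the next consumer kernel from num_spikes
    (void)num_spikes;
}
```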
Related issues that would benefit or could be closed by this:
- PopulationRateMonitor #285