Problems with multiple run() calls #86

Closed
denisalevi opened this issue Mar 9, 2018 · 4 comments

denisalevi commented Mar 9, 2018

There seem to be some issues when using multiple run() calls. First, there is #85.

Then, we are adding cudaProfilerStart() and cudaProfilerStop() around each Network.run() call. I am not sure if we will keep those for the final release anyway, but we should be aware of it.

Also, just from looking at the code, it seems we have a memory leak when synapses_initialise_queue.cu is rerun between run() calls (e.g. when a delay value is changed). In that case, we allocate new memory for all device variables created in synapses_initialise_queue.cu and store their pointers in the same global variables as before, without freeing the memory those globals pointed to after the first initialisation. And we call the CudaSpikeQueue.prepare() method again for spike queues that were already prepared, which creates new queues (cudaVectors) without deleting the old ones.

Similar situations might happen when other variables are changed between run() calls. Therefore, some tests would be appropriate.
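
To make the leak concrete, here is a minimal CUDA sketch of the pattern described above (all names are illustrative, not the actual generated brian2cuda code), together with one possible guard:

#include <cstdint>
#include <cuda_runtime.h>

int32_t* dev_delays = 0;  // global device pointer, as in the generated code

void initialise_queue(int num_synapses)
{
    // BUG: on a rerun this overwrites dev_delays without freeing the
    // first allocation, leaking device memory.
    cudaMalloc(&dev_delays, num_synapses * sizeof(int32_t));
}

void initialise_queue_guarded(int num_synapses)
{
    // Possible fix: free the previous allocation before overwriting the
    // global pointer. The same idea applies to calling
    // CudaSpikeQueue.prepare() a second time: delete the old queues first.
    if (dev_delays != 0)
        cudaFree(dev_delays);
    cudaMalloc(&dev_delays, num_synapses * sizeof(int32_t));
}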


denisalevi commented Mar 14, 2018

Another bug with multiple runs and delays. When running this script:

from brian2 import *
import brian2cuda

def test(mode='cuda'):
    set_device(mode + '_standalone', directory=None, build_on_run=False,
               with_output=False)
    inG = NeuronGroup(1, 'v : 1', threshold='True')  # spikes every time step
    G = NeuronGroup(2, 'v : 1')
    G.v[:] = 0
    S = Synapses(inG, G, on_pre='v += 1')
    S.connect()
    S.delay[:] = array([0, 1]) * defaultclock.dt  # heterogeneous delays
    mon = StateMonitor(G, 'v', record=True)

    run(2*defaultclock.dt)
    run(2*defaultclock.dt)

    device.build(direct_call=False, **device.build_options)

    print(mon.v[:])

if __name__ == '__main__':
    test()
    test(mode='cpp')

I get the following output for the cuda call:

[[ 0.  1.  2.  3.]
 [ 0.  0.  1.  1.]]

and for cpp:

[[ 0.  1.  2.  3.]
 [ 0.  0.  1.  2.]]

This bug only occurs when the two synapses have different delays. It looks like the last spike in inG from the first run statement is lost between the two run statements for the synapse with delay = 1*dt: that spike fires in the last time step of the first run and should only be delivered during the second run. Apparently synapses_queue_initialise() is called between the run statements regardless of whether the delay changed, and with that the spike queues are reset (which is why that last spike from the first run is lost).

EDIT
There are also bugs when we have scalar delays. In synapses_initialise_queue, we simply create the same number of eventspaces again and push them into a global eventspace vector, so we end up with twice as many eventspaces as before.
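
A minimal sketch of that duplication and a size guard that would make the re-initialisation idempotent (vector and function names are invented for illustration):

#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

std::vector<int32_t*> dev_eventspaces;  // global vector of device pointers

void push_eventspaces(size_t num_needed, int num_neurons)
{
    // Only allocate the eventspaces that are still missing; a second call
    // with the same arguments then becomes a no-op instead of doubling the
    // vector (the buggy version pushes num_needed fresh eventspaces
    // unconditionally).
    while (dev_eventspaces.size() < num_needed)
    {
        int32_t* eventspace = 0;
        cudaMalloc(&eventspace, (num_neurons + 1) * sizeof(int32_t));
        dev_eventspaces.push_back(eventspace);
    }
}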

denisalevi added a commit that referenced this issue Mar 16, 2018

That array in device memory is not used. This allows simulation of
higher network sizes for our benchmarks, but introduces a bug when
trying to change delays between multiple `run()` statements, see #83.
The commit includes tests, which should pass as soon as #83 and #86 are
fixed.
moritzaugustin commented

While we have to fix the multiple-runs bug before releasing, we can do this under the assumption that delay and dt do not change between runs.

We need to raise an error (with a reference to the respective issue) if a user tries to change them anyway.
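
A hedged sketch of such a guard (all names here are invented; where exactly the check would live in the generated code is open): cache dt on the first run and abort with a pointer to this issue if it changes later:

#include <cstdio>
#include <cstdlib>

double cached_dt = -1.0;  // set on the first run

void check_dt_unchanged(double dt)
{
    if (cached_dt < 0.0)
    {
        cached_dt = dt;  // first run: remember dt
    }
    else if (dt != cached_dt)
    {
        fprintf(stderr, "ERROR: changing dt between run() calls is not "
                        "supported yet, see brian2cuda issue #86\n");
        exit(EXIT_FAILURE);
    }
}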

moritzaugustin commented

see also: brian-team/brian2genn#76 (last comment by @mstimberg)

denisalevi added this to the First beta release milestone Aug 28, 2019
denisalevi commented

This issue is being fixed in PR #300:

  • cudaProfilerStart() and cudaProfilerStop() are removed.
  • Some memory leaks from allocating new memory without freeing the old memory in before_run_synapses_push_spikes are fixed. There still seems to be a memory leak when having multiple runs (not sure if it comes from before_run_synapses_push_spikes or elsewhere); opened a new issue: Fix memory leak when having multiple run calls #301.
  • All issues with delayed spikes being lost between run calls (for both homogeneous and heterogeneous delays) are fixed, and I added a few new tests (including the previously failing example from above).
