What do we mean by detailed profiling? We need to decide what kind of profiling we want. Currently our benchmarks use our own custom time measurements, which need a cudaDeviceSynchronize() call between kernel calls to produce meaningful timings. But a normal run without synchronization might benefit from some concurrent kernel execution, so the synchronized measurements can increase the total run time compared to a normal run.
Alternatively, we could use the nvprof profiler, which also gives us performance metrics. We would need to implement new benchmark scripts for that, which shouldn't take too long though.
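As a rough idea of what such a benchmark script could start from, here is a minimal sketch that only assembles an nvprof command line. The helper name `build_nvprof_command`, the target `./main`, and the output file name are hypothetical placeholders, not existing parts of our benchmark code; `--metrics`, `--csv`, `--log-file` and `--profile-child-processes` are nvprof options, and `achieved_occupancy` is just one example metric.

```python
import shlex


def build_nvprof_command(target_cmd, metrics=(), log_file=None,
                         profile_children=False):
    """Assemble an nvprof command line as a list of arguments.

    target_cmd: the program to profile, as a list (hypothetical example:
                ["./main"] for a compiled standalone binary).
    metrics:    nvprof metric names to collect, e.g. "achieved_occupancy".
    log_file:   if given, write the profiler output as CSV to this file.
    profile_children: also profile processes spawned by the target, which
                may matter if the target is a Python script that launches
                the compiled binary (an assumption about the setup).
    """
    cmd = ["nvprof"]
    if profile_children:
        cmd.append("--profile-child-processes")
    if metrics:
        # nvprof takes a comma-separated list of metric names
        cmd += ["--metrics", ",".join(metrics)]
    if log_file is not None:
        cmd += ["--csv", "--log-file", log_file]
    return cmd + list(target_cmd)


if __name__ == "__main__":
    cmd = build_nvprof_command(["./main"],
                               metrics=["achieved_occupancy"],
                               log_file="profile.csv")
    # Print the command instead of running it; pass `cmd` to
    # subprocess.run() on a machine with the CUDA toolkit installed.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Collecting metrics makes nvprof replay kernels, so timings and metrics would probably be gathered in separate runs.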
First detailed profiling of some brian2 examples with brian2cuda after updating brian2 version (#22).