gds_kernel_loopback_latency executes without errors.
gds_kernel_latency returns some errors:
STDOUT:
pre-posting took 2272.00 usec
batch info: rx+kernel+tx 20 per batch
pre-posted 60 sequences in 2 batches
GPU kernel calc buf size: 131072
iters=1000 tx/rx_depth=1024
testing....
[1] batch 1: posted 20 sequences
[1] batch 2: posted 20 sequences
pre-posting took 2757.00 usec
gpu_wait_tracking_event nothing to do (12)
gpu_wait_tracking_event nothing to do (12)
gpu_wait_tracking_event nothing to do (12)
....
STDERR:
[3] unexpected rx ev 12, batch len 20
[4] unexpected rx ev 11, batch len 20
[5] unexpected rx ev 13, batch len 20
[6] unexpected rx ev 11, batch len 20
[7] unexpected rx ev 13, batch len 20
[8] unexpected tx ev 18, batch len 20
[8] unexpected rx ev 14, batch len 20
[9] unexpected rx ev 16, batch len 20
….
Sometimes the test hangs, and sometimes it runs to completion.
HPGMG does not report any error, but its results are incorrect (with both CUDA 9.0 and 9.2). For instance, for a run with 2 processes, the SA model, and input parameters 5 and 8:
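To compare runs (for example, a run that hangs against one that completes), it can help to tally the "unexpected rx/tx ev" lines from STDERR. The following is only a rough sketch: it assumes the line format shown in the excerpt above, and the names `UNEXPECTED` and `summarize_unexpected` are hypothetical helpers, not part of the test program.

```python
import re
from collections import Counter

# Line format assumed from the STDERR excerpt, e.g.
# "[3] unexpected rx ev 12, batch len 20"
UNEXPECTED = re.compile(r"\[(\d+)\] unexpected (rx|tx) ev (\d+), batch len (\d+)")


def summarize_unexpected(stderr_text):
    """Count unexpected-event lines per direction and collect their fields.

    Returns (counts, records), where each record is
    (batch_index, direction, event_count, batch_len).
    """
    counts = Counter()
    records = []
    for line in stderr_text.splitlines():
        match = UNEXPECTED.search(line)
        if match:
            batch, direction, events, batch_len = match.groups()
            counts[direction] += 1
            records.append((int(batch), direction, int(events), int(batch_len)))
    return counts, records


# Three lines copied from the report above.
sample = "\n".join([
    "[3] unexpected rx ev 12, batch len 20",
    "[8] unexpected tx ev 18, batch len 20",
    "[8] unexpected rx ev 14, batch len 20",
])
counts, records = summarize_unexpected(sample)
print(dict(counts))
```

Feeding the captured STDERR of several runs through this function gives a quick view of whether the mismatches are mostly on the rx side, as the excerpt suggests.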
DGX-1V settings:
The correct result would be:
FMGSolve... f-cycle norm=6.934041112871547e-05 rel=7.171390380175266e-05 done (0.010723 seconds)
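Since HPGMG fails silently here, a small check against the reference norm can flag a bad run automatically. This is a sketch under assumptions: the regular expression is derived only from the single FMGSolve line quoted above, the tolerance `rel_tol=1e-9` is an arbitrary choice, and `parse_fmg` and `norm_matches` are hypothetical helper names.

```python
import math
import re

# Pattern assumed from the one FMGSolve line quoted in this report.
FMG_LINE = re.compile(r"f-cycle\s+norm=([0-9.eE+-]+)\s+rel=([0-9.eE+-]+)")

# Reference value taken from the "correct result" line above.
REFERENCE_NORM = 6.934041112871547e-05


def parse_fmg(line):
    """Return (norm, rel) parsed from an FMGSolve line, or None if absent."""
    match = FMG_LINE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def norm_matches(line, reference=REFERENCE_NORM, rel_tol=1e-9):
    """True if the line carries a norm within rel_tol of the reference."""
    parsed = parse_fmg(line)
    return parsed is not None and math.isclose(parsed[0], reference, rel_tol=rel_tol)


good = ("FMGSolve... f-cycle norm=6.934041112871547e-05 "
        "rel=7.171390380175266e-05 done (0.010723 seconds)")
print(norm_matches(good))
```

The same check can be run on the output from the DGX-1V and from the P100 machines to confirm which configurations diverge from the reference.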
These errors don't appear on brdw0/1 P100.