Replies: 2 comments 4 replies
-
Looks like your memory bandwidth is saturating (as expected). Note that FDTD is heavily memory-bound, not compute-bound.
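To see this effect outside of Meep, here is a small standalone sketch (not from this thread; the array sizes and process counts are arbitrary) that times the same streaming kernel in an increasing number of concurrent processes. On a bandwidth-limited machine, the total throughput plateaus and the per-process throughput falls, which is the same wall that "independent" FDTD jobs hit:

```python
# Standalone sketch: measure aggregate and per-process memory bandwidth as the
# number of concurrent worker processes grows. All sizes here are arbitrary.
import time
from multiprocessing import Pool

import numpy as np

N_DOUBLES = 2**23  # 8M doubles = 64 MiB per array, far larger than any cache
REPS = 5

def stream_triad(_):
    """Time a STREAM-triad-like kernel; return approximate GB/s for this process."""
    a = np.zeros(N_DOUBLES)
    b = np.ones(N_DOUBLES)
    c = np.ones(N_DOUBLES)
    t0 = time.perf_counter()
    for _ in range(REPS):
        a[:] = b + 2.0 * c  # reads b and c, writes a (a numpy temporary adds traffic)
    dt = time.perf_counter() - t0
    # Count only 3 array passes per iteration, so this underestimates true traffic.
    return REPS * 3 * N_DOUBLES * 8 / dt / 1e9

if __name__ == "__main__":
    for nproc in (1, 2, 4, 8, 16, 32):
        with Pool(nproc) as pool:
            rates = pool.map(stream_triad, range(nproc))
        print(f"{nproc:2d} procs: total {sum(rates):6.1f} GB/s, "
              f"per-process {sum(rates) / nproc:5.1f} GB/s")
```

If the total GB/s stops growing well before the process count does, adding more ranks can only slow each one down.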
-
See also #882, which may be relevant to the performance degradation you are experiencing as you ramp up the number of "independent" jobs/threads.
-
Hello,
I am using

```python
mp.divide_parallel_processes(N)
```

where I have a simple piece of waveguide and I am sweeping the width; each width is expected to run on a separate core. Here is the command I run:

```sh
mpirun -np N python MyWG_simple.py > "script_printed.out" &
```

and I change N in the script to match the N in this command, so I expect each simulation to run on one physical core. I am running meep 1.23.0 with mpich 4.0.2 and mpi4py 3.1.3 on Ubuntu 20.04, on a 32-core, 64-thread machine, and I am the only one using this machine. Here are the time results:

I expected the time to stay constant because all the simulations are running in parallel, but the time increases exponentially as I increase the number of cores. What am I doing wrong?
I also tried pinning the processes with taskset:

```sh
taskset -c 0-4 mpirun -np 5 python MyWG_simple.py > "script_printed.out" &
```

(and of course I change the number of cores each time I run to match the N in the Meep script), and I get worse results.

I am wondering:
a) Why does the time increase exponentially, rather than stay constant, as I increase the number of cores for an embarrassingly parallel task?
b) Why does it take more time to run with taskset than without it?
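For context, here is a minimal self-contained sketch of the kind of width sweep being described (the geometry, source, and width values below are made up for illustration; only `divide_parallel_processes` itself is from the post). Each one-rank subgroup simulates one width:

```python
import meep as mp

N = 5  # number of independent one-rank subgroups; launch with: mpirun -np 5
n = mp.divide_parallel_processes(N)  # index of this rank's subgroup, 0..N-1

widths = [0.4, 0.5, 0.6, 0.7, 0.8]  # hypothetical waveguide widths (um)
w = widths[n]  # each subgroup simulates one width

sim = mp.Simulation(
    cell_size=mp.Vector3(8, 4),
    resolution=20,
    boundary_layers=[mp.PML(1.0)],
    geometry=[mp.Block(size=mp.Vector3(mp.inf, w, mp.inf),
                       material=mp.Medium(index=3.45))],
    sources=[mp.Source(mp.GaussianSource(frequency=0.65, fwidth=0.1),
                       component=mp.Ez,
                       center=mp.Vector3(-3, 0))],
)
sim.run(until=50)
print(f"subgroup {n}: width {w} um done")
```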
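For question (b), a small diagnostic sketch (a suggestion, not from the thread) is to have every rank report its CPU affinity mask, to check what taskset and the MPI launcher actually pinned. Note that `os.sched_getaffinity` is Linux-only, which matches the Ubuntu setup here; the script name below is hypothetical:

```python
# Run with, e.g.: taskset -c 0-4 mpirun -np 5 python report_affinity.py
# Each rank prints the CPUs it is allowed to run on; if several ranks share
# the same small mask, they are competing for (and migrating between) cores.
import os

from mpi4py import MPI

rank = MPI.COMM_WORLD.Get_rank()
cpus = sorted(os.sched_getaffinity(0))  # Linux-only affinity query
print(f"rank {rank}: allowed CPUs = {cpus}")
```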