Configure SAS_NUM_THREADS from the OpenCL config dialog.
Set the cpu count (third parameter to start_mapper) using the value of SAS_NUM_THREADS.
The OMP_NUM_THREADS variable affects programs using OpenMP, including sasmodels compiled with OpenMP support. I guess my thinking at the time (2014) was that parallel fitting only made sense when running sasmodels with single-threaded models. With OpenCL off by default(?) and tinycc not supporting OpenMP, I suspect most environments are running SasView with only one core. Using a different config variable allows us to untangle these concepts and give the user control.
With no GPU and no OpenMP compiler we should be using SAS_NUM_THREADS=0. This will use one thread per core.
The user can force single-threaded fitting by setting SAS_NUM_THREADS=1. This should select SingleMapper rather than MPMapper.
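The rules for SAS_NUM_THREADS on a machine without a GPU could be sketched roughly as below. This is only an illustration: `choose_mapper` is a hypothetical helper, the mapper is returned as a name rather than the actual bumps class, and treating an unset variable the same as 0 is an assumption rather than something decided in this issue.

```python
import os


def choose_mapper(environ=None):
    """Return (mapper_name, cpus) derived from SAS_NUM_THREADS.

    mapper_name is only a label; the real mapper classes live in bumps.
    """
    env = os.environ if environ is None else environ
    # Assumption: an unset or empty variable behaves like 0 (one thread per core).
    raw = env.get("SAS_NUM_THREADS", "0").strip() or "0"
    requested = int(raw)
    if requested < 0:
        raise ValueError("SAS_NUM_THREADS must be >= 0")
    if requested == 1:
        # Explicit single-threaded request: no multiprocessing pool needed.
        return "SingleMapper", 1
    # 0 means "one thread per core"; any other value is used as given.
    cpus = requested if requested > 0 else (os.cpu_count() or 1)
    return "MPMapper", cpus
```

The returned count is what would be passed as the cpu count (third parameter) to start_mapper.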
If we have the GPU enabled (or OpenCL on the CPU), it is trickier. For small cards with few cores (< 100) we should run single-threaded. For large cards with thousands of cores we can use cpu count = num gpu cores / num q points. Assuming 200 q points per curve on average, an NVIDIA 4090 with 16000 cores should support SAS_NUM_THREADS=80. This will still leave a lot of compute unused for the lm and amoeba fitters, but should give a big speed increase for dream. Fits to 2-D data should still use SingleMapper.
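As a rough sketch of that heuristic, assuming the core count and the typical number of q points are known up front (`suggest_num_threads` and its parameter names are hypothetical, and the handling of cards between 100 and a few thousand cores is a guess, since the text above only covers the two extremes):

```python
def suggest_num_threads(gpu_cores, q_points=200, is_2d=False, small_card_cores=100):
    """Suggest a SAS_NUM_THREADS value for GPU-backed fitting.

    gpu_cores: number of compute cores on the card.
    q_points: typical number of q points per 1-D curve (200 assumed above).
    is_2d: True when fitting 2-D data, which should stay single-threaded.
    """
    if q_points <= 0:
        raise ValueError("q_points must be positive")
    if is_2d or gpu_cores < small_card_cores:
        # Small cards and 2-D fits: a single thread already fills the device.
        return 1
    # cpu count = num gpu cores / num q points, never less than one.
    return max(1, gpu_cores // q_points)
```

For the example above, `suggest_num_threads(16000)` gives 16000 // 200 = 80.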
Multiple GPUs will require that different threads use different GPU device IDs. Since this is done in multiprocessing, we should be able to modify the SAS_OPENCL environment variable for each thread to indicate which device to use for that GPU context. We will want two different MPMapper instances, one for 1-D data with many threads per device, and another for 2-D data with a single thread per device. Either a separate environment variable (SAS_GPUS=4?) or a string like SAS_NUM_THREADS=4x80 could indicate that there are multiple devices. Setting the device context for multiple GPUs may require some changes to bumps.
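One way the 4x80 form might be parsed and turned into per-worker environments is sketched below. Everything here is hypothetical: `parse_thread_spec` and `worker_environments` are illustrative names, and the value written to SAS_OPENCL is just the bare device index as a placeholder, since the device string that sasmodels actually expects would need to be checked.

```python
import os


def parse_thread_spec(spec):
    """Parse '4x80' into (gpus, threads_per_gpu); a plain '80' means one device."""
    text = spec.strip().lower()
    if "x" in text:
        gpus, threads = text.split("x", 1)
        return int(gpus), int(threads)
    return 1, int(text)


def worker_environments(spec, base_env=None, format_device=str):
    """Build one environment dict per worker, grouped by device.

    format_device turns a device index into the SAS_OPENCL value.
    Assumption: the default (bare index) is a placeholder, not the real format.
    """
    gpus, threads_per_gpu = parse_thread_spec(spec)
    base = dict(os.environ if base_env is None else base_env)
    envs = []
    for worker in range(gpus * threads_per_gpu):
        env = dict(base)
        # Workers 0..threads_per_gpu-1 share device 0, the next group device 1, ...
        env["SAS_OPENCL"] = format_device(worker // threads_per_gpu)
        envs.append(env)
    return envs
```

Each worker process would then apply its own dictionary before creating its GPU context, for example from a multiprocessing initializer. For 2-D data the same helper with a spec such as 4x1 yields a single worker per device.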
Looking at the code, SasView should already support parallel fitting (sasview/src/sas/sascalc/fit/BumpsFitting.py, lines 395 to 396 in d68e731).
This support needs to be improved:
Use SAS_NUM_THREADS rather than OMP_NUM_THREADS.