GrCUDA MultiGPU Pre-release

Pre-release
@gwdidonato gwdidonato released this 15 Apr 20:31
· 17 commits to GRCUDA-96-11-AD-AE-SC22 since this release

New features

  • Enabled support for multiple GPUs in the asynchronous scheduler:
    • Added the GrCUDADeviceManager component, which encapsulates the status of the multi-GPU system. It tracks the currently active GPUs, the streams and active computations associated with each GPU, and which data is up to date on each device.
    • Added the GrCUDAStreamPolicy component, which encapsulates new scheduling heuristics that select the best device for each new computation (each CUDA stream is uniquely associated with one GPU), using information such as data locality and the current load of each device. We currently support five scheduling heuristics of increasing complexity:
      • ROUND_ROBIN: simply rotate the scheduling between GPUs. Used as the initialization strategy of the other policies;
      • STREAM_AWARE: assign the computation to the device with the fewest busy streams, i.e. the device with the fewest ongoing computations;
      • MIN_TRANSFER_SIZE: select the device that requires the fewest bytes to be transferred, maximizing data locality;
      • MINMIN_TRANSFER_TIME: select the device for which the minimum total transfer time is smallest;
      • MINMAX_TRANSFER_TIME: select the device for which the maximum total transfer time is smallest.
    • Modified the GrCUDAStreamManager component to select streams using the heuristics provided by the policy manager.
    • Extended the CUDARuntime component with APIs for selecting and managing multiple GPUs.
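
To illustrate the three simplest device-selection heuristics listed above (ROUND_ROBIN, STREAM_AWARE, MIN_TRANSFER_SIZE), here is a minimal Python sketch. It is not GrCUDA's implementation: the `Device` class, its fields, and the function names are hypothetical, and the per-device state that GrCUDADeviceManager tracks is reduced to a busy-stream count and a set of up-to-date array names. Ties are broken in favor of the device that appears first in the list, which is also an assumption.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Device:
    """Hypothetical stand-in for the per-GPU state tracked by a device manager."""
    device_id: int
    busy_streams: int = 0                           # streams with ongoing computations
    up_to_date: set = field(default_factory=set)    # names of arrays valid on this GPU


def round_robin(devices, counter):
    """ROUND_ROBIN: rotate over the devices, ignoring load and data locality."""
    return devices[next(counter) % len(devices)]


def stream_aware(devices):
    """STREAM_AWARE: pick the device with the fewest busy streams."""
    return min(devices, key=lambda d: d.busy_streams)


def min_transfer_size(devices, inputs):
    """MIN_TRANSFER_SIZE: pick the device that needs the fewest bytes moved to it.

    `inputs` maps each input array name to its size in bytes; an array costs
    nothing on a device that already holds an up-to-date copy.
    """
    def bytes_to_move(device):
        return sum(size for name, size in inputs.items()
                   if name not in device.up_to_date)
    return min(devices, key=bytes_to_move)


# Two GPUs: GPU 0 is busier but already holds the larger input array.
devices = [
    Device(0, busy_streams=2, up_to_date={"x"}),
    Device(1, busy_streams=1, up_to_date={"y"}),
]
inputs = {"x": 4096, "y": 1024}

rr = count()
rr_choices = [round_robin(devices, rr).device_id for _ in range(3)]
load_choice = stream_aware(devices).device_id
locality_choice = min_transfer_size(devices, inputs).device_id
```

In this toy setup the load-based and locality-based heuristics disagree: STREAM_AWARE prefers GPU 1 because it has fewer busy streams, while MIN_TRANSFER_SIZE prefers GPU 0 because only the 1024-byte array `y` would have to be copied there.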