[v1.19.x] hmem/synapseai: Refine the error handling and warning
Currently, synapseai_init returns -FI_ENODATA when there is no
synapseai device on the platform. This causes warnings to be
printed on any non-synapseai platform. This patch improves
this by making synapseai_init return -FI_ENOSYS when all the API
calls succeed but zero devices are detected.

Signed-off-by: Shi Jin <[email protected]>
(cherry picked from commit f66e3f7)
shijin-aws committed Nov 17, 2023
1 parent 92f37b6 commit 19f79b7
Showing 1 changed file with 22 additions and 10 deletions.
32 changes: 22 additions & 10 deletions src/hmem_synapseai.c
@@ -108,20 +108,32 @@ int synapseai_init(void)

 	err = synapseai_dl_init();
 	if (err)
-		return -FI_ENODATA;
+		return err;

 	status = synapseai_ops.synInitialize();
-	if (status != synSuccess)
-		return -FI_ENODATA;
+	if (status != synSuccess) {
+		FI_WARN(&core_prov, FI_LOG_CORE,
+			"synInitialize failed: %d\n", status);
+		return -FI_EIO;
+	}

 	status = synapseai_ops.synDeviceGetCount(&device_count);
-	if (status != synSuccess || device_count == 0)
-		/*
-		 * TODO We should call destroy here to free resources allocated
-		 * in initialize, but the destroy call hangs on instances without
-		 * a habana device
-		 */
-		return -FI_ENODATA;
+
+	/*
+	 * TODO: Starting from here we should call synDestroy before
+	 * returning error to free the resources allocated in synInitialize,
+	 * but the destroy call hangs on instances without
+	 * a habana device
+	 */
+
+	if (status != synSuccess) {
+		FI_WARN(&core_prov, FI_LOG_CORE,
+			"synDeviceGetCount failed: %d\n", status);
+		return -FI_EIO;
+	}
+
+	if (device_count == 0)
+		return -FI_ENOSYS;

 	return FI_SUCCESS;
 }
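
The return-code split in this patch can be illustrated with a small standalone sketch. It is not the libfabric code: `mock_status`, `mock_hmem_init`, and `should_warn` are hypothetical names, plain `errno.h` values stand in for the `FI_*` codes, and the caller policy (stay quiet on `-ENOSYS`, warn on anything else) is an assumption inferred from the commit message rather than something shown in this diff.

```c
#include <errno.h>

/* Hypothetical stand-in for the vendor status type; not the SynapseAI API. */
enum mock_status { MOCK_SUCCESS = 0, MOCK_FAILURE = 1 };

/*
 * Mirrors the patched control flow with the API results passed in:
 *   an API call failed                    -> -EIO
 *   all calls succeeded, zero devices     -> -ENOSYS
 *   all calls succeeded, devices present  -> 0
 */
static int mock_hmem_init(enum mock_status init_status,
			  enum mock_status count_status,
			  unsigned int device_count)
{
	if (init_status != MOCK_SUCCESS)
		return -EIO;

	if (count_status != MOCK_SUCCESS)
		return -EIO;

	if (device_count == 0)
		return -ENOSYS;

	return 0;
}

/*
 * Assumed caller policy: "no device here" is an expected condition on
 * most platforms, so only other failures are worth a warning.
 */
static int should_warn(int ret)
{
	return ret != 0 && ret != -ENOSYS;
}
```

Under these assumptions, a platform without a device yields `-ENOSYS` and no warning, while a failing API call yields `-EIO` and does warrant one.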