Enable parallel instance loading backend attribute #208
Conversation
Can you add a comment at the beginning of TRITONBACKEND_ModelInstanceInitialize to let a developer know that TRITONBACKEND_ModelInstanceInitialize may be called concurrently and hence should be thread-safe?
Lastly, a section in the backend README pointing out that instances of the model will be loaded in parallel would be useful.
Similarly for the other backends.
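For illustration, a minimal sketch of what a thread-safe TRITONBACKEND_ModelInstanceInitialize can look like once instances load in parallel. The ModelState struct and its members here are hypothetical stand-ins for backend-defined state, not this backend's actual code; TRITONBACKEND_ModelInstanceModel and TRITONBACKEND_ModelState are the core API calls used to reach that state:

```cpp
#include <mutex>

#include "triton/core/tritonbackend.h"

// Hypothetical backend-defined state shared by all instances of one model.
struct ModelState {
  std::mutex init_mutex;   // guards mutation of cross-instance state
  int instance_count = 0;  // example of shared, mutable state
};

extern "C" TRITONSERVER_Error*
TRITONBACKEND_ModelInstanceInitialize(TRITONBACKEND_ModelInstance* instance)
{
  // NOTE: With parallel instance loading enabled, this function may be
  // called concurrently for different instances of the same model and
  // must therefore be thread-safe.
  TRITONBACKEND_Model* model;
  TRITONSERVER_Error* err = TRITONBACKEND_ModelInstanceModel(instance, &model);
  if (err != nullptr) {
    return err;
  }

  void* vstate;
  err = TRITONBACKEND_ModelState(model, &vstate);
  if (err != nullptr) {
    return err;
  }
  auto* model_state = reinterpret_cast<ModelState*>(vstate);

  // Serialize any touch of state shared across instances.
  {
    std::lock_guard<std::mutex> lk(model_state->init_mutex);
    model_state->instance_count++;
  }
  return nullptr;  // success
}
```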
@pranavsharma FYI: we are relaxing model loading in a way that can result in concurrent creation of multiple ORT sessions.
…called concurrently and should be thread-safe
```diff
@@ -2674,6 +2674,10 @@ TRITONBACKEND_ModelFinalize(TRITONBACKEND_Model* model)
 TRITONBACKEND_ISPEC TRITONSERVER_Error*
 TRITONBACKEND_ModelInstanceInitialize(TRITONBACKEND_ModelInstance* instance)
 {
+  // NOTE: If the corresponding TRITONBACKEND_BackendAttribute is enabled by the
```
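For context, the opt-in that this note refers to lives in the backend's TRITONBACKEND_GetBackendAttribute hook. A rough sketch follows, assuming the setter introduced for this feature is TRITONBACKEND_BackendAttributeSetParallelModelInstanceLoading; verify the exact name against tritonbackend.h in your Triton release:

```cpp
#include "triton/core/tritonbackend.h"

extern "C" TRITONSERVER_Error*
TRITONBACKEND_GetBackendAttribute(
    TRITONBACKEND_Backend* backend,
    TRITONBACKEND_BackendAttribute* backend_attributes)
{
  // Report to the Triton core that this backend tolerates concurrent
  // creation of its model instances.
  return TRITONBACKEND_BackendAttributeSetParallelModelInstanceLoading(
      backend_attributes, true /* enabled */);
}
```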
@tanmayv25 added note here as requested.
Also added some generic BackendAttribute docs in the backend repo here: triton-inference-server/backend#87 - please review as well.
The main area of concern appears to be the model_state->LoadModel call made by each instance.
However, this function appears well protected: most actions are performed on a per-instance cloned session, and the one called-out area of concern, OpenVINO, is already guarded by a lock.
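To make that pattern concrete, here is a hypothetical sketch of the clone-per-instance approach with the shared step serialized under a lock. All names (Session, LoadModel, session_mutex, base_session) are illustrative, not the backend's real members:

```cpp
#include <memory>
#include <mutex>

// Illustrative stand-in; the real backend uses its own session types.
struct Session {};

struct ModelState {
  std::mutex session_mutex;               // serializes the shared step
  std::shared_ptr<Session> base_session;  // created once, then cloned

  // Called from every instance's initialization; must tolerate
  // concurrent callers once instances load in parallel.
  std::shared_ptr<Session> LoadModel()
  {
    {
      // The one shared, non-reentrant step stays under a lock
      // (the "called out area of concern" noted above).
      std::lock_guard<std::mutex> lk(session_mutex);
      if (base_session == nullptr) {
        base_session = std::make_shared<Session>();
      }
    }
    // Each instance then works on its own clone, so the per-instance
    // path needs no further locking.
    return std::make_shared<Session>(*base_session);
  }
};
```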
Corresponding tests: triton-inference-server/server#6126
Backend docs: triton-inference-server/backend#87