Add support for session.use_device_allocator_for_initializers in onnxruntime_backend #294

Merged: 6 commits merged on Jan 29, 2025

Conversation

pskiran1
Member

@pskiran1 pskiran1 commented Jan 22, 2025

Resolves #166
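For readers who want to try the option, here is a minimal sketch of how the setting might be written into a Triton model configuration. It is not taken from the PR diff. It assumes the backend reads the option from a `parameters` entry in `config.pbtxt` under the same key as the PR title, and that `"1"` enables it, as ONNX Runtime documents for this session config key. The helper `render_parameter` is hypothetical and only formats text.

```python
def render_parameter(key: str, value: str) -> str:
    """Render one `parameters` entry in Triton's config.pbtxt text format."""

    # Escape backslashes and double quotes so the values stay valid string literals.
    def quote(text: str) -> str:
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return (
        "parameters {\n"
        f"  key: {quote(key)}\n"
        f"  value: {{ string_value: {quote(value)} }}\n"
        "}"
    )


# Key name taken from the PR title. Passing it through `parameters` and
# using "1" as the enabling value are assumptions, not confirmed by this page.
OPTION_KEY = "session.use_device_allocator_for_initializers"

snippet = render_parameter(OPTION_KEY, "1")
print(snippet)
```

The printed block would be appended to the model's `config.pbtxt`. Check the backend README for the exact key and accepted values before relying on it.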

@pskiran1 pskiran1 changed the title Expose session.use_device_allocator_for_initializers in onnxruntime_backend Expose session.use_device_allocator_for_initializers in onnxruntime_backend Jan 22, 2025
@pskiran1 pskiran1 marked this pull request as ready for review January 23, 2025 06:56
@pskiran1 pskiran1 changed the title Expose session.use_device_allocator_for_initializers in onnxruntime_backend Add support for session.use_device_allocator_for_initializers in onnxruntime_backend Jan 23, 2025
@tanmayv25
Contributor

@pskiran1 Do you see the memory properly being released with this change?

@pskiran1
Member Author

> @pskiran1 Do you see the memory properly being released with this change?

Yes, but not entirely: about 256 MB is still not being freed. According to the ORT engineering team in the GitHub issue linked below, the ArenaCfg options also need to be configured, and OrtCUDAProviderOptionsV2 does not currently support them.
I have raised a query with the ONNX Runtime team regarding this limitation here: microsoft/onnxruntime#12748 (comment)

@pskiran1 pskiran1 merged commit 0b4f3f0 into main Jan 29, 2025
3 checks passed
@pskiran1 pskiran1 deleted the spolisetty_arena_cfg_options branch January 29, 2025 04:52
Development

Successfully merging this pull request may close these issues.

Expose session.use_device_allocator_for_initializers in onnxruntime_backend to completely shrink arena
3 participants