Add asyncio support for tritonclient (beta) #23

Merged: 3 commits, Dec 4, 2024

@kimdwkimdw (Member) commented Nov 20, 2024

Overview

This PR adds asyncio support to the Triton client library by integrating tritonclient's asyncio (aio) modules. According to the official documentation, tritonclient's Python asyncio support is currently in beta, so the API and behavior exposed here may change. The implementation enables fully asynchronous inference requests, which can improve throughput for concurrent workloads.

Key Changes

  • Added a create_with_asyncio() factory method that creates an async-capable InferenceClient
  • Implemented async client initialization and model configuration fetching
  • Added a new aio_infer() method for async inference requests
  • Added test coverage for the async functionality
  • Added a new sample model (sample_sleep_1sec) for testing concurrent requests
  • Updated the Docker image to 24.05-pyt-python-py3
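
The PR text does not show how async initialization and model configuration fetching are wired together. As a rough illustration only, one common pattern is to keep the factory synchronous and fetch the configuration lazily on the first awaited call. Everything below is a hypothetical stand-in: SketchAsyncClient, _fetch_config, _ensure_config, and the config fields are assumptions, not the code merged in this PR.

```python
import asyncio


class SketchAsyncClient:
    """Hypothetical stand-in; not the InferenceClient from this PR."""

    def __init__(self, model: str, url: str) -> None:
        self.model = model
        self.url = url
        self.config_fetches = 0  # counts simulated config round trips
        self._config = None      # filled in lazily on first use
        self._config_lock = asyncio.Lock()

    @classmethod
    def create_with_asyncio(cls, model: str, url: str) -> "SketchAsyncClient":
        # The factory can stay synchronous: nothing is awaited here.
        return cls(model, url)

    async def _fetch_config(self) -> dict:
        # Placeholder for an awaited request for the model configuration.
        await asyncio.sleep(0.01)
        self.config_fetches += 1
        return {"name": self.model, "max_batch_size": 8}

    async def _ensure_config(self) -> dict:
        # The lock keeps concurrent first calls from fetching the config twice.
        async with self._config_lock:
            if self._config is None:
                self._config = await self._fetch_config()
        return self._config

    async def aio_infer(self, data: list) -> dict:
        config = await self._ensure_config()
        return {"model": config["name"], "outputs": data}


async def demo() -> SketchAsyncClient:
    # The client is created inside the running event loop.
    client = SketchAsyncClient.create_with_asyncio("model_name", "localhost:8001")
    await asyncio.gather(*(client.aio_infer([i]) for i in range(5)))
    return client
```

Running asyncio.run(demo()) issues five concurrent calls while the simulated configuration fetch happens only once.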

Test Results

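  • Basic async inference tests pass
  • Concurrent request tests show that requests run in parallel
  • A performance test with the sample_sleep_1sec model shows ~10 concurrent requests completing in under 2 seconds; if each request takes about 1 second, as the model name suggests, the same requests issued one after another would need roughly 10 seconds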
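
The kind of measurement described above can be approximated without a Triton server. The sketch below uses a hypothetical coroutine, fake_sleep_model, in place of a real request to sample_sleep_1sec, and a scaled-down delay so it finishes quickly. It only illustrates client-side overlap with asyncio.gather; it is not the PR's actual test.

```python
import asyncio
import time


async def fake_sleep_model(request_id: int, delay: float) -> int:
    """Stand-in for one request to a model that sleeps before answering."""
    await asyncio.sleep(delay)
    return request_id


async def run_concurrently(n_requests: int, delay: float) -> tuple:
    """Issue all requests at once and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(fake_sleep_model(i, delay) for i in range(n_requests))
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_concurrently(10, 0.1))
    print(f"{len(results)} requests finished in {elapsed:.2f}s")
```

With a real server, the observed speedup also depends on the server handling requests in parallel, so the numbers from this stand-in should not be read as a benchmark.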

Usage Example

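import asyncio

from tritony import InferenceClient  # assumed package name; this PR does not state the import path


async def main(input_data):
    # Create async client
    client = InferenceClient.create_with_asyncio(
        "model_name",
        "localhost:8001",
        protocol="grpc",
    )

    # Run async inference; `await` is only valid inside a coroutine
    return await client.aio_infer(input_data)


# result = asyncio.run(main(input_data))  # input_data: the model's input array(s)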

Implementation Notes

  • Uses native asyncio support from tritonclient's aio modules (beta feature)
  • Maintains backwards compatibility with the existing sync interfaces
  • Cleans up resources and handles errors for async operations
  • Configurable via the same parameters as the sync client
  • Beta status means the API and behavior may change in future releases
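
The notes above do not spell out how cleanup is implemented. A general pattern for guaranteeing cleanup around async calls is an async context manager with try/finally, sketched below. SketchClient, its close() method, and open_client are hypothetical names used for illustration; they are not taken from this PR.

```python
import asyncio
from contextlib import asynccontextmanager


class SketchClient:
    """Hypothetical client; only the close() bookkeeping matters here."""

    def __init__(self) -> None:
        self.closed = False

    async def aio_infer(self, data):
        if data is None:
            raise ValueError("input_data must not be None")
        await asyncio.sleep(0)  # yield to the event loop, as a real request would
        return data

    async def close(self) -> None:
        self.closed = True


@asynccontextmanager
async def open_client():
    client = SketchClient()
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises.
        await client.close()


async def demo() -> SketchClient:
    seen = []
    try:
        async with open_client() as client:
            seen.append(client)
            await client.aio_infer(None)  # triggers the error path
    except ValueError:
        pass
    return seen[0]
```

In this sketch the client is closed even though aio_infer raised, which is the property the cleanup bullet is after.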

Testing Notes

  • Added pytest-asyncio for async test support
  • New test cases specifically for async functionality
  • Concurrent request testing with artificial delays

@homura-rtzr left a comment

LGTM

@ancom21c left a comment

LGTM 👍

@kimdwkimdw merged commit 58933e2 into main on Dec 4, 2024
1 check passed
@kimdwkimdw deleted the feature/support_aio_tritonclient branch on December 4, 2024, 06:51