
Add timeout param to all chat_async methods, add backup model param #36

Merged
merged 4 commits into main from rishabh/add-timeout on Dec 13, 2024

Conversation

rishsriv
Member

Context

We sometimes have scenarios where the LLM we are using goes down sporadically, or just has an insanely long response time. For example, Sonnet will (in about 1 in 200 requests) take 60+ seconds to generate a response that it usually generates in 5 seconds.

Solution

We want to:

  1. Add a timeout parameter to our LLM calls (the OpenAI and Anthropic SDKs support this; Google's python-genai does not yet, from what I can tell)
  2. If the original model either fails or does not return a response within the desired period, fall back to the backup model

This PR implements this solution.
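
To illustrate the intended behavior, here is a minimal sketch of the fallback pattern, assuming a hypothetical `chat_async` coroutine that takes `messages` and `model` arguments (the actual signatures in this repo may differ). Wrapping the call in `asyncio.wait_for` enforces the timeout uniformly, including for SDKs like python-genai that do not accept one natively:

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Minimal sketch of the timeout + backup-model pattern described above.
# `chat_async` is a stand-in for a provider-specific coroutine; the real
# signatures in this repo may differ.
async def chat_with_fallback(
    chat_async: Callable[..., Awaitable[Any]],
    messages: list[dict],
    model: str,
    backup_model: Optional[str] = None,
    timeout: Optional[float] = None,
) -> Any:
    try:
        # Enforce the timeout at the asyncio level, which works even for
        # SDKs (e.g. python-genai) that don't take a timeout natively.
        return await asyncio.wait_for(
            chat_async(messages=messages, model=model), timeout=timeout
        )
    except Exception:
        # Covers asyncio.TimeoutError as well as provider/API errors.
        if backup_model is None:
            raise
        # The primary model failed or timed out; retry once with the backup.
        return await asyncio.wait_for(
            chat_async(messages=messages, model=backup_model), timeout=timeout
        )
```

For the OpenAI and Anthropic SDKs specifically, the timeout can also be passed to the client itself (both `AsyncOpenAI` and `AsyncAnthropic` accept a `timeout` argument at construction), so a slow request fails fast inside the SDK rather than being cancelled from the outside.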

@rishsriv rishsriv merged commit 2aa9110 into main Dec 13, 2024
1 check passed
@rishsriv rishsriv deleted the rishabh/add-timeout branch December 13, 2024 08:59