Let's get a sample of standard retry logic with exponential backoff, etc. #469

Open
SeaDude opened this issue Aug 10, 2024 · 3 comments
Labels
status:triaged Issue/PR triaged to the corresponding sub-team type:feature request New feature request/enhancement

Comments

@SeaDude

SeaDude commented Aug 10, 2024

Description of the feature request:

There are many recipes for all sorts of functionality, but none (that I can find) that show retry logic for return codes 429, 503 and 500. I'm seeing these return codes A LOT.

What problem are you trying to solve with this feature?

More robust API calls.

Any other information you'd like to share?

This snippet successfully retries when the return code is 429 Resource Exhausted, but times out if the return code is 503 Model is Overloaded or 500 An internal error has occurred.

import google.generativeai as genai  # needed for genai.configure and genai.GenerativeModel below
from google.generativeai.types import RequestOptions
from google.api_core import retry

def submit_gemini_query(api_key, system_message, user_message, response_class):
    
    genai.configure(api_key=api_key)

    generation_config = {
        "temperature": 0,
        "max_output_tokens": 8192
    }
    
    model = genai.GenerativeModel(
        model_name="gemini-1.5-pro-latest",
        generation_config=generation_config,
        system_instruction=system_message
    )

    response = model.generate_content(user_message,
                                      request_options=RequestOptions(
                                        retry=retry.Retry(
                                            initial=10, 
                                            multiplier=2, 
                                            maximum=60, 
                                            timeout=300
                                        )
                                       )
                                    )

    return response.text
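
To make the numbers in the snippet above concrete, here is a rough sketch of the wait schedule that initial=10, multiplier=2, maximum=60 and timeout=300 imply. It uses only the standard library. The helper name backoff_caps is hypothetical, and the sketch assumes every sleep takes its full cap and that the requests themselves take no time; the real google.api_core.retry adds random jitter, so actual waits will differ.

```python
def backoff_caps(initial, multiplier, maximum, timeout):
    """Return the per-retry sleep caps (seconds) that fit inside the overall timeout.

    Hypothetical helper for illustration only: it ignores jitter and request time.
    """
    caps = []
    elapsed = 0
    delay = initial
    while elapsed + delay <= timeout:
        caps.append(delay)
        elapsed += delay
        # Grow the delay geometrically, but never beyond the configured maximum.
        delay = min(delay * multiplier, maximum)
    return caps


schedule = backoff_caps(initial=10, multiplier=2, maximum=60, timeout=300)
print(schedule)       # per-retry caps in seconds
print(sum(schedule))  # total time spent sleeping under these assumptions
```

Under these assumptions the delay reaches the 60-second ceiling on the fourth retry, so only a handful of retries fit into the 300-second budget. If the service stays overloaded for longer than that, the call gives up once the timeout is spent.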
@MarkDaoust
Contributor

It's used in this example:

    return model.generate_content(prompt, request_options={'retry': retry.Retry()})

But the cookbook could use a walkthrough of the http_options.

@MarkDaoust MarkDaoust transferred this issue from google-gemini/generative-ai-python Feb 20, 2025
@Giom-V
Collaborator

Giom-V commented Feb 20, 2025

The issue is that the new SDK doesn't support retries at the moment. We plan to write a notebook about errors and retries once it is supported.

@markmcd
Member

markmcd commented Feb 21, 2025

Agree we want this. Here is the google.api_core.retry API reference, in case anyone stumbles on this.

I've been working on this a bit recently, so here's how you can use it with the new SDK. It's not ideal; we're working on getting it built in more naturally, so we won't include this in the cookbook unless absolutely necessary (e.g. large embedding batches):

from google import genai  # new google-genai SDK; provides genai.errors used below
from google.api_core import retry

# Catch transient Gemini errors.
def is_retryable(e) -> bool:
    if retry.if_transient_error(e):
        # Good practice, but probably won't fire with the google-genai SDK
        return True
    elif (isinstance(e, genai.errors.ClientError) and e.code == 429):
        # Catch 429 quota exceeded errors
        return True
    elif (isinstance(e, genai.errors.ServerError) and e.code == 503):
        # Catch 503 model overloaded errors
        return True
    else:
        return False

@retry.Retry(predicate=is_retryable)
def do_stuff(...):
    return client.models.generate_content(...).text

do_stuff(...)

The specific errors will need tweaking but this at least gives a template to start with.
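
For anyone who wants the same pattern without depending on google.api_core, below is a minimal standard-library sketch of a retry decorator with exponential backoff and jitter. Everything in it is an assumption made for illustration: FakeApiError stands in for whatever error type your SDK raises, retry_with_backoff and its parameters are hypothetical names, and the set of retryable codes (429, 500, 503) is simply taken from the codes reported in this issue.

```python
import functools
import random
import time


class FakeApiError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status code."""

    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code


# Assumption: these are the codes worth retrying, as reported in this issue.
RETRYABLE_CODES = {429, 500, 503}


def is_retryable(e):
    """Return True when the exception carries one of the retryable codes."""
    return getattr(e, "code", None) in RETRYABLE_CODES


def retry_with_backoff(predicate, initial=1.0, multiplier=2.0, maximum=60.0,
                       max_attempts=5, sleep=time.sleep):
    """Retry the wrapped function while predicate(exception) is true."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    # Give up on the last attempt or on non-retryable errors.
                    if attempt == max_attempts or not predicate(e):
                        raise
                    # Sleep a random amount up to the current cap, then raise the cap.
                    sleep(random.uniform(0, delay))
                    delay = min(delay * multiplier, maximum)
        return wrapper
    return decorator


# Demo: fail twice with 503, then succeed. Sleeps are recorded instead of waited.
calls = {"count": 0}
slept = []


@retry_with_backoff(is_retryable, initial=0.01, sleep=slept.append)
def flaky_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeApiError(503)
    return "ok"


result = flaky_call()
print(result, calls["count"], len(slept))
```

The sleep parameter is injectable only so the demo can run instantly; in real use you would leave it as time.sleep. Capping attempts rather than total elapsed time is a design choice of this sketch, not something the SDK prescribes.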


5 participants