429 Too Many Requests #388
Replies: 7 comments
-
Hi @arnomoonens, it looks like you've hit OpenAI's API rate limit. There's not much we can do about that directly, but, as for all models, you can increase the …
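As general background (not specific to this project), a common client-side way to soften 429 responses is retrying with exponential backoff. A minimal, library-agnostic sketch; the error type, the stubbed API call, and the tiny delays are placeholders (real code would catch the client's own rate-limit exception and use delays of a second or more):

```python
import random
import time

def call_with_backoff(fn, max_retries=5, base_delay=0.01):
    """Retry fn() with exponential backoff when it signals a rate limit."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError as err:  # stand-in for your client's 429 error type
            if "429" not in str(err) or attempt == max_retries - 1:
                raise
            # Wait base_delay * 2^attempt plus a little jitter, then retry.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Stub API call that returns 429 twice before succeeding.
calls = {"n": 0}
def flaky_api_call():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = call_with_backoff(flaky_api_call)
```

This doesn't raise your rate limit, but it keeps a long pipeline run from dying on a transient 429.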
-
Thanks for the information.
-
Yeah, that's confusing. Can you let me know the exact error message you're getting with the …
-
Thanks! We've added this to our backlog and will improve the error message for this case.
-
Hi, I'm the maintainer of LiteLLM. We provide an open-source proxy for load balancing across Azure OpenAI, OpenAI, and any LiteLLM-supported LLM. From this thread it looks like you're running into 429 rate-limit errors. Our proxy lets you maximize throughput by load balancing between Azure OpenAI instances. I hope our solution makes this easier for you (I'd love feedback if you try it).

Here's the quick start. Docs: https://docs.litellm.ai/docs/simple_proxy#load-balancing---multiple-instances-of-1-model

Step 1: Create a config.yaml:

model_list:
  - model_name: gpt-4
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key:
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key:
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/
  - model_name: gpt-4
    litellm_params:
      model: azure/gpt-4
      api_key:
      api_base: https://openai-gpt-4-test-v-2.openai.azure.com/

Step 2: Start the LiteLLM proxy.

Step 3: Make a request to the LiteLLM proxy.
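The commands for steps 2 and 3 didn't survive in this thread. As a hedged sketch (the proxy address and port are assumptions; check the LiteLLM docs linked above): the proxy is typically started with `litellm --config config.yaml`, after which it exposes an OpenAI-compatible HTTP API you can call with any client. The snippet below only builds such a request rather than sending it, since it assumes a locally running proxy:

```python
import json

# Step 2 (assumption): start the proxy with the config above, e.g.
#   litellm --config config.yaml
# which exposes an OpenAI-compatible HTTP API locally.

PROXY_BASE = "http://0.0.0.0:4000"  # assumed local address; verify in LiteLLM docs

def build_chat_request(model, user_message):
    """Build an OpenAI-style /chat/completions request for the proxy."""
    return {
        "url": f"{PROXY_BASE}/chat/completions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

request = build_chat_request("gpt-4", "Hello")
# Send with any HTTP client, e.g. requests.post(request["url"], ...)
```

The proxy then routes each "gpt-4" request to one of the three Azure deployments in the config.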
-
Thanks for letting us know! We'll take it into consideration. We'll also convert this into a discussion, as it's not a bug/feature request per se.
-
I am getting a 429 error from the OpenAI API when iterating over a model.pipe result. I am using @llm_tasks = "spacy.NER.v2" and …

This is the traceback: …

Could you please help with this? Is there some kind of throttling mechanism that could resolve this?
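On the throttling question: one generic client-side option, independent of any particular library, is to enforce a minimum interval between successive requests so you stay under the provider's requests-per-minute limit. A minimal sketch (the 0.05 s interval is just for illustration; in practice you would derive it from your actual rate limit):

```python
import time

class MinIntervalThrottle:
    """Enforce a minimum interval between successive calls."""

    def __init__(self, interval):
        self.interval = interval
        self._last = 0.0

    def wait(self):
        # Sleep just long enough so calls are at least `interval` apart.
        remaining = self.interval - (time.monotonic() - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()

throttle = MinIntervalThrottle(interval=0.05)
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # in real use, call this before each API request
elapsed = time.monotonic() - start
```

For a 60-requests-per-minute limit, for example, an interval of 1.0 second would keep the loop under the cap.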