
Serverless rate limiting for AI model endpoints #814

Open
ebubae opened this issue Oct 23, 2024 · 0 comments
ebubae commented Oct 23, 2024

Is your feature request related to a problem? Please describe.
The existing rate limiting solution requires the endpoint to be continually running to correctly limit requests, but our serverless infrastructure cannot easily maintain state between teardowns of the application on subsequent requests.

Describe the solution you'd like
Use Upstash rate limiting to rate limit incoming requests. More specifically, we should:

  • Rate limit incoming requests by developer token
  • Rate limit requests made with the internal main key by the incoming IP address

Describe alternatives you've considered
The existing Redis solution does not scale as well as a serverless solution and lacks built-in rate limiting functionality.
