Use of LiteLLM Router for load balancing and fallbacks #1570

denisergashbaev · 2024-10-01T17:48:11Z

As DSPy is using LiteLLM internally, I wonder how to use the LiteLLM router. In particular, I would like to add load balancing and fallbacks via LiteLLM.

Another example. LiteLLM provides rate limit aware routing strategy that routes the call to the deployment with the lowest tokens per minute value (see BerriAI/litellm#4510, https://docs.litellm.ai/docs/routing#advanced---routing-strategies-%EF%B8%8F). I would want to use the router

Thank you

okhat · 2024-10-01T19:46:10Z

Thanks! Maybe just launch their server and connect to it via the client dspy.LM? i.e., DSPy doesn't need to be invovled

okhat · 2024-10-02T00:57:43Z

Link: https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb

denisergashbaev · 2024-10-02T07:21:55Z

Thank you, Omar! I think it makes sense to expose router of LiteLLM as well. Under some circumstances, one would not want to run a separate proxy server and prefer using the LiteLLM Router for fallbacks, load balancing, and request prioritization

okhat · 2024-10-02T08:38:30Z

@denisergashbaev Sorry, I'm mistaken about the nature of the LiteLLM router. I assumed it was inherently a proxy.

It's actually just a client-side thing, indeed: https://docs.litellm.ai/docs/routing

okhat · 2024-10-02T08:41:55Z

Yes, I think we should support this. It seems like we should inherit dspy.LM and just accept a list of models instead of one model. This seems cool. Do you need this soon? We'd certainly appreciate a PR.

zhaohan-dong · 2024-10-08T08:56:23Z

Down to work on this if needed. @okhat maybe another function in dspy.LM similar to litellm_completion? Can also do a router kwarg to pass in a litellm.Router object.

okhat · 2024-10-08T16:02:53Z

@zhaohan-dong Thanks a lot! How do you envision the interface looking like? Let's agree on the right API before doing anything intensive :D

zhaohan-dong · 2024-10-08T18:01:34Z

Exactly what I hoped to ascertain I think litellm tries to have similar signature for Router as the plain generation methods. So I'm thinking it could be progressively do dependency injection as first step:

An argument router: Optional[litellm.Router] = None
If router != None, the inference call would invoke router.text_completion()
If model not in the router's model_list, throw error (Not 100% this is the cleanest way)

Otherwise could inherit LM with AnotherClass(model_list=model_list, **kwargs), so people who wnat to use the router would use that, and not affecting other users. Not sure what's the best naming/file to put it in.

Happy to proceed either way or do something else you'd suggest.

Fundamentally I see the plain litellm.text_generation() as a special case of Router.text_generation(), where there's only one model in the model_list.

okhat · 2024-10-10T08:44:40Z

Thanks a lot @zhaohan-dong ! I like the idea of a class that inherits from dspy.LM

zhaohan-dong · 2024-10-10T08:47:42Z

Awesome! Maybe RoutedLM as name?

denisergashbaev · 2024-10-26T20:05:13Z

Thanks for the response and your willingness to help. Let me know if I could help as well.

zhaohan-dong · 2024-10-29T09:42:57Z

@denisergashbaev I tried a PR here: #1611. Dunno if you could collab?

ryanh-ai · 2024-11-23T18:50:29Z

Hi! Checking in on this, I would love to leverage this if someone has on a branch somewhere. Would be happy to write some integration tests so that it can be merged eventually.

zhaohan-dong · 2024-11-24T00:24:55Z

@ryanh-ai Have a branch here https://github.com/zhaohan-dong/dspy/tree/litellm-router

denisergashbaev changed the title ~~How to user LiteLLM router for load balancing and fallbacks?~~ Use of LiteLLM Router for load balancing and fallbacks? Oct 1, 2024

denisergashbaev changed the title ~~Use of LiteLLM Router for load balancing and fallbacks?~~ Use of LiteLLM Router for load balancing and fallbacks Oct 1, 2024

denisergashbaev mentioned this issue Oct 1, 2024

Configuring rate limiter (throttler) #1572

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use of LiteLLM Router for load balancing and fallbacks #1570

Use of LiteLLM Router for load balancing and fallbacks #1570

denisergashbaev commented Oct 1, 2024 •

edited

Loading

okhat commented Oct 1, 2024

okhat commented Oct 2, 2024

denisergashbaev commented Oct 2, 2024

okhat commented Oct 2, 2024 •

edited

Loading

okhat commented Oct 2, 2024

zhaohan-dong commented Oct 8, 2024 •

edited

Loading

okhat commented Oct 8, 2024

zhaohan-dong commented Oct 8, 2024

okhat commented Oct 10, 2024

zhaohan-dong commented Oct 10, 2024

denisergashbaev commented Oct 26, 2024

zhaohan-dong commented Oct 29, 2024

ryanh-ai commented Nov 23, 2024

zhaohan-dong commented Nov 24, 2024

Use of LiteLLM Router for load balancing and fallbacks #1570

Use of LiteLLM Router for load balancing and fallbacks #1570

Comments

denisergashbaev commented Oct 1, 2024 • edited Loading

okhat commented Oct 1, 2024

okhat commented Oct 2, 2024

denisergashbaev commented Oct 2, 2024

okhat commented Oct 2, 2024 • edited Loading

okhat commented Oct 2, 2024

zhaohan-dong commented Oct 8, 2024 • edited Loading

okhat commented Oct 8, 2024

zhaohan-dong commented Oct 8, 2024

okhat commented Oct 10, 2024

zhaohan-dong commented Oct 10, 2024

denisergashbaev commented Oct 26, 2024

zhaohan-dong commented Oct 29, 2024

ryanh-ai commented Nov 23, 2024

zhaohan-dong commented Nov 24, 2024

denisergashbaev commented Oct 1, 2024 •

edited

Loading

okhat commented Oct 2, 2024 •

edited

Loading

zhaohan-dong commented Oct 8, 2024 •

edited

Loading