Project Roadmap #57
Comments
Is AWQ supported?
Hey @RileyCodes, not yet, will add that to the roadmap!
Have we tested bitsandbytes quantization?
Hey @abhibst, I've done some basic sanity checks on it, but haven't tested it very thoroughly. Please feel free to report any issues you encounter and I'll take a look!
Sure, thanks for confirming.
How would you go about adding this in Stable Diffusion? I am really interested in experimenting with that.
Hey @sansavision, at a high level it would look a lot like the LoRA pipeline used in Diffusers: https://github.com/huggingface/api-inference-community/blob/main/docker_images/diffusers/app/pipelines/text_to_image.py#L25 A v0 shouldn't be too bad: we would basically just run a single forward pass to generate the image, perform postprocessing (as part of the existing Prefill step), and short-circuit the Decode step.
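To make the "single forward pass in Prefill, short-circuit Decode" idea concrete, here is a minimal toy sketch. The class and method names (`ImagePipeline`, `prefill`, `decode`) are illustrative only, not this project's actual interfaces; a real integration would call a Diffusers pipeline inside `prefill`.

```python
# Toy sketch: image generation completes entirely in the prefill step,
# so the decode step just passes the finished result through.
# All names here are hypothetical, not LoRAX internals.
class ImagePipeline:
    def prefill(self, prompt):
        # A real v0 would run a diffusers text-to-image forward pass here
        # and postprocess the resulting image.
        return {"image": f"<generated image for {prompt!r}>", "done": True}

    def decode(self, state):
        # Short-circuit: nothing to decode token-by-token for images.
        if state.get("done"):
            return state
        raise NotImplementedError("iterative decode not needed for images")

pipe = ImagePipeline()
out = pipe.decode(pipe.prefill("a cat in a hat"))
```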
If no one has started, I will start working on AWQ tomorrow.
Nice! Thanks @flozi00, that would be awesome!
Any plans to support vision transformers from Hugging Face / timm? A lot of potential use cases there for deploying many classifiers. If not, what would that entail? I would be open to contributing if possible.
Hey @SamGalanakis, great suggestion! The plan at the moment is to start by supporting text classifiers. Once that framework is in place, it should hopefully be relatively straightforward to support image classifiers as well. Happy to start a thread on Discord to discuss!
Whisper would be also very cool 😄 |
@tgaddair Ok clear, joined the discord will look out for it! |
Hi @tgaddair, could I know how long it will take to support the Stable Diffusion model?
Hey @Hap-Zhang, the plan at the moment is to add it after we add support for embedding generation and text classification. Both of those are planned for January 2024, so in the next month. |
@tgaddair Okay, got it. Thank you very much for your efforts. Stay tuned for it. |
If we could have OpenAI-compatible endpoints, that would be great too, so we can use this as a drop-in replacement for OpenAI models :)
Hey @AdithyanI, yes, this should be coming this week or next! See #145 to follow progress. |
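For readers following along, this is roughly what an OpenAI-compatible request body looks like. The model name and the `/v1/chat/completions` target path below follow the OpenAI chat-completions schema; the specific server URL and model are placeholders, not project defaults.

```python
# Hedged sketch of an OpenAI-style chat-completions request body.
# The model name and server address are placeholders (assumptions).
import json

payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.1",  # placeholder model
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)
# A drop-in OpenAI client would POST `body` to
# <your-server>/v1/chat/completions instead of api.openai.com.
```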
@tgaddair oh wow that would be awesome! Thank you so much for the work here. Is the discord still open for others to join :) ? |
@AdithyanI this should be landing some time today :) |
Hey @AdithyanI, the Discord should be available. Are you using this link? |
@tgaddair I asked the Outlines repo authors to add support for this: dottxt-ai/outlines#523. I don't know how hard it would be to integrate that here.
Thanks for starting the Outlines thread @AdithyanI! Looks like the maintainer created an issue #176. Excited to explore this integration! |
Would it be possible to add context length-scaling methods like Self-Extend, RoPE scaling, and/or YaRN scaling? I know that llama.cpp has a good implementation of these in their server, and Self-Extend in particular is much more stable than RoPE or YaRN scaling. Having long context or doing context enhancement is super important for RAG applications.
Regarding supported models, could you consider ChatGLM3? @tgaddair
It seems that LongLoRA proposed |
Do you plan on supporting AQLM to serve LoRAs of Mixtral Instruct with LoRAX?
Hey @thincal, the last thing we need to support LongLoRA, if I remember correctly, is #231 which @geoffreyangus is planning to pick up next week. @remiconnesson, we have PR #233 from @flozi00 for AQLM. It's pretty close to landing, but just needs a little additional work to finish it up. If no one else picks it up, I can probably take a look in the next week or two. |
Are T5-based models on the roadmap?
Hello :) How far do you think we are for this PR to be merged? :) |
Hey @remiconnesson, will probably be the next thing I take a look at after wrapping up speculative decoding this week. @amir-in-a-cynch we can definitely add T5 to the roadmap! |
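For anyone curious about the speculative decoding work mentioned above, here is a toy illustration of its verification step: a small draft model proposes several tokens, the target model checks them, and the first mismatch truncates the accepted prefix. Real implementations accept or reject based on probability ratios rather than exact token equality; this simplification is ours.

```python
# Toy sketch of speculative decoding's accept step (exact-match variant,
# a simplification of the real probability-ratio acceptance rule).
def accept_draft(draft_tokens, target_tokens):
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break  # first disagreement ends the accepted prefix
        accepted.append(d)
    return accepted

print(accept_draft(["the", "cat", "sat"], ["the", "cat", "ran"]))
# ['the', 'cat']
```

The payoff is that each target-model forward pass can validate multiple draft tokens at once, improving latency when the draft model agrees often.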
Hello, will you integrate / merge / migrate to the latest Hugging Face text-generation-inference now that it is back under the Apache 2.0 license?
Is there an expected release date for v0.11? |
WIP project roadmap for LoRAX. We'll continue to update this over time.
v0.10
v0.11
Previous Releases
v0.9
Backlog
Models
Adapters
Throughput / Latency
Quantization
Usability