What is the future plan of model expansion? #1380

jenniew · 2024-11-15T23:33:01Z

🚀 The feature, motivation and pitch

I see that torchchat currently supports only a few kinds of models, such as Llama-based (or Llama-like) architectures and pre-defined Transformer architecture models. Is there any plan to support other model architectures in the future? Which kinds of models are you considering adding? If a new model's architecture is not on the supported list, is there a way to run it?

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

mikekgfb · 2024-11-18T06:19:31Z

My personal take on how torchchat might support a broader set of models:

Because the model description is part of the torchchat tree, the models that can be supported are naturally limited to those that fit the general infrastructure torchchat provides.

Of course, the model.py could be made arbitrarily complex, but that doesn't seem desirable. I can see three possible directions:
1 - add additional model-variant.py files for other types. This ultimately triggers the same limitation, because the number of models that may be supported is limited by the number of models distributed. It may also involve rights issues, because some of these models may contain copyrighted or patented portions.
2 - build models from GGUF, following the --gguf-path approach as per docs/GGUF.md
3 - allow users to bring their own model descriptions.

(2) requires GGUF import to track new features, and limits models to those supported by GGUF.
(3) allows users to build new models, but requires integration for tokenization and for export (e.g., the HF cache is at present not exportable via AOTI and/or ET afaik).

Here's an attempt at implementing a solution that allows users to bring their own models (does not support export, and sidesteps the query formatting by adding support for and using pre-tokenized text inputs) for phi-3-mini:
https://github.com/mikekg/torchchat/tree/phichat

This introduces an option --custom-builder, which can be used with the following invocation:

python torchchat.py generate --custom-builder torchchat/model_python/phi-3-mini.py:model_builder --tokenizer-path /content/torchchat/tokenizer.model --prompt "[32010, 739, 471, 263, 6501, 322, 14280, 29891, 4646, 29892, 322, 32007, 2]"
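For readers wondering how a `file.py:function` argument like the one above could be resolved, here is a minimal sketch. It is an assumption about the general technique, not the code in the linked branch: `load_builder` is a hypothetical helper name, and the only thing taken from the invocation is the `path:function_name` format.

```python
import importlib.util
from pathlib import Path
from typing import Callable


def load_builder(spec: str) -> Callable:
    """Resolve a 'path/to/file.py:function_name' string to a callable.

    Hypothetical helper: the spec format mirrors the --custom-builder
    argument shown above, not the actual implementation in the branch.
    """
    file_part, sep, func_name = spec.rpartition(":")
    if not sep or not file_part or not func_name:
        raise ValueError(f"expected 'file.py:function', got {spec!r}")

    path = Path(file_part)
    # File names such as phi-3-mini.py are not valid module names,
    # so load the file by path instead of using a normal import.
    module_spec = importlib.util.spec_from_file_location(
        path.stem.replace("-", "_"), path
    )
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)

    builder = getattr(module, func_name, None)
    if not callable(builder):
        raise AttributeError(f"{path} does not define a callable {func_name!r}")
    return builder
```

Loading by file path rather than by module name is what lets the builder live in a file whose name contains hyphens.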

Example run:
https://colab.research.google.com/drive/1HHONUbKqqXU9yU3BIrjH0dRWKdwgY34H?usp=sharing
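The `--prompt` value in the invocation is a pre-tokenized input: a bracketed list of token IDs rather than text. A minimal sketch of how such a string could be parsed safely follows; `parse_pretokenized` is a hypothetical name, and the branch may handle this differently.

```python
import ast
from typing import List


def parse_pretokenized(prompt: str) -> List[int]:
    """Parse a prompt such as "[32010, 739, 2]" into a list of token IDs."""
    try:
        # literal_eval only accepts Python literals, so it cannot run code.
        value = ast.literal_eval(prompt.strip())
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"not a literal list of token IDs: {prompt!r}") from exc
    # type(t) is int rejects booleans, which are a subclass of int.
    if not isinstance(value, list) or not all(
        type(t) is int and t >= 0 for t in value
    ):
        raise ValueError(f"expected a list of non-negative integers: {prompt!r}")
    return value


token_ids = parse_pretokenized(
    "[32010, 739, 471, 263, 6501, 322, 14280, 29891, 4646, 29892, 322, 32007, 2]"
)
print(len(token_ids), token_ids[0], token_ids[-1])  # prints: 13 32010 2
```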

To make it exportable, we'd want to avoid using components that can't be exported (likely the HF cache, possibly others), either by changing the source code directly or by using a model rewrite for those components, similar to what torchchat uses today to apply quantization for AOTI and ET, or to introduce the ET optimization sdpa_with_kv_cache for mobile backends.
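As an illustration of the rewrite-based option, here is a minimal sketch of a recursive module swap. It is not torchchat's rewrite code: `swap_modules` and its two callbacks are hypothetical names, and the function is duck-typed against the `named_children()` method that `torch.nn.Module` provides, so that it does not depend on any particular model class.

```python
from typing import Any, Callable


def swap_modules(
    root: Any,
    should_replace: Callable[[Any], bool],
    make_replacement: Callable[[Any], Any],
) -> int:
    """Recursively replace matching child modules in place.

    Works on any object exposing named_children() the way
    torch.nn.Module does. Returns the number of modules replaced.
    """
    replaced = 0
    # Snapshot the children first so reassigning attributes is safe.
    for name, child in list(root.named_children()):
        if should_replace(child):
            setattr(root, name, make_replacement(child))
            replaced += 1
        else:
            replaced += swap_modules(child, should_replace, make_replacement)
    return replaced
```

With a PyTorch model, `should_replace` would typically be an `isinstance` check against the non-exportable component, and `make_replacement` would build an exportable equivalent from the original module's weights and configuration.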

Jack-Khuu · 2024-11-18T21:29:02Z

Great Question @jenniew.

Like you mentioned, model support is currently biased towards Llama/Transformer architectures, but we intend for the inference pipeline to be model-agnostic. The upcoming models are Llava and Granite Code Models (though both are Transformer-based), with Mamba (SSM) on my radar.

The ultimate plan is to create a simple interface between Model Definitions (architecture, compile, export) and Inference Pipeline (generate, chat, browser, openai api) such that onboarding becomes easier (e.g. leaning on torchtune for models instead of hosting it ourselves).
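To make the idea of such an interface concrete, here is a purely illustrative sketch. None of these names exist in torchchat: `ModelDefinition`, `next_token`, `generate`, and `CountingModel` are all assumptions, chosen only to show a pipeline that depends on a small protocol instead of a specific architecture.

```python
from typing import Iterator, List, Protocol


class ModelDefinition(Protocol):
    """What the pipeline needs from a model (hypothetical interface)."""

    def next_token(self, token_ids: List[int]) -> int:
        """Return the next token ID given the tokens seen so far."""
        ...


def generate(
    model: ModelDefinition,
    prompt_ids: List[int],
    max_new_tokens: int,
    eos_id: int,
) -> Iterator[int]:
    """Decoding loop that relies only on the ModelDefinition protocol."""
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = model.next_token(tokens)
        if token == eos_id:
            break
        tokens.append(token)
        yield token


class CountingModel:
    """Toy stand-in: always predicts the previous token plus one."""

    def next_token(self, token_ids: List[int]) -> int:
        return token_ids[-1] + 1


print(list(generate(CountingModel(), [1, 2, 3], max_new_tokens=4, eos_id=6)))
# prints: [4, 5]
```

Because `generate` never inspects the model's type, a Transformer, an SSM, or a model defined in another library could be plugged in as long as it satisfies the protocol.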

@mikekgfb shows a promising approach above, and also mentions GGUF as another option.

@jenniew If you have a particular model/architecture/artifact in mind, you can share it here or send me a message, and we can give more detailed suggestions.

byjlw · 2024-11-19T21:38:05Z

As @Jack-Khuu mentioned, we need to make some architecture changes and create a model-adding flow so it's easy for anyone to add models.

In the meantime, feel free to ask for a specific model.

Jack-Khuu self-assigned this Nov 16, 2024

Jack-Khuu added the enhancement (New feature or request) and Question (Question about the repo as a whole) labels Nov 16, 2024

Jack-Khuu added the triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module) label Dec 17, 2024
