Add support for token streaming, parallel jobs and custom CORS #4

Merged: 12 commits on Jul 11, 2024

Collaborator

@Blaizzy Blaizzy commented Jul 10, 2024

This PR adds:

  1. Multi-modal token streaming.
  2. Support for parallel calls (single and multiple models) by default, up to N workers.
  3. Supported model type endpoint.
  4. Delete model endpoint.
  5. Custom CORS.
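
To make items 1 and 2 concrete, here is a minimal standard-library sketch of the general pattern: tokens are yielded one at a time, and jobs run concurrently under a fixed worker cap. It is not the code in this PR. `stream_tokens`, `run_job`, `run_parallel`, and the `MAX_WORKERS` value are hypothetical names chosen for illustration, and the "model" is a stand-in that just splits the prompt into words.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List

MAX_WORKERS = 2  # hypothetical value for N; the PR does not state the default


def stream_tokens(prompt: str, max_tokens: int = 5) -> Iterator[str]:
    """Stand-in for a model: yield one token at a time instead of a full reply."""
    for token in prompt.split()[:max_tokens]:
        time.sleep(0.01)  # simulate per-token generation latency
        yield token


def run_job(prompt: str) -> List[str]:
    """Consume a token stream; a server would forward each token as it arrives."""
    return list(stream_tokens(prompt))


def run_parallel(prompts: List[str]) -> List[List[str]]:
    """Run jobs concurrently, never more than MAX_WORKERS at a time."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(run_job, prompts))  # results keep input order


if __name__ == "__main__":
    for tokens in run_parallel(["describe this image", "what is in the picture"]):
        print(" ".join(tokens))
```

Capping the pool size bounds how many generations run at once, so extra requests queue instead of competing for memory. A real server would likely stream each token to the client as it is produced rather than collecting the list first.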

Todo:

  • Refactor stream_generate after mlx-vlm's next release

Closes #2, Closes #5

@Blaizzy Blaizzy mentioned this pull request Jul 10, 2024
@Blaizzy Blaizzy changed the title from "Add support for token streaming and custom CORS" to "Add support for token streaming, parallel jobs and custom CORS" Jul 11, 2024
@Blaizzy Blaizzy merged commit 97d468c into main Jul 11, 2024
1 check passed
@Blaizzy Blaizzy deleted the pc/streaming-cors branch August 24, 2024 05:26
Development

Successfully merging this pull request may close these issues:

  • max_tokens not overriding the default
  • Cross origin support