This repository has been archived by the owner on Oct 16, 2023. It is now read-only.

OPT inference #198

Open
Joanna-0421 opened this issue Feb 28, 2023 · 2 comments

Comments

@Joanna-0421

Hello,
I want to just run inference with a pre-trained model in the terminal, but I don't want to run an HTTP server. How can I do that?

@binmakeswell
Member

Hi @Joanna-0421, if you don't need an HTTP service, there is no need to use EnergonAI; you can just use OPT in Colossal-AI. Thanks.
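
For purely local, terminal-based generation, a minimal sketch using the Hugging Face `transformers` OPT checkpoints would look like the following (the `facebook/opt-1.3b` size and the sampling settings are just examples):

```python
# Minimal local OPT inference, no HTTP server involved.
# Requires the `transformers`, `torch`, and `accelerate` packages;
# any facebook/opt-* checkpoint size can be substituted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # halve memory when running on GPU
    device_map="auto",          # place weights on available devices (needs accelerate)
)

prompt = "The benefits of open-source large language models include"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```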

@irasin

irasin commented Apr 4, 2023

Hi @binmakeswell,
Using EnergonAI instead of Colossal-AI should speed up inference on a local machine through features such as non-blocking pipeline parallelism, redundant-padding elimination, and GPU offload, right?

If I do want to run OPT inference on a local machine instead of through an HTTP service, how should we modify opt_server.py? Can you give us some examples?
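
Would something like the sketch below work? It keeps the `launch_engine` / `submit` / `wait` pattern from the repo's OPT example but drops the FastAPI layer; the `model_fn` body, parallel sizes, ports, and input format here are placeholders rather than a verified configuration:

```python
# Rough sketch: drive EnergonAI's engine from a terminal script instead of
# exposing it over FastAPI. launch_engine/submit/wait mirror the pattern in
# the OPT example; model_fn, sizes, ports, and the input dict are assumptions.
import asyncio
from uuid import uuid4

from energonai import launch_engine


def model_fn(**model_kwargs):
    # Placeholder: build and return the OPT model exactly as opt_server.py does.
    raise NotImplementedError


async def main():
    engine = launch_engine(
        tp_world_size=1,
        pp_world_size=1,
        master_host="localhost",
        master_port=19990,
        rpc_port=19980,
        model_fn=model_fn,
    )
    uid = uuid4()
    # The payload format depends on the batch manager used in the example.
    engine.submit(uid, {"prompt": "Hello, my name is", "max_tokens": 64})
    output = await engine.wait(uid)
    print(output)
    engine.shutdown()  # mirrors the shutdown hook in the FastAPI example


if __name__ == "__main__":
    asyncio.run(main())
```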
