This repository has been archived by the owner on Oct 16, 2023. It is now read-only.

OPT inference #198

Open
Joanna-0421 opened this issue Feb 28, 2023 · 2 comments

Comments

@Joanna-0421

Hello,
I want to just run inference with a pre-trained model in the terminal, but I don't want to run an HTTP server. How can I do that?

@binmakeswell
Member

Hi @Joanna-0421, if you don't need an HTTP service, there is no need to use EnergonAI; you can just use OPT in Colossal-AI. Thanks.
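
For purely local, terminal-based generation, a minimal sketch using the Hugging Face `transformers` OPT checkpoints would look like the following (the `facebook/opt-1.3b` size and the sampling settings are just examples):

```python
# Minimal local OPT inference, no HTTP server involved.
# Requires the `transformers`, `torch`, and `accelerate` packages;
# any facebook/opt-* checkpoint size can be substituted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # halve memory when running on GPU
    device_map="auto",          # place weights on available devices (needs accelerate)
)

prompt = "The benefits of open-source large language models include"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```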

@irasin

irasin commented Apr 4, 2023

Hi @binmakeswell,
Using EnergonAI instead of Colossal-AI should speed up inference on a local machine through features such as non-blocking pipeline parallelism, redundant-padding elimination, and GPU offload, right?

If I do want to run OPT inference on a local machine instead of through an HTTP service, how should we modify opt_server.py? Can you give us some examples?
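
Would something like the sketch below work? It keeps the `launch_engine` / `submit` / `wait` pattern from the repo's OPT example but drops the FastAPI layer; the `model_fn` body, parallel sizes, ports, and input format here are placeholders rather than a verified configuration:

```python
# Rough sketch: drive EnergonAI's engine from a terminal script instead of
# exposing it over FastAPI. launch_engine/submit/wait mirror the pattern in
# the OPT example; model_fn, sizes, ports, and the input dict are assumptions.
import asyncio
from uuid import uuid4

from energonai import launch_engine


def model_fn(**model_kwargs):
    # Placeholder: build and return the OPT model exactly as opt_server.py does.
    raise NotImplementedError


async def main():
    engine = launch_engine(
        tp_world_size=1,
        pp_world_size=1,
        master_host="localhost",
        master_port=19990,
        rpc_port=19980,
        model_fn=model_fn,
    )
    uid = uuid4()
    # The payload format depends on the batch manager used in the example.
    engine.submit(uid, {"prompt": "Hello, my name is", "max_tokens": 64})
    output = await engine.wait(uid)
    print(output)
    engine.shutdown()  # mirrors the shutdown hook in the FastAPI example


if __name__ == "__main__":
    asyncio.run(main())
```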
