
LLM: Refactor Pipeline-Parallel-FastAPI example #11319

Merged · 27 commits · Jun 25, 2024

Conversation

xiangyuT (Contributor)

Description

  • Use AutoModelForCausalLM.from_pretrained to load the pipeline-divided model
  • Add a /generate_stream endpoint
  • Support the OpenAI-formatted API
  • Add a Gradio WebUI
  • Add benchmark.py for streaming benchmarks
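As context for the /generate_stream and OpenAI-formatted API items above, here is a minimal, hedged sketch (not the PR's actual implementation) of how a streaming endpoint typically formats generated tokens as OpenAI-style server-sent-event chunks; the function names and the model name are illustrative only, and it uses only the standard library.

```python
import json


def format_sse_chunk(token: str, model: str = "demo-model") -> str:
    """Format one generated token as an OpenAI-style streaming chunk.

    The field names mirror the OpenAI chat-completion chunk schema;
    the model name and helper names here are hypothetical.
    """
    payload = {
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{"index": 0, "delta": {"content": token}}],
    }
    return f"data: {json.dumps(payload)}\n\n"


def stream_tokens(tokens):
    """Yield one SSE chunk per token, then the terminating sentinel.

    A FastAPI /generate_stream endpoint would typically wrap a generator
    like this in a StreamingResponse.
    """
    for tok in tokens:
        yield format_sse_chunk(tok)
    yield "data: [DONE]\n\n"


chunks = list(stream_tokens(["Hello", ",", " world"]))
```

A client consuming such a stream reads each `data:` line, JSON-decodes the payload, and concatenates the `delta.content` fields until it sees the `[DONE]` sentinel.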

@glorysdj glorysdj requested a review from plusbang June 14, 2024 07:29
@xiangyuT xiangyuT marked this pull request as ready for review June 14, 2024 07:59
@xiangyuT xiangyuT changed the title [WIP] Refactor Pipeline-Parallel-FastAPI example LLM: Refactor Pipeline-Parallel-FastAPI example Jun 17, 2024
plusbang (Contributor) left a comment:

We could merge this as the first step of refactoring PP serving. We will continue to organize the code in the next PR :)

@xiangyuT xiangyuT merged commit 8ddae22 into intel:main Jun 25, 2024
28 of 29 checks passed
RyuKosei pushed a commit to RyuKosei/ipex-llm that referenced this pull request Jul 19, 2024
Initially Refactor for Pipeline-Parallel-FastAPI example
2 participants