add test
ngxson committed Sep 3, 2024
1 parent ba0065f commit 852f654
Showing 1 changed file with 29 additions and 0 deletions.
29 changes: 29 additions & 0 deletions examples/server/tests/features/parallel.feature
@@ -77,6 +77,35 @@ Feature: Parallel
| disabled | 128 |
| enabled | 64 |

Scenario Outline: Multi users with number of prompts exceeding number of slots
Given a system prompt You are a writer.
And a model tinyllama-2
Given a prompt:
"""
Write a very long book.
"""
And a prompt:
"""
Write another a poem.
"""
And a prompt:
"""
What is LLM?
"""
And a prompt:
"""
The sky is blue and I love it.
"""
And <n_predict> max tokens to predict
And streaming is <streaming>
Given concurrent OAI completions requests
Then the server is busy
Then the server is idle
Then all prompts are predicted with <n_predict> tokens
Examples:
| streaming | n_predict |
| disabled | 128 |
| enabled | 64 |
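
The scenario above sends four prompts concurrently and expects the server to go busy, drain back to idle, and return `<n_predict>` tokens for every prompt. The following is a minimal, self-contained sketch of that queueing behaviour, not the project's step implementation: `FakeServer`, `run_scenario`, and the slot count of 2 are all hypothetical (the real slot count comes from the feature's Background, which is not shown in this hunk), and the "tokens" are placeholders rather than model output.

```python
import asyncio

# Prompts copied verbatim from the scenario.
PROMPTS = [
    "Write a very long book.",
    "Write another a poem.",
    "What is LLM?",
    "The sky is blue and I love it.",
]


class FakeServer:
    """Toy stand-in for a server with a fixed number of slots (hypothetical)."""

    def __init__(self, n_slots):
        self.n_slots = n_slots
        self._slots = asyncio.Semaphore(n_slots)
        self.processing = 0        # requests currently holding a slot
        self.peak_processing = 0   # highest value `processing` ever reached
        self.deferred = 0          # requests that had to wait for a slot

    async def complete(self, prompt, n_predict):
        if self._slots.locked():
            # No free slot: the request is queued until one is released.
            self.deferred += 1
        async with self._slots:
            self.processing += 1
            self.peak_processing = max(self.peak_processing, self.processing)
            tokens = []
            for i in range(n_predict):
                await asyncio.sleep(0)  # yield so other requests interleave
                tokens.append(f"tok{i}")
            self.processing -= 1
        return tokens

    @property
    def idle(self):
        return self.processing == 0


async def run_scenario(n_slots=2, n_predict=8):
    """Send all prompts at once and report what the scenario asserts on."""
    server = FakeServer(n_slots)
    completions = await asyncio.gather(
        *(server.complete(p, n_predict) for p in PROMPTS)
    )
    return {
        "completions": completions,
        "peak_processing": server.peak_processing,
        "deferred": server.deferred,
        "idle": server.idle,
    }


if __name__ == "__main__":
    result = asyncio.run(run_scenario())
    print({k: v for k, v in result.items() if k != "completions"})
```

With more prompts than slots, the surplus requests wait rather than fail, which is the property the new scenario exercises: the peak number of in-flight requests stays at or below the slot count, and every prompt still receives its full token budget once the server is idle again.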

Scenario: Multi users with total number of tokens to predict exceeds the KV Cache size #3969
Given a prompt: