add triton CustomCacheManager #55

Merged: 2 commits merged into opendatahub-io:main from fix-triton-cache-issues, Jun 18, 2024

@dtrifiro dtrifiro commented Jun 18, 2024

fixes RHOAIENG-8043

Co-authored-by: Chih-Chieh-Yang [email protected]
Signed-off-by: Thomas Parnell [email protected]

@openshift-ci openshift-ci bot requested review from rpancham and terrytangyuan June 18, 2024 11:20

openshift-ci bot commented Jun 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dtrifiro

cherry-pick of IBM/vllm#35

@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from fa8a6a2 to 097f576 June 18, 2024 11:39
@dtrifiro dtrifiro changed the title from "add triton CustomCacheManger" to "add triton CustomCacheManager" Jun 18, 2024
@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from 097f576 to c935d57 June 18, 2024 12:42

dtrifiro commented Jun 18, 2024

Merge after #56

@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from c935d57 to 26b004e June 18, 2024 13:47
@dtrifiro dtrifiro force-pushed the fix-triton-cache-issues branch from 26b004e to 3aef43e June 18, 2024 15:28
@dtrifiro dtrifiro merged commit c127b61 into opendatahub-io:main Jun 18, 2024
13 checks passed
dtrifiro and others added 2 commits June 18, 2024 17:29
Xaenalt pushed a commit that referenced this pull request Sep 18, 2024
* Add hpu syncs in model loader to prevent memory peak after loading weights

* Remove spaces

* Fix typo
prarit pushed a commit to prarit/vllm that referenced this pull request Oct 18, 2024
2 participants