[CANN] Fix buffer_num and runtime speed slowly error #8865

Merged 1 commit on Aug 5, 2024

wangshuai09
Contributor

Fix the following errors:

  • The queue used for calculation hangs waiting for data to be written when BUFFER_NUM > 1; its length needs to be 1.
  • An if statement inside a switch slows down inference.

Comment on lines 1670 to 1672
// TODO: fix me
// Current groupsize should not be greater than k-1 in
// aclnnWeightQuantBatchMatmulV2GetWorkspaceSize().
Owner

Is this comment still relevant after removing the respective code block?

Contributor Author

The error this comment describes has not been resolved yet, so the TODO stays until it is fixed.

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Aug 5, 2024
@mofosyne mofosyne added Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix bugfix fixes an issue or bug labels Aug 5, 2024
ggerganov
ggerganov previously approved these changes Aug 5, 2024
@ggerganov ggerganov requested a review from hipudding August 5, 2024 12:51
@ggerganov ggerganov dismissed their stale review August 5, 2024 12:52

better if hipudding reviews the change

@hipudding hipudding merged commit bc0f887 into ggerganov:master Aug 5, 2024
53 checks passed
@hipudding hipudding added the Ascend NPU issues specific to Ascend NPUs label Aug 5, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 7, 2024
4 participants