Distributed dense sparse matrix multiplication in NTPoly #173
Comments
Thank you for your question. When NTPoly detects that a block of the matrix is dense, it will automatically switch to a dense multiplication routine. But I am not sure how the performance compares to other libraries in this case. I would like to prepare an example driver routine for you that shows off this feature. Do you also need help converting from BLACS-format matrices to NTPoly? Also, are your matrices square for these calculations?
Thank you so much for offering the help. Yes, an example would be really helpful.
@yaoyi92 are you asking for a dense * sparse -> dense operation? That would also be useful for things like the Sternheimer equation.
@bhourahine, yes, that's exactly the operation I am looking for. My primary goal is to reduce the memory cost of my sparse rotation matrices. The MM timing is also not negligible based on my test. I hope the sparsity can also be used for some speedup if possible.
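For readers following along, here is a minimal sketch of the dense * sparse -> dense operation being discussed. It is not NTPoly code and does not use NTPoly's data structures: the function name `dense_times_csr` and the choice of CSR storage (`indptr`, `indices`, `data`) are assumptions made for illustration only. It shows why the sparse factor needs memory only for its nonzeros while the result stays dense.

```python
def dense_times_csr(dense, indptr, indices, data, n_cols):
    """Return dense @ S, where S is stored in CSR form with n_cols columns.

    Hypothetical helper for illustration; not part of NTPoly.
    """
    n_rows = len(dense)
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for p in range(len(indptr) - 1):                  # row p of S
        for idx in range(indptr[p], indptr[p + 1]):   # stored entries of row p
            j, value = indices[idx], data[idx]
            for i in range(n_rows):
                # Only stored (nonzero) entries of S contribute to the product.
                result[i][j] += dense[i][p] * value
    return result


# S = [[1,  0, 0],
#      [0,  0, -1],
#      [0,  1, 0]]   stored with 3 values instead of 9
indptr = [0, 1, 2, 3]
indices = [0, 2, 1]
data = [1.0, -1.0, 1.0]

dense = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]

product = dense_times_csr(dense, indptr, indices, data, 3)
```

The work scales with the number of stored entries of the sparse matrix times the number of rows of the dense one, which is where a speedup over a fully dense multiplication could come from.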
Thank you for the clarification. Right now, when NTPoly multiplies two blocks of the matrix, it measures their sparsity. If both blocks appear quite dense, it switches to a BLAS call. Otherwise, it uses the sparse matrix multiplication routine I wrote. Unfortunately, that means there is no special optimization for sparse * dense operations. See line 58 of

Nonetheless, NTPoly's performance might be good enough for you even if the sparse * sparse routine is employed, and it will definitely help you with the memory costs. I've attached a small driver program here, which I hope shows how you can call the multiplication routines.

In the future, it wouldn't be too hard to add a specialized sparse-dense routine at the block level. The key thing will be to develop good benchmarks, since sparse-dense multiplication is usually done in a matrix-free way.
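The block-level dispatch described in this comment can be sketched roughly as follows. This is only an illustration of the idea, not NTPoly's implementation: the names `density`, `multiply_dense`, `multiply_sparse`, and `multiply_blocks` are hypothetical, the pure-Python dense routine merely stands in for a BLAS call, and the cutoff `DENSITY_THRESHOLD = 0.5` is an assumed value (the thread does not state the threshold NTPoly uses).

```python
DENSITY_THRESHOLD = 0.5  # assumed cutoff for illustration, not NTPoly's value


def density(block):
    """Fraction of entries in a block that are nonzero."""
    total = len(block) * len(block[0])
    nonzeros = sum(1 for row in block for x in row if x != 0.0)
    return nonzeros / total


def multiply_dense(a, b):
    """Plain triple loop; stands in for a BLAS matrix-matrix call."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def multiply_sparse(a, b):
    """Same product, but skipping zero entries of both operands."""
    result = [[0.0] * len(b[0]) for _ in a]
    for i, row in enumerate(a):
        for k, a_ik in enumerate(row):
            if a_ik == 0.0:
                continue
            for j, b_kj in enumerate(b[k]):
                if b_kj != 0.0:
                    result[i][j] += a_ik * b_kj
    return result


def multiply_blocks(a, b):
    """Use the dense path only when BOTH blocks look dense."""
    if density(a) > DENSITY_THRESHOLD and density(b) > DENSITY_THRESHOLD:
        return "dense", multiply_dense(a, b)
    return "sparse", multiply_sparse(a, b)


dense_block = [[1.0, 2.0], [3.0, 4.0]]
sparse_block = [[1.0, 0.0], [0.0, 0.0]]

path_dd, result_dd = multiply_blocks(dense_block, dense_block)
path_sd, result_sd = multiply_blocks(sparse_block, dense_block)
```

In this sketch, a sparse block times a dense block falls through to the sparse path, which mirrors the point above that there is no dedicated sparse * dense kernel.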
Thank you so much!
I am wondering whether it is possible to perform dense-sparse matrix multiplication with NTPoly subroutines. In FHI-aims, we need to multiply rotation matrices (sparse) with some other matrices (dense; specifically, the screened Coulomb matrices represented in the auxiliary basis). The dense matrices are distributed in BLACS format. Do you have any suggestions, or could you point me to a subroutine in NTPoly I should look into? Thank you very much in advance.