Output_parameters turn out to be all-zero inexplicably when n_fits is set up to large enough #31
Comments
Please post a sample program which reproduces the error.
Finally got it.
@adrianjp88 Should this happen? I just thought then it would use smaller chunks of fits, so as long as at least the data of one fit fits into the available GPU memory it should run fine, shouldn't it?
@gittry We cannot understand your issue without a complete example code which reproduces the problem.
Discussion at issue gpufit#31 (Output_parameters turn out to be all-zero inexplicably when n_fits is set large enough). In this example, it is bound up with available_gpu_memory_ = std::size_t(double(free_bytes) * 0.1) at line 14 of info.cu.
The example is in pull request #33. At line 14 of info.cu: available_gpu_memory_ = std::size_t(double(free_bytes) * 0.1)
Maybe it's not relevant, but am I wrong in saying that you are using user_info to pass your data to the kernel, and data to pass the time vector (the same for all your measurements)?
I have a question that is somehow related to this issue ... let me know if you prefer me to open a new issue. I succeeded in implementing my compartmental model as per issues #27 and #30, and now I was experimenting with an increasing number of parallel fits. What I discovered is that if I use an n_fits that is greater than my max_chunk_size, when the library tries to allocate GPU memory for the second chunk I get the following error:
That is thrown by void GPUData::init at:
Quite surprisingly, the exact same command, run a couple of lines above to write "data_" to GPU memory, completes just fine. I checked, and this error happens both with my new model and with the original models you implemented.
@mscipio |
That's not going to be easy, because I am working on Linux, so I made some changes to the code to get it to compile, and it's no longer compatible with the version in this repo. If you say that in your version you don't have an issue of this kind, I will try (T.T) to trace back all the differences, hoping to find MY mistake along the way. You don't need example code from me to test it out with your code: just pick one of the examples (like Linear_Regression_Example.cpp) and increase n_fits A LOT, so that the problem doesn't fit your GPU in one chunk. EDIT:
@mscipio It is not clear to me what error you are reporting. In the manuscript we have tested Gpufit with up to 10^8 fits per function call. This is significantly larger than the maximum number of fits that can be processed simultaneously on the GPU. You are making modifications to the core of the Gpufit code, and introducing changes there could easily lead to bugs. Why do you need to make changes to Gpudata::Init?
@superchromix The changes I made (I wasn't trying to modify Gpudata::Init, anyway) were meant to debug my new kernel (a plain C++ implementation was not enough). Now I have cloned your current version of the library again and will go on working on this one. I just checked, and I don't have that issue with Linear_Regression_Example.cpp, so I guess it was something I did that caused the error. I will check my new model in this current branch asap and possibly open a pull request if you are interested in having it.
@gittry |
@superchromix
The experimental data is just unique x coordinate values for each fit, stored as float. Then how can I explain that I get proper results when n_fits or available_gpu_memory_ is set appropriately, but otherwise the output turns out all zero?
@gittry |
@superchromix In my issue, I set the original variable size at
We have updated the GPU memory management in the latest versions of Gpufit to allow for larger user_info sizes. This should address this issue.
To deal with the stack overflow problem, all my parameters in gpufit are initialized like this:
Then I found that n_fits was limited: when n_fits was large enough (10,000,000 in my case), the fitted output_parameters turned out to be all-zero, but it worked as expected when n_fits was set to 1,000,000. It is hard to figure out. By the way, the host (PC) memory is fine.