predict.kohonen is failing if the matrix is too big #13
possible solution from @dfalster :
Possible solution here: 0ca8092. Needs testing...
Worked for me, @jack-bilby. It took a little over an hour for the file you sent me.
Just noting that the small MSOM object Will has used here won't give very meaningful results, and will likely be faster than using the full object. If it's that slow, this could be an argument to parallelise this step?
Will, is that output with the solution you suggested, or with the original code? I think it would be a good idea to at least have an option to chunk/parallelise the processes, especially for larger datasets.
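A minimal sketch of what "chunk/parallelise" could look like, written in Python purely for illustration. It assumes the prediction is row-wise, that is, each row is mapped to its nearest codebook vector independently of every other row. The names `split_rows`, `nearest_unit`, `predict_chunk` and `predict_chunked` are hypothetical and are not part of the kohonen package or of this repository.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, chunk_size):
    """Yield consecutive slices of `rows`, each at most `chunk_size` long."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def nearest_unit(row, codebook):
    """Hypothetical row-wise predictor: index of the closest codebook vector."""
    def sq_dist(unit):
        return sum((a - b) ** 2 for a, b in zip(row, unit))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def predict_chunk(chunk, codebook):
    """Predict every row of one chunk independently."""
    return [nearest_unit(row, codebook) for row in chunk]


def predict_chunked(rows, codebook, chunk_size=2, workers=2):
    """Split the input into chunks, predict each chunk, and stitch the results."""
    chunks = list(split_rows(rows, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so results line up with the input rows.
        parts = pool.map(lambda c: predict_chunk(c, codebook), chunks)
        return [label for part in parts for label in part]


codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
rows = [[0.1, 0.2], [0.9, 1.1], [4.0, 6.0], [0.4, 0.4], [3.2, 3.1]]

chunked = predict_chunked(rows, codebook)
whole = predict_chunk(rows, codebook)
print(chunked == whole)
```

Threads are used here only to keep the sketch short; a CPU-bound step would more likely need separate processes. Chunking keeps memory bounded regardless, but the results only match the unchunked run if the predictor really is row-wise.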
Working on it.
It looks like self-organizing map predictions are not row-by-row. They seem to use some kind of complex window over neighbouring rows, so when I chunk the input file, I get edge effects at the chunk boundaries.
Not sure what to do about that behavior. @jack-bilby @dfalster?
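The kind of check that exposes edge effects like this can be sketched as follows, in Python for illustration only. It compares predictions on the whole input against predictions stitched together from chunks. Both predictors are toy stand-ins invented for this example (`rowwise` and `windowed` are hypothetical names); nothing here is a claim about how `predict.kohonen` works internally.

```python
def chunked_apply(predict, rows, chunk_size):
    """Run `predict` on consecutive chunks and concatenate the outputs."""
    out = []
    for start in range(0, len(rows), chunk_size):
        out.extend(predict(rows[start:start + chunk_size]))
    return out


def rowwise(rows):
    """Toy predictor: each output depends only on its own row."""
    return [2 * x for x in rows]


def windowed(rows):
    """Toy predictor: each output averages a row with its immediate
    neighbours, with the window clipped at the ends of the input."""
    out = []
    for i in range(len(rows)):
        window = rows[max(0, i - 1):i + 2]
        out.append(sum(window) / len(window))
    return out


def mismatches(predict, rows, chunk_size):
    """Indices where chunked predictions differ from whole-input predictions."""
    full = predict(rows)
    chunked = chunked_apply(predict, rows, chunk_size)
    return [i for i, (a, b) in enumerate(zip(full, chunked)) if a != b]


rows = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
rowwise_diff = mismatches(rowwise, rows, chunk_size=3)
windowed_diff = mismatches(windowed, rows, chunk_size=3)
print(rowwise_diff)   # no differences for the row-wise predictor
print(windowed_diff)  # differences sit on either side of the chunk boundary
```

With a chunk size of 3, the windowed predictor disagrees exactly at rows 2 and 3, the two rows that lose a neighbour when the input is split. Mismatches clustered at chunk boundaries are the signature of a predictor that is not row-wise.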
Darn. Thanks for investigating. In that case, I can see these options
I reckon #1 is the way forward.
Yup. Aside from the computational annoyance, @jack-bilby, I think we can write the methods in a more informed way now.
BTW, nice work implementing the parallelisation and then testing for consistency. That was wise to check. Shame it wasn't so easily parallelisable.
I think this "neighborhood" prediction behaviour might be by design. Surprisingly, it's not like random forest at all; it's more like a CNN.
Interesting for multiple projects, from ChatGPT:
Row-wise Prediction (Independent instances)
These methods predict based on individual rows, treating each instance as a separate feature vector without explicit context from neighboring rows:
Neighborhood Prediction (Uses neighboring or temporal context)
These methods take the surrounding or neighboring data points into account when making predictions, making them better suited for sequential and time series data:
Summary
Neighborhood methods are generally better suited for time series because they naturally capture the temporal structure and dependencies in the data.
from: @jack-bilby :