Errors in iterative use of h2o model prediction #16457

CNickm opened this issue Nov 25, 2024 · 0 comments
CNickm commented Nov 25, 2024

H2O version, Operating System and Environment
3.46.0.6 on Linux (Ubuntu 22.04.4)

Actual behavior

I am posting this to give hints to other people who run into the same problems I did. I worked around them well enough for my current application, but they are quite a handicap for more frequent use.

In the research project I work on, we have to run predictions on many small datasets. Some of the models we use are based on h2o. This causes problems in our iterative process: because of memory limits, the datasets are processed one after another (75000 records each time). Once the process has been running for 2 hours, the R session loses its connection to the Java VM and the whole prediction pipeline crashes, with errors such as taking too long to initialize.

Furthermore, there is a memory leak in the VM, which forces a restart.

One way to bypass these problems is to force a restart at each iteration using h2o.shutdown() and then h2o.init(). Yet this causes another issue if the prediction call occurs immediately afterwards, with a signature like Error in h2o dosafereset(). It seems that h2o.init():

  • does not assess whether the remaining free memory of the VM is sufficient for further use;

  • returns control to the R script too early: it does not check that the API under the hood is up and running. Any call made immediately afterwards therefore fails critically, and the failure cannot be reproduced by running one line at a time, because the initialisation has most likely completed in the short time between launching the two commands.

It is therefore necessary to insert a Sys.sleep(5) just after calling h2o.init().
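A fixed pause is fragile because the right delay depends on the machine. A possible alternative, given here only as a minimal sketch, is to poll until the cluster answers before issuing any prediction call. The helper names wait_until and restart_h2o are hypothetical, and so are the 2-second pause and the 60-second timeout. h2o.clusterIsUp() is an existing function of the h2o R package, but I have not verified that a TRUE result is enough to avoid the error described above.

```r
# Generic polling helper (base R only): call `check` repeatedly until it
# returns TRUE or `timeout` seconds have elapsed. Errors raised by `check`
# are treated as "not ready yet".
wait_until <- function(check, timeout = 60, interval = 1) {
  deadline <- Sys.time() + timeout
  repeat {
    ok <- tryCatch(isTRUE(check()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Hypothetical wrapper: restart the H2O cluster and block until it responds.
# Extra arguments (e.g. max_mem_size) are passed on to h2o.init().
restart_h2o <- function(timeout = 60, ...) {
  # Shut down a running cluster; ignore the error raised when none exists.
  try(h2o::h2o.shutdown(prompt = FALSE), silent = TRUE)
  # Assumption: give the old JVM a moment to exit before starting a new one.
  Sys.sleep(2)
  h2o::h2o.init(...)
  # Poll the cluster instead of sleeping for a fixed time.
  if (!wait_until(function() h2o::h2o.clusterIsUp(), timeout = timeout)) {
    stop("H2O cluster did not become reachable within ", timeout, " seconds")
  }
  invisible(TRUE)
}
```

In the iterative pipeline, restart_h2o() would be called at the start of each iteration, before loading the model and predicting. If the error still appears, keeping a short Sys.sleep() after the poll remains a fallback.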


CNickm added the bug label Nov 25, 2024