H2O version, Operating System and Environment
3.46.0.6 on Linux (Ubuntu 22.04.4)
Actual behavior
I am posting this to give hints to others who run into the same problems I did. I have worked around them well enough for my current application, but they remain a real handicap for more frequent use.
In the research project I work on, we have to run predictions on many small datasets. Some of the models we use are based on h2o. This causes problems in our iterative process: because of memory limits, the datasets are processed one after another (75000 records each time). After about 2 hours of running, the R session loses its connection to the Java VM and the whole prediction pipeline crashes, with errors such as taking too long to initialize.
Furthermore, there is a memory leak in the VM, which forces a restart.
One way to work around these problems is to force a restart at each iteration using h2o.shutdown() and then h2o.init(). Yet this causes another issue if the prediction call comes immediately afterwards, with an error signature like Error in .h2o.doSafeREST(). It seems that h2o.init():
- cannot assess whether the remaining free memory of the VM is sufficient for further use;
- returns control to the R script too early: it does not check that the API under the hood is up and running. Any call made immediately afterwards therefore leads to critical failures, and these cannot be reproduced by running the script one line at a time, because initialisation is likely to complete in the short time between launching the two commands.
Therefore, a Sys.sleep(5) has to be inserted just after calling h2o.init().
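A fixed 5-second pause may be too short on a loaded machine and longer than needed on an idle one. As a possible alternative, here is a minimal sketch that polls a readiness check until it succeeds or a timeout expires. The helper wait_until_ready and its arguments are hypothetical names, not part of h2o, and I have not verified that this removes the failure described above.

```r
# Poll a readiness check until it succeeds or a timeout expires.
# `is_ready` is any zero-argument function returning TRUE/FALSE; errors
# raised while the service is still starting are treated as "not ready".
# (Hypothetical helper, not part of the h2o package.)
wait_until_ready <- function(is_ready, timeout = 60, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    ok <- tryCatch(isTRUE(is_ready()), error = function(e) FALSE)
    if (ok) return(invisible(TRUE))
    if (Sys.time() >= deadline) {
      stop("service not ready after ", timeout, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Assumed use in the restart loop (requires the h2o package):
#   h2o::h2o.shutdown(prompt = FALSE)
#   h2o::h2o.init()
#   wait_until_ready(function() h2o::h2o.clusterIsUp())
```

h2o.clusterIsUp() only reports whether the cluster answers. If the old VM is still shutting down when h2o.init() is called, a short pause between h2o.shutdown() and h2o.init() may still be needed.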