CuSe sample input crashes current NOMAD 1.2.2 #95
Comments
Totally forgot to mention: Docker is docker 25.0.3 on Rocky 9.3:
|
Hi @behnle, thanks for reporting. Can you confirm you are talking about the |
I will take a closer look, but I suspect the parsing timed out. However, the error seems inconsistent with a timed-out entry. It could also be that the archive is larger than permitted, which would cause trouble for the archive reader. |
Exactly, that's the one. Let me know if you need additional information for tracking down the problem. |
There are only two parser warnings
but no errors. |
You can modify the settings by specifying them in the nomad.yaml file. You can have a look at the docs here. For a complete list of config keys, I suggest you look at the code under nomad/config/models.py. For example, you can adjust services.api_timeout or celery.timeout. |
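As a rough sketch of what such an override could look like in nomad.yaml, assuming the dotted key names above map to nested YAML sections. The numbers are placeholders chosen for illustration, not recommended values, and the units should be checked against nomad/config/models.py:

```yaml
# nomad.yaml (fragment); values are illustrative placeholders only
services:
  api_timeout: 1200   # assumed to be in seconds
celery:
  timeout: 3600       # assumed to be in seconds
```

The services need to be restarted for the changed configuration to be picked up.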
I had already skimmed the list of config options and had set
Did not change any celery settings, though. |
Can you please send me the image path? I am having trouble finding it. |
Might be related to this: https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/issues/1861 |
Ah yes, I had completely forgotten about this. Thanks, Lauri. |
@ladinesa What do you mean by "image path"? The docker image?
@lauri-codes Yes, might be related to my issue. Would it help to pull a new docker image if available? |
@behnle: Sorry I missed your comment. You can try updating the |
@lauri-codes Thanks for the heads-up. I recently pulled the "latest" image:
With this release, every other attempt on the original sample data succeeds, but some reprocessing runs fail with
and the docker-compose log contains error messages like this one
In the journal of the server, I found the following potentially related error message:
(uid 1000 is the nomad user.) I have no clue what is going on. The server has 16 GiB RAM, so IMHO an OOM event is rather unlikely (but not impossible). This affects the sample files
Edit: NOMAD version is now 1.2.2.dev357+g15b7cd2e1 |
@behnle: I will try to reproduce the problem and see why the parser is struggling with this example. |
I can confirm that at least one particular main file seems to use a very large amount of RAM, ultimately causing the process to be killed. Here is the zip file: int_hse.zip. The problematic file is
We need to check what is causing the memory usage to blow up in the FHI-aims parser for this file. In general, some calculations are very big and need a lot of RAM to be processed, but this does not look like one to me. @ndaelman-hu, @JosePizarro3: Could you investigate this a bit? |
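One generic way to narrow down where memory blows up is to wrap the suspect call in Python's standard tracemalloc module and record the peak. This is a minimal sketch, not NOMAD code: `fake_parse` is a hypothetical stand-in for the real parser entry point, and tracemalloc only sees allocations made through Python's allocator, so memory held by native extensions would not show up.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak traced memory in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)


def fake_parse(n_lines):
    """Hypothetical stand-in for a parser: builds one string per line."""
    return [f"line {i}" for i in range(n_lines)]


records, peak_mib = peak_memory_mib(fake_parse, 100_000)
print(f"parsed {len(records)} records, peak traced memory: {peak_mib:.1f} MiB")
```

Running the same wrapper on a small and a large input shows whether the peak grows in proportion to the file size or much faster.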
It's likely this basis set tier checker. I'll see about slimming it down. |
The issue is
Am investigating further. |
When trying to visualize the CuSe FHI-aims GeometryOptimization simulation sample, NOMAD crashes with a Python error:
In the GUI, this triggers an internal server error:
Unexpected error: "[object Object] (500)". Please try again and let us know, if this error keeps happening.
No cell or workflow graph is shown.
NOMAD version is 1.2.2.dev295+g2e611aff1.