-
Hi CosimoMV, Thanks for giving OpenParEM3D a try. There may be a problem with the compilation for parallel processing. OpenParEM3D is based on OpenMPI, and you mention OpenMP, which is not the same thing. You have success running in serial (np=1), where MPI is not invoked, but trouble with parallel execution (np>1). Perhaps in compiling, you are linking to OpenMP instead of OpenMPI, although I don't know if that is possible. Have you tried running the pre-compiled binary? It is statically linked, so OpenMPI vs. OpenMP issues would not occur. If you can run the pre-compiled binary in parallel, then that would potentially point to linking to OpenMP when compiling.
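One way to see what a locally compiled executable actually links against is to inspect its shared-library dependencies. The sketch below is a hypothetical helper, not part of OpenParEM3D. It sorts `ldd` output into MPI and OpenMP runtime libraries using their usual library names (`libmpi*` for MPI; `libgomp`, `libomp`, and `libiomp5` for the GNU, LLVM, and Intel OpenMP runtimes). The function names and the marker list are assumptions made for illustration, and a custom build may name its libraries differently.

```python
import subprocess

# Name prefixes that usually identify each runtime in `ldd` output.
# Assumption: standard library names; a custom build may differ.
RUNTIME_MARKERS = {
    "MPI": ("libmpi",),
    "OpenMP": ("libgomp.so", "libomp.so", "libiomp5.so"),
}


def classify_runtime_libs(ldd_output):
    """Group the library names found in `ldd` output by parallel runtime."""
    found = {name: [] for name in RUNTIME_MARKERS}
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        lib = fields[0]  # the first field of an ldd line is the library name
        for runtime, markers in RUNTIME_MARKERS.items():
            if any(lib.startswith(marker) for marker in markers):
                found[runtime].append(lib)
    return found


def linked_runtimes(executable):
    """Run `ldd` on an executable (hypothetical path) and classify the result."""
    result = subprocess.run(["ldd", executable], capture_output=True, text=True)
    return classify_runtime_libs(result.stdout)
```

Calling `linked_runtimes()` on the path of a locally built executable lists what was linked in. A statically linked binary, such as the pre-compiled one, reports no entries at all. Seeing both an MPI and an OpenMP library is not by itself an error; the point is only to confirm that an MPI library is present.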
-
I installed Ubuntu 24.04.1 LTS in VirtualBox, and the pre-compiled binaries from 20.04.06 LTS did not run. I'll dig into it and provide an update when I have things running in 24.04.1 LTS.
-
I get a clean compile in Ubuntu 24.04.1 LTS if I include one more library in section 5.1 of the installation guide (i.e. before compiling hypre):
After that, I see the same behavior: a successful run in serial and a failure in libc.so.6 in parallel. Working on that now.
-
The pre-compiled binaries run in Ubuntu 22.04.5 LTS after adding a library [installation guide to be updated]:
Ubuntu 24.04.1 LTS is having a problem with a PETSc call to solve the linear system. The problem persists even after updating PETSc and SLEPc to their latest versions.
-
Wow, thank you so much for your help! On Ubuntu 24.04 I have already installed libopenmpi-dev. I also tried reinstalling all the packages linked to openMPI, but nothing changed. The thing about Is there any test I can do to help you debug OpenParEM3D on Ubuntu 24.04? This is the error I get during compilation, if I don't set
-
I'm thinking that the -fopenmp flag is ok. Your error messages are essentially referencing Intel processors, so I think you have Intel-optimized libraries installed and would require some additional information to the compiler to properly utilize these. Another reason why I'm thinking that the -fopenmp flag is ok is that I'm seeing the same behavior that you are seeing: proper execution in serial and the same error in parallel. Also note that OpenParEM2D runs properly in parallel for the failing case before it gets to the error, so parallel processing is fundamentally working even for the failing case.
To summarize on the error, it does not occur for Ubuntu 20.04.06 LTS or Ubuntu 22.04.05 LTS. We both see the error in Ubuntu 24.04.1 LTS. The error occurs at a PETSc call to KSPSolve. I have tried updating PETSc and SLEPc to the latest versions and recompiling, but that does not resolve the error.
The next step is to investigate the latest or an earlier version of libc, which is the library reporting the error. After that, I'll try Valgrind to look for a memory problem in OpenParEM. [Note that Valgrind has been extensively used on OpenParEM in Ubuntu 20.04.06 LTS, but maybe something pops up in Ubuntu 24.04.1 LTS?] If you want to try any of these ideas or something else, that would be great. I'll keep working on my side because it would be great to be able to support Ubuntu 24.
-
This is looking like a complex issue between PETSc and Ubuntu 24.04.1 LTS. The errors with libc.so.6 do not appear if the pre-conditioner to the KSPSolve is changed from PCCHOLESKY to PCJACOBI or PCNONE, a simple parameter change. [However, convergence is not achieved with these options.] From the perspective of OpenParEM3D, successfully running (if not converging) with one of these changes somewhat rules out a fix based in OpenParEM code.
Running Valgrind did not turn up anything new, and compiling and installing the latest version of OpenMPI did not affect the error. I had already ensured that Ubuntu 24.04.1 LTS is fully up to date and that PETSc is at the latest version. libc is very fundamental to Linux, so I decided not to mess with libc.
Given that Ubuntu 24.04 LTS just came out in late April, perhaps it and the infrastructure are not yet quite ready? I will fully check out Ubuntu 22 by verifying it with the regression suite and updating the documentation. Users will then have the option of Ubuntu 20 and 22, and I think Ubuntu 24 will have to wait for another day.
-
Ubuntu 22.04.05 LTS passed the regression suites, so it is good for use with OpenParEM. I will update documentation and release that, then I will re-visit Ubuntu 24.
-
I tried compiling glibc 2.40 and got some errors, so I tuned the compiler options to remove them (I just googled the error) and tried recompiling OpenParEM with these flags (on glibc 2.39), but nothing changed. I suspected a change in glibc between Ubuntu 22.04 and Ubuntu 24.04, and I found that as of glibc 2.34 libpthread is no longer a separate library... but Ubuntu 22.04 has glibc 2.35, so this can't be the problem.
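The version reasoning above can be checked mechanically. This is a minimal sketch with hypothetical helper names: it parses a glibc version string and compares it against 2.34, the release that folded libpthread into libc. `running_glibc_version()` assumes a Linux system with glibc, where `os.confstr` exposes `CS_GNU_LIBC_VERSION`.

```python
import os


def parse_glibc_version(text):
    """Turn a string such as 'glibc 2.39' into a (major, minor) tuple."""
    version = text.split()[-1]
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def libpthread_is_merged(version):
    """glibc 2.34 and later ship the pthread functions inside libc itself."""
    return version >= (2, 34)


def running_glibc_version():
    """Ask the C library which version is loaded (Linux with glibc only)."""
    return parse_glibc_version(os.confstr("CS_GNU_LIBC_VERSION"))
```

Both versions mentioned in this thread, 2.35 and 2.39, fall on the same side of the 2.34 boundary, which matches the conclusion that the libpthread change cannot explain a difference between the two Ubuntu releases.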
-
Thanks for giving that a shot. My current plan is to turn on debugging in PETSc, then trace back the failing point to the specific PETSc code. That effort may suggest a fix or enable a detailed bug report to the PETSc or Ubuntu teams. I don't think this is going to be a quick fix.
-
I found the problem, so it didn't take as long as I feared. A change in 6 lines of code in OpenParEM3D allows Ubuntu 24 to work. I need to audit the code to see if similar changes are required elsewhere, then re-run the regression suites across Ubuntu 20, 22, and 24, then re-release. This will take a few days due to run times on the regression suites. If you do not want to wait, I can tell you the 6 lines to change so that you can be running while I do the work on a formal release.
-
I pushed the changes to OpenParEM3D last night, so you can pull main to get the latest. I have not created a release for this version yet. In the OpenParEM3D regression directory, check the README.txt file. Step 4 shows how to run the entire regression suite. The script assumes that you have 12+ cores on your computer, so if you do not, you will want to reduce the number of cores in the file regression_case_list.txt to avoid overloading the computer and slowing things down. There is a simple script called check.sh that checks the results for problems, and you can run this at any time during the run to see how things are going. No output is good. Or, you can manually check regression.log and regression_results.csv.
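Before launching the suite, it can help to confirm how many cores the machine (or virtual machine) actually exposes. The sketch below is a hypothetical helper, not part of the regression scripts: it caps a requested core count at what is available. It does not edit regression_case_list.txt, since that file's format is not described in this thread.

```python
import os

# Core count the regression script assumes, per the post above.
ASSUMED_CORES = 12


def cores_to_use(requested, available=None):
    """Cap a requested core count at what this machine actually has."""
    if available is None:
        # os.cpu_count() can return None, so fall back to a single core.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


if __name__ == "__main__":
    print("cores to request:", cores_to_use(ASSUMED_CORES))
```

Inside VirtualBox the reported count is the number of cores assigned to the guest, not the number on the host.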
-
Yes, that is the same error we were both seeing before. After I made the change to fem3D.cpp, the problem was resolved. I'm thinking that you somehow did not pick up the change. In fem3D.cpp, you should see lines 455-458 look like this:
-
Well, that is frustrating. Here is a fully detailed experiment I just did using pre-compiled binaries, and it works under Ubuntu 24.04.1 LTS running in VirtualBox. By trying the binaries, all compilation variables are eliminated.
Remove existing executables:
Download precompiled binaries:
Check:
Run:
If this works, then the problem is somewhere in the compilation process. If it doesn't work, then there is a difference in our Ubuntu 24 installations or VirtualBox is somehow papering over a problem.
-
Hello! I'm trying to install OpenParEM3D (a very interesting program, and thank you @M8kmyday for making it open-source!), but I'm having problems.
I'm on Ubuntu 24.04, and I have compiled all the necessary packages following the guide in the PDF. But when I try to install OpenParEM2D and OpenParEM3D I get compilation errors, which I can fix by using the '-fopenmp' option in the compiler.
When I test whether everything works, using the command "process3D.sh straight.proj 2", I get the reported error (I have correctly set export OMP_NUM_THREADS=1 in my bashrc file, and this is not a RAM problem), while with a single process (process3D.sh straight.proj 1) I can finish the simulation without errors. What am I doing wrong?
Thank you very much for your help.