FC=caf build fails #333

Open
jphaupt opened this issue Nov 20, 2021 · 10 comments
Labels
help wanted Extra attention is needed

Comments


jphaupt commented Nov 20, 2021

Hi,

I understand that pFUnit does not explicitly support coarrays, but from a previous issue it looks like it should be able to handle them at least to some extent. I have OpenCoarrays installed, and my source builds fine without pFUnit (using FC=caf cmake ..). With pFUnit, if I remove all coarrays from my code and simply use cmake .., it also builds fine. However, if I build with FC=caf cmake .., the build returns the error:

f951: Fatal Error: Reading module ‘[...]/build/test/mod/my_tests/loader.mod’ at line 269 column 12: Expected right parenthesis

Of course, I want to use coarrays, and without FC=caf the coarray code fails to compile. Is there a way around this, or is pFUnit not the right tool for CAF?

tclune added the help wanted label Nov 21, 2021
Member

tclune commented Nov 21, 2021

Hmm. The main obstacle with using pFUnit on coarray code is that you must ensure that such tests span all of MPI comm world. So if you are running on 8 processes, all the tests must also use 8 processes, and hence 8 CAF images. F2018 teams could handle this limitation better, but that work has not been done.

But the error above looks more like "loader.mod" was written by a different compiler, so the environment is somehow inconsistent. Could you paste the output of your CMake command here? It will be interesting to see what it reports.

I recommend starting with some very simple unit test that does something like printing this_image() or similar. That way we can focus on issues with the build mechanism.
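A quick way to check the "different compiler" hypothesis is to read the header line that gfortran writes at the top of every .mod file. The sketch below assumes gfortran's gzip-compressed module format (used since GCC 4.9) with a plain-text fallback; the function name mod_header and the paths in the usage comment are placeholders. Note that the header records the module format version and the source file, not the compile flags, so matching headers do not rule out a flag mismatch such as -fcoarray=lib.

```shell
#!/bin/sh
# Print the first line of a gfortran module file, e.g.
#   GFORTRAN module version '15' created from foo.f90
# Newer gfortran releases gzip-compress .mod files; older ones wrote plain
# text, so fall back to reading the file directly when it is not gzip data.
mod_header() {
    if gzip -t "$1" 2>/dev/null; then
        gzip -dc "$1" | head -n 1
    else
        head -n 1 "$1"
    fi
}

# Usage (paths are placeholders):
#   mod_header build/test/mod/my_tests/loader.mod
#   for f in /path/to/pFUnit/build/src/funit/mod/*.mod; do mod_header "$f"; done
```

If the version numbers differ between the project's modules and pFUnit's, the two were produced by different gfortran releases.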

Author

jphaupt commented Nov 22, 2021

Actually, the unit test that I am trying is the following:

module test_tile_indices_mod
    use funit
    implicit none
 
 contains
 
    @test
    subroutine test_assert_true_and_false()
       @assertTrue(1 == 1)
       @assertFalse(1 == 2)
    end subroutine test_assert_true_and_false
    
 end module test_tile_indices_mod

which is trivial and doesn't use any coarray features, but nevertheless fails to build. To be clear, it's not cmake that fails, but the subsequent make. I think the problem is that pFUnit doesn't like a flag that the caf wrapper introduces, because the build fails with the same error even if there is no coarray anywhere in the code (test or src).

Here is the output of FC=caf cmake ..

-- The Fortran compiler identification is GNU 11.1.0
cc1: warning: command-line option ‘-fcoarray=lib’ is valid for Fortran but not for C
gfortran: warning: /usr/local/lib/libcaf_mpi.a: linker input file unused because linking not done
gfortran: warning: /usr/lib/openmpi/libmpi_usempif08.so: linker input file unused because linking not done
gfortran: warning: /usr/lib/openmpi/libmpi_usempi_ignore_tkr.so: linker input file unused because linking not done
gfortran: warning: /usr/lib/openmpi/libmpi_mpifh.so: linker input file unused because linking not done
gfortran: warning: /usr/lib/openmpi/libmpi.so: linker input file unused because linking not done
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - failed
-- Check for working Fortran compiler: /usr/local/bin/caf
-- Check for working Fortran compiler: /usr/local/bin/caf - works
-- Checking whether /usr/local/bin/caf supports Fortran 90
-- Checking whether /usr/local/bin/caf supports Fortran 90 - yes
-- Found Python: /usr/bin/python3.9 (found version "3.9.7") found components: Interpreter 
-- Found OpenMP_Fortran: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jph/modern-fortran/misc/build

This is the subsequent make output:

Scanning dependencies of target sut
[ 12%] Building Fortran object src/CMakeFiles/sut.dir/tile_indices_mod.f90.o
[ 25%] Linking Fortran static library libsut.a
[ 25%] Built target sut
Scanning dependencies of target main
[ 37%] Building Fortran object src/CMakeFiles/main.dir/main.f90.o
[ 50%] Linking Fortran executable main
[ 50%] Built target main
[ 62%] Generating test_tile_indices_mod.F90
Processing file ../../test/test_tile_indices_mod.pf
 ... Done.  Results in test_tile_indices_mod.F90
Scanning dependencies of target my_tests
[ 75%] Building Fortran object test/CMakeFiles/my_tests.dir/my_tests_driver.F90.o
f951: Fatal Error: Reading module ‘/home/jph/modern-fortran/misc/build/test/mod/my_tests/loader.mod’ at line 269 column 12: Expected right parenthesis
compilation terminated.
Error: comand:
   `/usr/bin/gfortran -I/usr/local/include/OpenCoarrays-2.8.0-27-gdd89ca1_GNU-11.1.0 -fcoarray=lib -pthread -D_TEST_SUITES="/home/jph/modern-fortran/misc/build/test/my_tests.inc" -I/home/jph/modern-fortran/misc/build/test/mod/my_tests -I/home/jph/modern-fortran/misc/test -I/home/jph/modern-fortran/misc/build/src -I/home/jph/fpkg/pFUnit/build/src/funit/mod -I/home/jph/fpkg/pFUnit/include -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/extern/gFTL/include/v1 -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/src/v1/mod -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/mod -Jmod/my_tests -fopenmp -c /home/jph/modern-fortran/misc/build/test/my_tests_driver.F90 -o CMakeFiles/my_tests.dir/my_tests_driver.F90.o`
failed to compile.
make[2]: *** [test/CMakeFiles/my_tests.dir/build.make:92: test/CMakeFiles/my_tests.dir/my_tests_driver.F90.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:170: test/CMakeFiles/my_tests.dir/all] Error 2
make: *** [Makefile:101: all] Error 2

Member

tclune commented Nov 22, 2021

Can you redo the make step with VERBOSE=1? And does the build succeed if you have FC=gfortran instead?

I still cannot envision any cause other than loader.mod somehow being built with a different compiler, or at least with seriously different flags.
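To see which executables actually compiled each file, the verbose log can be saved and filtered. This is only a sketch: compilers_in_log is a hypothetical helper, build.log is a placeholder file name, and the pattern only looks for the three tools named in this thread (caf, gfortran, mpifort).

```shell
#!/bin/sh
# List each distinct compiler path that appears in a verbose build log.
# A path counts when it ends in caf, gfortran, or mpifort and is followed
# by whitespace or the end of the line.
compilers_in_log() {
    grep -Eo '/[^[:space:]]*/(caf|gfortran|mpifort)([[:space:]]|$)' "$1" \
        | sed 's/[[:space:]]*$//' \
        | LC_ALL=C sort -u
}

# Usage:
#   make VERBOSE=1 > build.log 2>&1
#   compilers_in_log build.log
```

More than one path in the output means the build went through more than one compiler entry point, for example a wrapper and the compiler it delegates to.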

Author

jphaupt commented Nov 22, 2021

Sure. Here is the output of make VERBOSE=1

/usr/bin/cmake -S/home/jph/modern-fortran/misc -B/home/jph/modern-fortran/misc/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /home/jph/modern-fortran/misc/build/CMakeFiles /home/jph/modern-fortran/misc/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/sut.dir/build.make src/CMakeFiles/sut.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/src /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/src /home/jph/modern-fortran/misc/build/src/CMakeFiles/sut.dir/DependInfo.cmake --color=
Dependee "/home/jph/modern-fortran/misc/build/src/CMakeFiles/sut.dir/DependInfo.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/src/CMakeFiles/sut.dir/depend.internal".
Dependee "/home/jph/modern-fortran/misc/build/src/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/src/CMakeFiles/sut.dir/depend.internal".
Scanning dependencies of target sut
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/sut.dir/build.make src/CMakeFiles/sut.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
[ 12%] Building Fortran object src/CMakeFiles/sut.dir/tile_indices_mod.f90.o
cd /home/jph/modern-fortran/misc/build/src && /usr/local/bin/caf  -I/home/jph/modern-fortran/misc/build/src -J. -c /home/jph/modern-fortran/misc/src/tile_indices_mod.f90 -o CMakeFiles/sut.dir/tile_indices_mod.f90.o
/usr/bin/cmake -E cmake_copy_f90_mod src/tile_indices_mod.mod src/CMakeFiles/sut.dir/tile_indices_mod.mod.stamp GNU
/usr/bin/cmake -E touch src/CMakeFiles/sut.dir/tile_indices_mod.f90.o.provides.build
[ 25%] Linking Fortran static library libsut.a
cd /home/jph/modern-fortran/misc/build/src && /usr/bin/cmake -P CMakeFiles/sut.dir/cmake_clean_target.cmake
cd /home/jph/modern-fortran/misc/build/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/sut.dir/link.txt --verbose=1
/usr/bin/ar qc libsut.a CMakeFiles/sut.dir/tile_indices_mod.f90.o
/usr/bin/ranlib libsut.a
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
[ 25%] Built target sut
make  -f src/CMakeFiles/main.dir/build.make src/CMakeFiles/main.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/src /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/src /home/jph/modern-fortran/misc/build/src/CMakeFiles/main.dir/DependInfo.cmake --color=
Dependee "/home/jph/modern-fortran/misc/build/src/CMakeFiles/main.dir/DependInfo.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/src/CMakeFiles/main.dir/depend.internal".
Dependee "/home/jph/modern-fortran/misc/build/src/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/src/CMakeFiles/main.dir/depend.internal".
Scanning dependencies of target main
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/main.dir/build.make src/CMakeFiles/main.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
[ 37%] Building Fortran object src/CMakeFiles/main.dir/main.f90.o
cd /home/jph/modern-fortran/misc/build/src && /usr/local/bin/caf  -I/home/jph/modern-fortran/misc/build/src  -c /home/jph/modern-fortran/misc/src/main.f90 -o CMakeFiles/main.dir/main.f90.o
[ 50%] Linking Fortran executable main
cd /home/jph/modern-fortran/misc/build/src && /usr/bin/cmake -E cmake_link_script CMakeFiles/main.dir/link.txt --verbose=1
/usr/local/bin/caf CMakeFiles/main.dir/main.f90.o -o main  libsut.a 
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
[ 50%] Built target main
make  -f test/CMakeFiles/my_tests.dir/build.make test/CMakeFiles/my_tests.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
[ 62%] Generating test_tile_indices_mod.F90
cd /home/jph/modern-fortran/misc/build/test && /usr/bin/python3.9 /home/jph/fpkg/pFUnit/build/../bin/funitproc /home/jph/modern-fortran/misc/test/test_tile_indices_mod.pf /home/jph/modern-fortran/misc/build/test//test_tile_indices_mod.F90
Processing file ../../test/test_tile_indices_mod.pf
 ... Done.  Results in test_tile_indices_mod.F90
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/test /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/test /home/jph/modern-fortran/misc/build/test/CMakeFiles/my_tests.dir/DependInfo.cmake --color=
Dependee "/home/jph/modern-fortran/misc/build/test/CMakeFiles/my_tests.dir/DependInfo.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/test/CMakeFiles/my_tests.dir/depend.internal".
Dependee "/home/jph/modern-fortran/misc/build/test/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/home/jph/modern-fortran/misc/build/test/CMakeFiles/my_tests.dir/depend.internal".
Scanning dependencies of target my_tests
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f test/CMakeFiles/my_tests.dir/build.make test/CMakeFiles/my_tests.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
[ 75%] Building Fortran object test/CMakeFiles/my_tests.dir/my_tests_driver.F90.o
cd /home/jph/modern-fortran/misc/build/test && /usr/local/bin/caf -D_TEST_SUITES=\"/home/jph/modern-fortran/misc/build/test/my_tests.inc\" -I/home/jph/modern-fortran/misc/build/test/mod/my_tests -I/home/jph/modern-fortran/misc/test -I/home/jph/modern-fortran/misc/build/src -I/home/jph/fpkg/pFUnit/build/src/funit/mod -I/home/jph/fpkg/pFUnit/include -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/extern/gFTL/include/v1 -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/src/v1/mod -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/mod -Jmod/my_tests -fopenmp -c /home/jph/modern-fortran/misc/build/test/my_tests_driver.F90 -o CMakeFiles/my_tests.dir/my_tests_driver.F90.o
f951: Fatal Error: Reading module ‘/home/jph/modern-fortran/misc/build/test/mod/my_tests/loader.mod’ at line 269 column 12: Expected right parenthesis
compilation terminated.
Error: comand:
   `/usr/bin/gfortran -I/usr/local/include/OpenCoarrays-2.8.0-27-gdd89ca1_GNU-11.1.0 -fcoarray=lib -pthread -D_TEST_SUITES="/home/jph/modern-fortran/misc/build/test/my_tests.inc" -I/home/jph/modern-fortran/misc/build/test/mod/my_tests -I/home/jph/modern-fortran/misc/test -I/home/jph/modern-fortran/misc/build/src -I/home/jph/fpkg/pFUnit/build/src/funit/mod -I/home/jph/fpkg/pFUnit/include -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/extern/gFTL/include/v1 -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/src/v1/mod -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/mod -Jmod/my_tests -fopenmp -c /home/jph/modern-fortran/misc/build/test/my_tests_driver.F90 -o CMakeFiles/my_tests.dir/my_tests_driver.F90.o`
failed to compile.
make[2]: *** [test/CMakeFiles/my_tests.dir/build.make:92: test/CMakeFiles/my_tests.dir/my_tests_driver.F90.o] Error 1
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make[1]: *** [CMakeFiles/Makefile2:170: test/CMakeFiles/my_tests.dir/all] Error 2
make[1]: Leaving directory '/home/jph/modern-fortran/misc/build'
make: *** [Makefile:101: all] Error 2

I don't think I have any other compilers, just gfortran and some wrappers (mpifort, caf).

FC=gfortran cmake .. followed by make succeeds, as I said. But if I include any coarray features (even only in the source, not in the test files), the build fails with Fatal Error: Coarrays disabled at (1), use ‘-fcoarray=’ to enable. E.g. if I change the test to

    @test
    subroutine test_assert_true_and_false()
       @assertTrue(1 == 1)
       @assertFalse(1 == 2)
       print*, num_images()
    end subroutine test_assert_true_and_false

then FC=gfortran cmake .. still passes, but make VERBOSE=1 gives

/usr/bin/cmake -S/home/jph/modern-fortran/misc -B/home/jph/modern-fortran/misc/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /home/jph/modern-fortran/misc/build/CMakeFiles /home/jph/modern-fortran/misc/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/sut.dir/build.make src/CMakeFiles/sut.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/src /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/src /home/jph/modern-fortran/misc/build/src/CMakeFiles/sut.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/sut.dir/build.make src/CMakeFiles/sut.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
make[2]: Nothing to be done for 'src/CMakeFiles/sut.dir/build'.
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
[ 25%] Built target sut
make  -f src/CMakeFiles/main.dir/build.make src/CMakeFiles/main.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/src /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/src /home/jph/modern-fortran/misc/build/src/CMakeFiles/main.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f src/CMakeFiles/main.dir/build.make src/CMakeFiles/main.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
make[2]: Nothing to be done for 'src/CMakeFiles/main.dir/build'.
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
[ 50%] Built target main
make  -f test/CMakeFiles/my_tests.dir/build.make test/CMakeFiles/my_tests.dir/depend
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
cd /home/jph/modern-fortran/misc/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jph/modern-fortran/misc /home/jph/modern-fortran/misc/test /home/jph/modern-fortran/misc/build /home/jph/modern-fortran/misc/build/test /home/jph/modern-fortran/misc/build/test/CMakeFiles/my_tests.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make  -f test/CMakeFiles/my_tests.dir/build.make test/CMakeFiles/my_tests.dir/build
make[2]: Entering directory '/home/jph/modern-fortran/misc/build'
[ 62%] Building Fortran object test/CMakeFiles/my_tests.dir/test_tile_indices_mod.F90.o
cd /home/jph/modern-fortran/misc/build/test && /usr/bin/gfortran -D_TEST_SUITES=\"/home/jph/modern-fortran/misc/build/test/my_tests.inc\" -I/home/jph/modern-fortran/misc/build/test/mod/my_tests -I/home/jph/modern-fortran/misc/test -I/home/jph/modern-fortran/misc/build/src -I/home/jph/fpkg/pFUnit/build/src/funit/mod -I/home/jph/fpkg/pFUnit/include -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/extern/gFTL/include/v1 -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/extern/gFTL-shared/src/v1/mod -I/home/jph/fpkg/pFUnit/build/extern/fArgParse/mod -Jmod/my_tests -fopenmp -c /home/jph/modern-fortran/misc/build/test/test_tile_indices_mod.F90 -o CMakeFiles/my_tests.dir/test_tile_indices_mod.F90.o
/home/jph/modern-fortran/misc/test/test_tile_indices_mod.pf:16:33:

   16 |  end module test_tile_indices_mod
      |                                 1
Fatal Error: Coarrays disabled at (1), use ‘-fcoarray=’ to enable
compilation terminated.
make[2]: *** [test/CMakeFiles/my_tests.dir/build.make:79: test/CMakeFiles/my_tests.dir/test_tile_indices_mod.F90.o] Error 1
make[2]: Leaving directory '/home/jph/modern-fortran/misc/build'
make[1]: *** [CMakeFiles/Makefile2:170: test/CMakeFiles/my_tests.dir/all] Error 2
make[1]: Leaving directory '/home/jph/modern-fortran/misc/build'
make: *** [Makefile:101: all] Error 2

Member

tclune commented Nov 22, 2021

OK - the pure gfortran case got past the earlier problem, which is good. My best guess is that gfortran somehow changes the format of .mod files when compiling with the coarray flags. Can you try building pFUnit itself with the same extra flags?
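A minimal sketch of that experiment, assuming a CMake new enough to accept -S and -B (the logs above already use them): configure pFUnit with the caf wrapper in its own build directory so that the existing plain-gfortran build is left untouched. The helper names, the build-caf directory name, and the DRY_RUN switch are illustrative choices, not part of pFUnit.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure and build pFUnit with FC=caf in <source>/build-caf, so its .mod
# files are produced with the same wrapper (and flags) as the project.
build_pfunit_with_caf() {
    src=$1    # path to the pFUnit source checkout (placeholder)
    run env FC=caf cmake -S "$src" -B "$src/build-caf" &&
    run cmake --build "$src/build-caf"
}

# Usage:
#   DRY_RUN=1 build_pfunit_with_caf /path/to/pFUnit   # preview the commands
#   build_pfunit_with_caf /path/to/pFUnit             # actually build
```

Keeping the two builds in separate directories makes it easy to point a project at either one and compare.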

Author

jphaupt commented Nov 22, 2021

Looks like rebuilding pFUnit with FC=caf cmake .. fixes my problem. However, it doesn't seem to play nicely with MPI any more (I am not sure how to write a unit test using CAF, and normal MPI breaks). For example, the MPI unit tests from the demos previously ran fine, but now they fail (with or without FC=caf). They build fine, but the tests fail with an error related to MPI_Init.

Here is LastTest.log in case it's helpful:

Start testing: Nov 22 16:57 CET
----------------------------------------------------------
1/1 Testing: mpi_tests
1/1 Test: mpi_tests
Command: "/usr/bin/mpirun" "-np" "4" "/home/jph/fpkg/pFUnit_demos/MPI/build/tests/mpi_tests"
Directory: /home/jph/fpkg/pFUnit_demos/MPI/build/tests
"mpi_tests" start time: Nov 22 16:57 CET
Output:
----------------------------------------------------------
--------------------------------------------------------------------------
Open MPI has detected that this process has attempted to initialize
MPI (via MPI_INIT or MPI_INIT_THREAD) more than once.  This is
erroneous.
--------------------------------------------------------------------------
[LittleTheorem:373694] *** An error occurred in MPI_Init
[LittleTheorem:373694] *** reported by process [2213740545,3]
[LittleTheorem:373694] *** on a NULL communicator
[LittleTheorem:373694] *** Unknown error
[LittleTheorem:373694] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[LittleTheorem:373694] ***    and potentially your MPI job)
[LittleTheorem:373687] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init: invoked multiple times
[LittleTheorem:373687] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[LittleTheorem:373687] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
<end of output>
Test time =   0.11 sec
----------------------------------------------------------
Test Failed.
"mpi_tests" end time: Nov 22 16:57 CET
"mpi_tests" time elapsed: 00:00:00
----------------------------------------------------------

End testing: Nov 22 16:57 CET

Should I expect that building pFUnit with caf breaks MPI (I suspect it shouldn't)? If so, how would I then test CAF code?

Member

tclune commented Nov 22, 2021

OK - so first, you will want to have use pfunit instead of use funit in any parallel tests. And then each test procedure needs to have a "this" or "self" object. I'll point to an example in a moment. Then you need to specify the number of processes to use in the @test line. And finally, you need to launch the same number of processes as the test requires. (This is where CAF teams will improve the situation.)

An example test is here: https://github.com/Goddard-Fortran-Ecosystem/pFUnit_demos/blob/df02c379be766d31c0106dfe37f082555b799230/MPI/tests/test_halo.pf#L145-L167

So you would have something like

@test(npes=[8])

and only run on 8 images.

At one time there was a prototype that let you use "*" to mean "use all the processes available". Maybe it is still there, but from looking at the preprocessor I don't think it is. The better fix is to introduce CAF teams and then create smaller teams for each test, in the same way that we create subcommunicators in the MPI case.
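Putting those four points together, a parallel test file might look like the sketch below. It is a shell helper that writes a small .pf file; the module and procedure names are made up for illustration, and the test body only checks the process count reported by the test object, modelled on the linked test_halo.pf. It has not been tried with a caf build.

```shell
#!/bin/sh
# Write a minimal parallel pFUnit test to the given path.
# The file contents are Fortran; the quoted heredoc keeps them verbatim.
write_parallel_test() {
    cat > "$1" <<'EOF'
module test_images_mod
   use pfunit
   implicit none

contains

   ! Request 8 processes for this test.
   @test(npes=[8])
   subroutine test_process_count(this)
      class (MpiTestMethod), intent(inout) :: this

      @assertEqual(8, this%getNumProcesses())
   end subroutine test_process_count

end module test_images_mod
EOF
}

# Usage (file and executable names are placeholders):
#   write_parallel_test test/test_images_mod.pf
#   mpirun -np 8 ./my_tests    # launch with the process count the test requests
```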

Author

jphaupt commented Nov 22, 2021

Perhaps I do not understand. However, if I simply clone the repository you linked and run the following (after adding FC=caf to the .x file)

cd MPI
export PFUNIT_DIR=/path/to/pfunit/build/with/caf
./build_with_cmake_and_run.x 

then it fails as above. However, if I replace PFUNIT_DIR with a build of pFUnit that didn't use FC=caf, then it passes and all is well. So it looks like your fix breaks this.

Unfortunately I am just starting out with CAF so I can't comment much on how to use teams.

Member

tclune commented Nov 22, 2021

OK - I suspect this problem is because CAF is launching MPI in the background and the command line has already launched MPI. You'll need to ask someone more familiar with CAF (and/or GFortran's implementation) to see if there is a way to get CAF to "inherit" the MPI environment.

What happens with the above experiment if you edit the build_with_cmake_and_run.x to launch the executable "serially"? That might be a workaround - not sure how GFortran tells CAF how many images to launch - that's the next thing to look at.
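For concreteness, here is a sketch of the three launch styles worth comparing. "Serially" here means starting the test executable directly, without the mpirun that CTest used in the log above; cafrun is the launcher that ships with OpenCoarrays. The launch helper and its DRY_RUN switch are illustrative, and whether any of these avoids the double MPI_Init is untested.

```shell
#!/bin/sh
# Start a test executable in one of three ways, or print the command
# instead of running it when DRY_RUN=1.
#   launch <serial|mpirun|cafrun> <executable> [count]
launch() {
    mode=$1
    exe=$2
    n=${3:-1}
    case $mode in
        serial) set -- "$exe" ;;                   # no launcher at all
        mpirun) set -- mpirun -np "$n" "$exe" ;;   # what CTest ran in the log
        cafrun) set -- cafrun -n "$n" "$exe" ;;    # OpenCoarrays launcher
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage (executable path is a placeholder):
#   launch serial ./tests/mpi_tests
#   launch cafrun ./tests/mpi_tests 4
```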

Author

jphaupt commented Nov 22, 2021

I'm sorry, how would I edit it to launch serially? Wouldn't that require editing several tests as well (changing npes)?
