
Global interpolation without GSLIB #1542

Open · wants to merge 155 commits into develop

Conversation

@MartinKarp (Collaborator) commented Oct 11, 2024

Global interpolation without gslib. Same functionality as before, plus GPU support, but no need to build with gslib any more. Removes gslib entirely from the codebase. I am now happy enough with it that I think we can start trying to get it into develop. It will continue to be optimized during the spring.

Features

  • Machinery for finding rst coordinates on the CPU/GPU without GSLIB (see the sketch after this list)
  • Options to execute interpolation on the host rather than device
  • Functions to copy data back and forth between host and device for vectors, fields and matrices
  • Unit tests for global interpolation
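
For readers unfamiliar with the term, "finding rst coordinates" means mapping a physical point back to the reference (r, s, t) coordinates of the element that contains it. The sketch below is a self-contained Newton iteration for a plain trilinear hexahedron; it only illustrates the general technique, and the program, its names and the trilinear mapping are assumptions for the demo, not Neko's actual implementation.

program rst_newton_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Reference-corner signs of the hexahedron, corners ordered lexicographically
  real(dp), parameter :: rc(8) = real([-1, 1, -1, 1, -1, 1, -1, 1], dp)
  real(dp), parameter :: sc(8) = real([-1, -1, 1, 1, -1, -1, 1, 1], dp)
  real(dp), parameter :: tc(8) = real([-1, -1, -1, -1, 1, 1, 1, 1], dp)
  real(dp) :: xv(3, 8), xp(3), rst(3)
  integer :: a

  ! Demo element: the unit cube [0,1]^3
  do a = 1, 8
     xv(:, a) = 0.5_dp * (1.0_dp + [rc(a), sc(a), tc(a)])
  end do

  xp = [0.25_dp, 0.75_dp, 0.5_dp]    ! physical point to locate
  rst = find_rst(xv, xp)
  print '(a, 3f10.6)', 'r, s, t = ', rst   ! expect -0.5, 0.5, 0.0

contains

  ! Newton iteration for x(r,s,t) = xp on a trilinear hex
  function find_rst(xv, xp) result(rst)
    real(dp), intent(in) :: xv(3, 8), xp(3)
    real(dp) :: rst(3), res(3), jac(3, 3), dn(3), n
    integer :: it, a

    rst = 0.0_dp                     ! start at the element centre
    do it = 1, 50
       res = -xp
       jac = 0.0_dp
       do a = 1, 8                   ! trilinear shape functions and derivatives
          n     = 0.125_dp * (1 + rc(a)*rst(1)) * (1 + sc(a)*rst(2)) * (1 + tc(a)*rst(3))
          dn(1) = 0.125_dp * rc(a) * (1 + sc(a)*rst(2)) * (1 + tc(a)*rst(3))
          dn(2) = 0.125_dp * (1 + rc(a)*rst(1)) * sc(a) * (1 + tc(a)*rst(3))
          dn(3) = 0.125_dp * (1 + rc(a)*rst(1)) * (1 + sc(a)*rst(2)) * tc(a)
          res = res + n * xv(:, a)
          jac(:, 1) = jac(:, 1) + dn(1) * xv(:, a)
          jac(:, 2) = jac(:, 2) + dn(2) * xv(:, a)
          jac(:, 3) = jac(:, 3) + dn(3) * xv(:, a)
       end do
       if (maxval(abs(res)) < 1.0e-12_dp) exit
       rst = rst - solve3(jac, res)  ! Newton update
    end do
  end function find_rst

  ! Solve a 3x3 linear system by Cramer's rule
  function solve3(a, b) result(x)
    real(dp), intent(in) :: a(3, 3), b(3)
    real(dp) :: x(3), d
    d = a(1,1)*(a(2,2)*a(3,3) - a(2,3)*a(3,2)) &
      - a(1,2)*(a(2,1)*a(3,3) - a(2,3)*a(3,1)) &
      + a(1,3)*(a(2,1)*a(3,2) - a(2,2)*a(3,1))
    x(1) = (b(1)*(a(2,2)*a(3,3) - a(2,3)*a(3,2)) &
          - a(1,2)*(b(2)*a(3,3) - a(2,3)*b(3)) &
          + a(1,3)*(b(2)*a(3,2) - a(2,2)*b(3))) / d
    x(2) = (a(1,1)*(b(2)*a(3,3) - a(2,3)*b(3)) &
          - b(1)*(a(2,1)*a(3,3) - a(2,3)*a(3,1)) &
          + a(1,3)*(a(2,1)*b(3) - b(2)*a(3,1))) / d
    x(3) = (a(1,1)*(a(2,2)*b(3) - b(2)*a(3,2)) &
          - a(1,2)*(a(2,1)*b(3) - b(2)*a(3,1)) &
          + b(1)*(a(2,1)*a(3,2) - a(2,2)*a(3,1))) / d
  end function solve3

end program rst_newton_sketch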

Todo before merge

  • Clean up the code a bit more and double-check the comments.
  • Understand the issue with the Mac double-precision test.

@MartinKarp changed the title from "Global interpolation no gslib v0.1" to "Global interpolation no gslib v0.1 -> v1.0" on Oct 11, 2024
@timofeymukha (Collaborator) commented:

@MartinKarp Related to the bug with object init: we should probably have an issue where we list the "offenders", so we can refactor them to use an init routine.

@MartinKarp marked this pull request as ready for review on October 21, 2024, 14:51
@MartinKarp (Collaborator, Author) commented Oct 21, 2024

I think this is more or less ready on the CPU. This is truly a "develop" PR in the sense that it is far from perfect, but it does remove the gslib dependency. I think we can wait with merging this until I add functionality like fast GPU support.

Comment on lines +260 to +261
!Isnt this a datarace?
!How do things like this ever get into the code...
Collaborator:

No, this is not a race, since there's no parallelism in the loop (do concurrent != omp parallel do).

Collaborator (Author):

Maybe I understood incorrectly then, but to me there is a data dependency on u(sp(i)), no? I thought do concurrent specified that there are no dependencies.

https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-1/do-concurrent.html

Collaborator:

Yes, that would be true, but we're working with the gathered data, so there are no duplicates. do concurrent is needed to force vectorisation.

Collaborator (Author):

But there might be several values to unpack from one rank to the same u(sp(i)), right?

Collaborator:

No, we're only communicating unique, gathered dofs, so from a single neighbour there will only be one place to update. The accumulation is needed to account for shared dofs received from other ranks, but for host-based MPI each neighbour has its own buffer, so there are no data dependencies in each unpack loop. (What can't be done is a multithreaded unpack without critical sections around the updates.)

The shorter array with the received data is later scattered, which has the one-to-many update pattern.
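
For readers following the thread, here is a minimal standalone sketch of the unpack pattern being discussed; the names recv_dof, recv_buf and u are made up for illustration and are not Neko's actual data structures. Because the dofs received from a single neighbour are unique, no two iterations touch the same element of u, so do concurrent's independence requirement holds; the accumulation only combines contributions from different neighbours across successive unpacks.

program unpack_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Local dofs touched by ONE neighbour's message: each index appears once
  integer, parameter :: recv_dof(4) = [2, 5, 7, 9]
  real(dp) :: recv_buf(4), u(10)
  integer :: i

  u = 1.0_dp
  recv_buf = [0.5_dp, 1.5_dp, 2.5_dp, 3.5_dp]

  ! No iteration defines or references an element of u that another
  ! iteration touches, so the do concurrent assertion is satisfied and
  ! the compiler is free to vectorise the loop.
  do concurrent (i = 1:size(recv_dof))
     u(recv_dof(i)) = u(recv_dof(i)) + recv_buf(i)
  end do

  print '(10f6.2)', u
end program unpack_sketch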

Comment on lines -148 to +150
do concurrent (j = 1:this%send_dof(dst)%size())
do j = 1,this%send_dof(dst)%size()
Collaborator:

Why would this need to be changed? (It also prevents optimisation of the loop filling the buffer.)

Comment on lines -154 to +157
associate(send_data => this%send_buf(i)%data)
call MPI_Isend(send_data, size(send_data), &
!associate(send_data => this%send_buf(i)%data)
call MPI_Isend(this%send_buf(i)%data, this%send_dof(dst)%size(), &
Collaborator:

Can we please keep this associate block so that the code still works with NAG Fortran?

@MartinKarp (Collaborator, Author) commented Feb 14, 2025:

Yeah, this was just me trying to find out why Isend was so slow for many messages on Dardel.
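
For context, here is a standalone sketch of the associate pattern under discussion, assuming the mpi_f08 bindings; the type buf_t and all variable names are made up for illustration and are not Neko's actual ones. Per the review comment above, keeping the associate block is what keeps the code working with NAG Fortran; the sketch simply shows the construct, using a self-send with the buffer accessed through the associate name.

program isend_associate_sketch
  use mpi_f08
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Illustrative stand-in for a send-buffer type with an allocatable payload
  type buf_t
     real(dp), allocatable :: data(:)
  end type buf_t

  type(buf_t) :: send_buf(1)
  type(MPI_Request) :: req
  type(MPI_Status) :: stat
  real(dp) :: recv(4)
  integer :: rank

  call MPI_Init()
  call MPI_Comm_rank(MPI_COMM_WORLD, rank)

  allocate(send_buf(1)%data(4))
  send_buf(1)%data = real(rank, dp)

  ! The associate block hands the non-blocking call a plain array name
  ! instead of a nested derived-type component reference.
  associate (send_data => send_buf(1)%data)
    call MPI_Isend(send_data, size(send_data), MPI_DOUBLE_PRECISION, &
                   rank, 0, MPI_COMM_WORLD, req)
  end associate

  ! Self-send for the demo: receive the message and complete the request
  call MPI_Recv(recv, size(recv), MPI_DOUBLE_PRECISION, rank, 0, &
                MPI_COMM_WORLD, stat)
  call MPI_Wait(req, MPI_STATUS_IGNORE)

  call MPI_Finalize()
end program isend_associate_sketch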

Labels: enhancement (New feature or request), GPU, refactor
Project status: 📋 Todo
5 participants