Square root filtration values for alpha and delaunay-cech complex #1138

VincentRouvreau · 2024-10-01T10:35:53Z

… complex

…nay cech. Python hints for delaunay complex family

mglisse · 2024-10-19T16:46:33Z

src/Alpha_complex/include/gudhi/Alpha_complex.h

+    // For users to be able to define their own sqrt function on their desired Filtration_value type
+    using std::sqrt;


You could also put it next to the single place where it is used, but it is ok here.

mglisse · 2024-10-19T17:21:29Z

src/Alpha_complex/utilities/alpha_complex_persistence.cpp

    std::clog << "Alpha complex can be, or not, weighted (requires a file containing weights values).\n\n";
+    std::clog << "Weighted Alpha complex can have negative filtration values. If '-square-root-filtrations' is";


Should we mention something like that in the documentation of Alpha_complex as well?

mglisse · 2024-10-19T17:47:40Z

src/Cech_complex/include/gudhi/MEB_filtration.h

+      if constexpr (square_root_filtrations)
+        filt = sqrt(filt);
+      complex.assign_filtration(sh, max(filt, Filtration_value(0)));


I think it would be safer to do the max with 0 before the sqrt.

Test filtration values are positive, before sqrt on 4d14cbc

mglisse · 2024-10-19T17:58:22Z

src/Cech_complex/include/gudhi/MEB_filtration.h

+          filt = sqrt(filt);
+        // maxf = filt except for rounding errors
+        maxf = max(maxf, filt);


I think there is a slight risk that a rounding error can make filt negative, with unpredictable results. It feels silly to use sqrt(max(filt,0)), but I don't know if there is an easy way to avoid it. Well, maybe replacing max(maxf,filt) by a hand-written comparison so we know what happens when filt is NaN, but it feels like a gratuitous use of NaN.

Test filtration values are positive, before sqrt on 4d14cbc

mglisse · 2024-10-19T18:00:49Z

src/python/gudhi/delaunay_complex.pyx

-        :param precision: Delaunay complex precision can be 'fast', 'safe' or 'exact'. Default is 'safe'.
-        :type precision: string
+        Args:
+            points (Iterable[Iterable[float]]): A list of points in d-Dimension.


Isn't the type redundant?

It seems you have to declare it like that with mypy, for example:

from collections.abc import Iterable def dummy_sum(a: Iterable[Iterable[int]]): sum = 0 for line in a: for value in line: sum += value return sum arr = [[1,1],[1,5]] print(f'sum = {dummy_sum(arr)}')

while, if I declare dummy_sum as def dummy_sum(a: Iterable[int]):, mypy is unhappy:

test_mypy.py:6: error: "int" has no attribute "__iter__"; maybe "__int__"? (not iterable) [attr-defined] test_mypy.py:11: error: Argument 1 to "dummy_sum" has incompatible type "list[list[int]]"; expected "Iterable[int]" [arg-type] Found 2 errors in 1 file (checked 1 source file)

mglisse · 2024-10-19T18:12:43Z

src/Alpha_complex/include/gudhi/Alpha_complex.h

@@ -354,8 +355,10 @@ class Alpha_complex {
   * It also computes the filtration values accordingly to the \ref createcomplexalgorithm if default_filtration_value
   * is not set.
   *
+   * \tparam square_root_filtrations If false (default value), it assigns to each simplex a filtration value equal to


(ignore the exact names I use here)
I wonder what is more natural between do_sqrt=true/false and output_squared_values=true/false. As the person writing the code, it is natural to ask if you should call sqrt. As the user... I am not sure, it might be the other one.

Renamed as proposed on 5d9a564

DavidLapous

Small comments on the python interface, I did not have the chance to test this.
Quick questions that I forgot:

are weights with sqrt=True well-defined (e.g., via $x\mapsto \mathrm{sgn}(x)\sqrt{|x|}$) ?
do you think it would be reasonable to put the sqrt to True ? (I'm in favor of it, to make it consistent with other complexes).

DavidLapous · 2024-10-25T11:33:43Z

src/python/gudhi/delaunay_complex.pyx

+    def create_simplex_tree(self, max_alpha_square: float = float('inf'),
+                            filtration: Optional[Literal['alpha', 'cech']] = None,
+                            square_root_filtrations: bool = False):


For double / bools (or even vectors / anything that needs a copy to be translated in cpp), I usually find this cleaner to put the cpp type directly so that you don't need to do a copy below (even though I think the compiler will optimize both cases the same); furthermore the errors are less cryptic in python as they don't refer to the inside of a pyx function (or even a "no signature found") but only to the signature.

Suggested change

def create_simplex_tree(self, max_alpha_square: float = float('inf'),

filtration: Optional[Literal['alpha', 'cech']] = None,

square_root_filtrations: bool = False):

def create_simplex_tree(self, double max_alpha_square = float('inf'),

filtration: Optional[Literal['alpha', 'cech']] = None,

bool square_root_filtrations = False) -> SimplexTree:

I didn't know (or remember) that this was possible, but it looks good, there are probably more places where we could simplify the code this way. We need to check what the generated documentation looks like (I think we want to see the Python types there).

Sphinx was not ok with your proposal (the type was missing), so I go for the very redundant following syntax:

def create_simplex_tree(self, double max_alpha_square:float = float('inf'), filtration: Optional[Literal['alpha', 'cech']] = None, bool square_root_filtrations:bool = False) -> SimplexTree:

on dc78dc1

I don't remember : does gudhi uses embedsignature ? This may solve this redundant issue. I'm using that here for instance.

DavidLapous · 2024-10-25T11:34:15Z

src/python/gudhi/delaunay_complex.pyx

        return stree

    @staticmethod
-    def set_float_relative_precision(precision):
+    def set_float_relative_precision(precision: float):


Suggested change

def set_float_relative_precision(precision: float):

def set_float_relative_precision(double precision):

Also, instead of returning nothing, I usually like to return self to be able to do some functional programming like pattern, e.g.,

stuff = ( thing.do_this(...) .do_that(...) .transform(...) )

but that's completely subjective. This doesn't cost anything performance-wise (every python object is a fat pointer IIRC).

DavidLapous · 2024-10-25T11:49:27Z

src/python/gudhi/delaunay_complex.pyx

-    def get_point(self, vertex):
+    def get_point(self, vertex: int) -> list[float]:
        """This function returns the point corresponding to a given vertex from the :class:`~gudhi.SimplexTree` (the
        same as the k-th input point, where `k=vertex`)

-        :param vertex: The vertex.
-        :type vertex: int
-        :rtype: list of float
-        :returns: the point.
+        Args:
+            vertex: The vertex.
+        Returns:
+            the point.


Is it costly to make so that AlphaComplexes do not reorder points (in the simplextree at least) ? so that this function becomes non-necessary ?
There is also the question of removed points, but

is it generically not the case ?

we can also "recreate" the deleted point in the simplextree (with a flag enabled by default)

I think that this behavior is a bit dangerous by default.

For the return type I think it is better to return a numpy array or "à défaut" a tuple.
As loops are cursed in python, another function that could be useful in that case, is a get_points() method, doing the equivalent of this (which is optimized but still slow)

ac = DelaunayComplex(...) st = ac.create_simplex_tree(...) points = np.fromiter((ac.get_point(i) for i in st.get_vertices()), dtype = np.dtype((np.float32, ndim)), count=st.num_vertices())

In general, in python I feel that the classes RipsComplex, AlphaComplex, etc. are not that useful in the interface, and could be replaced by helper functions, directly returning the simplextree;
but that's another topic.

Is it costly to make so that AlphaComplexes do not reorder points (in the simplextree at least) ? so that this function becomes non-necessary ?

That's already the case, the doc you are quoting says "same as the k-th input point". This function is a left-over from an old release where points could be shuffled (only if they were in dimension 3 IIRC). We could hide this function in the doc if it causes too much confusion.

There is also the question of removed points, but
1. is it generically not the case ?

With weights, it is perfectly generic for points to disappear.

2. we can also "recreate" the deleted point in the simplextree (with a flag enabled by default)

I don't understand what you mean by that. If we just add an extra vertex connected to nothing, that will add a connected component, that's much more problematic than a "missing" point.

For the return type I think it is better to return a numpy array or "à défaut" a tuple.

For such a legacy function that mostly remains to avoid breaking old user code, I am not convinced it is worth changing it and risking to break said user code.

In general, in python I feel that the classes RipsComplex, AlphaComplex, etc. are not that useful in the interface, and could be replaced by helper functions, directly returning the simplextree; but that's another topic.

I think @VincentRouvreau must be tired of me saying exactly that for years now 😉 The class is not just unnecessary for users, it also complicates the implementation because half of the work is done in the constructor, forcing us to store intermediate stuff for later use in create_simplex_tree.

On f3ca164 I replaced the classes with free functions and make AlphaComplex class deprecated. I also changed the documentation and the tests on purpose.

IIRC, the argument for classes, was also because we were thinking of the possibility to make incremental alpha complexes. This will be not possible with free functions, but OK for the moment.

src/python/gudhi/delaunay_complex.pyx

mglisse · 2024-10-26T11:28:59Z

are weights with sqrt=True well-defined (e.g., via $x\mapsto \mathrm{sgn}(x)\sqrt{|x|}$) ?

If the weights make some (squared) values negative, I don't think theory provides anything meaningful for the sqrt, no. There are a number of applications where the weights can only increase filtration values, so they remain non-negative. Adding the same constant to all weights (without sqrt, so possibly negative) preserves a lot of things (the triangulation, the order between filtration values), that's one way to avoid negative values, but it obviously does not preserve filtration values, so it would be confusing.
I didn't check if the code replaces only the negative ones with NaN, or if this propagates to random nonsense.

do you think it would be reasonable to put the sqrt to True ? (I'm in favor of it, to make it consistent with other complexes).

I am rather strongly against changing the behavior of old code, that's really not a nice thing to do to users. However

better documentation is always good. The existence of this new option counts as documentation and may bring this issue to users' attention.
for new interfaces (CechPersistence, maybe a function providing the same functionality as AlphaComplex without a class, etc), using sqrt by default sounds good (again with a strong warning in the doc to point out the difference with the old interface), although we need to decide what to do exactly for the weighted case.

VincentRouvreau · 2024-11-21T15:57:35Z

Some benchmark to compare std::sqrt inside functions (as done on this PR) and when std::sqrt is performed after construction with for_each_simplex. (cf. src/Alpha_complex/benchmark/Alpha_complex_1d_benchmark.cpp)

master - without sqrt

$ for seed in 40 41 42 43 44; do ./Alpha_complex_1d_benchmark 5000000 $seed; done
... create simplex tree from alpha complex: 7.906s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.398s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.837s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.369s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.906s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.37s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.876s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.38s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.887s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.373s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.

sqrt inside functions - cf. branch alpha_sqrt

$ for seed in 40 41 42 43 44; do ./Alpha_complex_1d_benchmark 5000000 $seed; done
... create simplex tree from alpha complex: 7.975s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.43s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 8.021s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.431s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.866s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.401s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.927s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.395s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 7.87s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.394s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.

for_each_simplex(sqrt) - cf. branch alpha_for_each_sqrt

$ for seed in 40 41 42 43 44; do ./Alpha_complex_1d_benchmark 5000000 $seed; done
... create simplex tree from alpha complex: 8.494s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.499s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 8.099s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.496s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 8.345s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.529s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 8.157s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.468s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... create simplex tree from alpha complex: 8.15s
Alpha complex is of dimension 1 - 9999999 simplices - 5000000 vertices.
... simplex tree assign MEB: 2.391s
Delaunay-Cech complex is of dimension 1 - 9999999 simplices - 5000000 vertices.

Co-authored-by: David Loiseaux <[email protected]>

c++ part of square root filtration values for alpha and delaunay-cech…

986ad6f

… complex

VincentRouvreau added the enhancement New feature or request label Oct 1, 2024

VincentRouvreau added 6 commits October 1, 2024 14:31

square_root_filtrations was missing in documentation for Alpha complex

6be76bc

Add new feature in changes

7b14133

Python version of square root filtrations for alpha complex and delau…

2cdc152

…nay cech. Python hints for delaunay complex family

Merge branch 'master' into alpha_complex_square_root_filtrations_option

9d1f74a

python hints not well managed on cython class

c578b93

Add some python test for square_root_filtrations versions

46f5616

VincentRouvreau marked this pull request as ready for review October 14, 2024 16:05

mglisse reviewed Oct 19, 2024

View reviewed changes

DavidLapous reviewed Oct 25, 2024

View reviewed changes

VincentRouvreau and others added 7 commits November 21, 2024 17:00

Merge branch 'master' into alpha_complex_square_root_filtrations_option

5f7c202

code review: Test filtration values are positive, before sqrt

4d14cbc

code review: typing.Iterable s deprecated since python 3.9

f80347f

Co-authored-by: David Loiseaux <[email protected]>

mypy + cython + sphinx compromise

dc78dc1

code review: rename square_root_filtrations as output_squared_values

5d9a564

code review: go with free function instead of classes

f3ca164

Fix some tests and improve test_output_squared_values

e74f9e5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Square root filtration values for alpha and delaunay-cech complex #1138

Square root filtration values for alpha and delaunay-cech complex #1138

VincentRouvreau commented Oct 1, 2024 •

edited

Loading

mglisse Oct 19, 2024

mglisse Oct 19, 2024

mglisse Oct 19, 2024

VincentRouvreau Nov 21, 2024

mglisse Oct 19, 2024

VincentRouvreau Nov 21, 2024

mglisse Oct 19, 2024

VincentRouvreau Nov 22, 2024

mglisse Oct 19, 2024

VincentRouvreau Nov 22, 2024

DavidLapous left a comment •

edited

Loading

DavidLapous Oct 25, 2024

mglisse Oct 26, 2024

VincentRouvreau Nov 22, 2024

DavidLapous Nov 22, 2024 •

edited

Loading

DavidLapous Oct 25, 2024 •

edited

Loading

DavidLapous Oct 25, 2024

mglisse Oct 26, 2024

VincentRouvreau Nov 26, 2024

mglisse commented Oct 26, 2024

VincentRouvreau commented Nov 21, 2024

		// For users to be able to define their own sqrt function on their desired Filtration_value type
		using std::sqrt;

		std::clog << "Alpha complex can be, or not, weighted (requires a file containing weights values).\n\n";
		std::clog << "Weighted Alpha complex can have negative filtration values. If '-square-root-filtrations' is";

	def set_float_relative_precision(precision: float):
	def set_float_relative_precision(double precision):

Square root filtration values for alpha and delaunay-cech complex #1138

Are you sure you want to change the base?

Square root filtration values for alpha and delaunay-cech complex #1138

Conversation

VincentRouvreau commented Oct 1, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

DavidLapous left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

DavidLapous Nov 22, 2024 • edited Loading

Choose a reason for hiding this comment

DavidLapous Oct 25, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mglisse commented Oct 26, 2024

VincentRouvreau commented Nov 21, 2024

master - without sqrt

sqrt inside functions - cf. branch alpha_sqrt

for_each_simplex(sqrt) - cf. branch alpha_for_each_sqrt

VincentRouvreau commented Oct 1, 2024 •

edited

Loading

DavidLapous left a comment •

edited

Loading

DavidLapous Nov 22, 2024 •

edited

Loading

DavidLapous Oct 25, 2024 •

edited

Loading