Three-way comparison function for strings #364

Open
ChetanKarwa opened this issue Mar 27, 2021 · 9 comments
Labels: idea (Proposition of an idea and opening an issue to discuss it), topic: strings (String processing)

Comments

@ChetanKarwa

I went through the stdlib_string_type module and I want to recommend some changes. Correct me if I am wrong at any point.

  • The functions lgt, llt, lge and lle work very similarly to the operators >, <, >= and <=, respectively, so in my opinion there is not much need for all of them.
  • Instead of all these, we could have a general function compare(str1, str2) that returns an integer value:
    • 1 if str1 is lexicographically greater than str2,
    • 0 if str1 is lexicographically equal to str2,
    • -1 if str1 is lexicographically smaller than str2.
  • We could also include a function like compare_ignore_case(str1, str2), which behaves like compare but is not case sensitive. (The names of these functions can be improved.)
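
A minimal sketch of the two proposed functions, assuming ASCII ordering via the intrinsics lgt and llt. The names compare and compare_ignore_case come from the proposal above; the module name compare_sketch and the helper to_lower are hypothetical, and none of this is existing stdlib code.

```fortran
! Sketch only: hypothetical module, not part of stdlib.
module compare_sketch
  implicit none
  private
  public :: compare, compare_ignore_case

contains

  !> Three-way comparison in ASCII order: returns 1, 0 or -1.
  pure function compare(str1, str2) result(order)
    character(len=*), intent(in) :: str1, str2
    integer :: order

    if (lgt(str1, str2)) then
      order = 1
    else if (llt(str1, str2)) then
      order = -1
    else
      order = 0
    end if
  end function compare

  !> Same as compare, after folding ASCII upper-case letters to lower case.
  pure function compare_ignore_case(str1, str2) result(order)
    character(len=*), intent(in) :: str1, str2
    integer :: order

    order = compare(to_lower(str1), to_lower(str2))
  end function compare_ignore_case

  !> Hypothetical helper: maps 'A'..'Z' to 'a'..'z', leaves the rest unchanged.
  pure function to_lower(str) result(lower)
    character(len=*), intent(in) :: str
    character(len=len(str)) :: lower
    integer :: i, code

    do i = 1, len(str)
      code = iachar(str(i:i))
      if (code >= iachar('A') .and. code <= iachar('Z')) then
        lower(i:i) = achar(code + 32)
      else
        lower(i:i) = str(i:i)
      end if
    end do
  end function to_lower

end module compare_sketch
```

Because lgt and llt pad the shorter argument with blanks, this sketch treats "abc" and "abc " as equal; whether that is the desired behaviour is left open here.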
@ChetanKarwa changed the title from "stdlib_string_type module's functions can be modified" to "stdlib_string_type module's functions should be modified" on Mar 27, 2021
@ivan-pi
Member

ivan-pi commented Mar 27, 2021

  • The functions lgt, llt, lge and lle work very similarly to the operators >, <, >= and <=, respectively, so in my opinion there is not much need for all of them.

Quoting the gfortran Fortran language reference:

In general, the lexical comparison intrinsics LGE, LGT, LLE, and LLT differ from the corresponding intrinsic operators .GE., .GT., .LE., and .LT., in that the latter use the processor’s character ordering (which is not ASCII on some targets), whereas the former always use the ASCII ordering.

On most modern processors the native character ordering is ASCII, so the two are de facto the same, but this is not mandated by the standard. Hence the need to maintain two sets of character comparison functions for consistency with the standard.
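
To make the distinction concrete, here is a small sketch that checks whether the operator < and the intrinsic llt agree for every pair of the 128 ASCII characters on the current processor. The module and function names are hypothetical.

```fortran
! Sketch only: hypothetical helper to probe the processor's character ordering.
module ordering_check
  implicit none
  private
  public :: default_order_is_ascii

contains

  !> True if `<` (processor ordering) and llt (ASCII ordering) agree
  !> for every pair of single characters with ASCII codes 0..127.
  pure function default_order_is_ascii() result(same)
    logical :: same
    integer :: i, j
    character(len=1) :: a, b

    same = .true.
    do i = 0, 127
      a = achar(i)
      do j = 0, 127
        b = achar(j)
        if ((a < b) .neqv. llt(a, b)) same = .false.
      end do
    end do
  end function default_order_is_ascii

end module ordering_check
```

On a processor whose default character set is ASCII this is expected to return .true.; the standard only guarantees the result of llt, not that of the operator.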

@ivan-pi
Member

ivan-pi commented Mar 27, 2021

The new functions you propose sound like they could be useful in some circumstances, even if they duplicate existing functionality. For example, they can be used in a select case construct:

select case (compare(str1,str2))
case (1) ! greater than
...
case(-1) ! less than
...
case default ! equal
...
end select

This can be weighed against the following if-else chain:

if (lgt(str1,str2)) then
...
else if (llt(str1,str2)) then
...
else
...
end if

@awvwgk
Member

awvwgk commented Mar 27, 2021

I wonder if Fortran would allow us to use operator(<=>)? But operator(compare) would work for me as well.

An implementation would be most suitable in stdlib_ascii with a thin wrapper in stdlib_string_type to allow normal character variables as input as well.
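
One way the "thin wrapper" idea could look is a generic interface that dispatches on the argument types and forwards everything to a single character-based routine. In this sketch, string_t is a stand-in for stdlib's string_type reduced to one component, and all procedure names are assumptions made for illustration.

```fortran
! Sketch only: string_t is a stand-in, not stdlib's actual string_type.
module wrapper_sketch
  implicit none
  private
  public :: string_t, compare

  type :: string_t
    character(len=:), allocatable :: raw
  end type string_t

  !> Generic name: accepts any mix of character and string_t arguments.
  interface compare
    module procedure compare_char_char
    module procedure compare_string_string
    module procedure compare_string_char
    module procedure compare_char_string
  end interface compare

contains

  !> Core routine on plain characters (ASCII order): 1, 0 or -1.
  pure function compare_char_char(lhs, rhs) result(order)
    character(len=*), intent(in) :: lhs, rhs
    integer :: order
    order = merge(1, merge(-1, 0, llt(lhs, rhs)), lgt(lhs, rhs))
  end function compare_char_char

  pure function compare_string_string(lhs, rhs) result(order)
    type(string_t), intent(in) :: lhs, rhs
    integer :: order
    order = compare_char_char(chars(lhs), chars(rhs))
  end function compare_string_string

  pure function compare_string_char(lhs, rhs) result(order)
    type(string_t), intent(in) :: lhs
    character(len=*), intent(in) :: rhs
    integer :: order
    order = compare_char_char(chars(lhs), rhs)
  end function compare_string_char

  pure function compare_char_string(lhs, rhs) result(order)
    character(len=*), intent(in) :: lhs
    type(string_t), intent(in) :: rhs
    integer :: order
    order = compare_char_char(lhs, chars(rhs))
  end function compare_char_string

  !> Unwrap the character data; an unallocated string counts as empty.
  pure function chars(s) result(raw)
    type(string_t), intent(in) :: s
    character(len=:), allocatable :: raw
    if (allocated(s%raw)) then
      raw = s%raw
    else
      raw = ""
    end if
  end function chars

end module wrapper_sketch
```

With this layout only compare_char_char contains comparison logic, so a change in ordering rules would need to be made in one place.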

@ivan-pi
Member

ivan-pi commented Mar 27, 2021

Do you have some reason to prefer the operator syntax? Personally, I find the functional version easier to grasp, and it also takes fewer characters to import. But I'm fine with having both.

use stdlib_string_type, only: compare, operator(.compare.)

if (compare(str1,str2) > 0) then
...
end if

if ((str1 .compare. str2) > 0) then
...
end if

I assume these functions would follow the ASCII ordering and not the default processor one?
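
A sketch of how both spellings could be exported from one module, assuming ASCII ordering. Fortran defined operators have to be written as letters enclosed in dots, which is why .compare. is used here rather than a symbol such as <=>. The module name is hypothetical.

```fortran
! Sketch only: hypothetical module exposing a function and an operator form.
module compare_operator_sketch
  implicit none
  private
  public :: compare, operator(.compare.)

  !> Operator spelling that forwards to the same function.
  interface operator(.compare.)
    module procedure compare
  end interface

contains

  !> Three-way comparison in ASCII order: returns 1, 0 or -1.
  pure function compare(lhs, rhs) result(order)
    character(len=*), intent(in) :: lhs, rhs
    integer :: order
    ! Outer merge handles "greater", inner merge picks "less" or "equal".
    order = merge(1, merge(-1, 0, llt(lhs, rhs)), lgt(lhs, rhs))
  end function compare

end module compare_operator_sketch
```

Defined binary operators have the lowest precedence in Fortran, so the parentheses in (str1 .compare. str2) > 0 are required; without them the expression would be read as str1 .compare. (str2 > 0).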

@ChetanKarwa
Author

I would vote for the functional version, since it would be easier to get used to.
ASCII order is widely used and accepted, so it is the best option for implementing the compare function.

@ChetanKarwa changed the title from "stdlib_string_type module's functions should be modified" to "add new functions to stdlib_string_type.f90" on Mar 27, 2021
@awvwgk changed the title from "add new functions to stdlib_string_type.f90" to "Three-way comparison function for strings" on Mar 27, 2021
@awvwgk added the labels "idea" and "topic: utilities" and removed the label "idea" on Mar 27, 2021
@ivan-pi
Member

ivan-pi commented Mar 27, 2021

I wonder if Fortran would allow us to use operator(<=>)? But operator(compare) would work for me as well.

Perl uses cmp for strings. I would suggest the operator version be .cmp..

I would also note that the qsort routine in C expects a three-way comparison function. Maybe this is something we can think about in connection with issue #98 on sorting.

On the other hand, Python 3.x decided to remove cmp (the reasons are discussed in "Why is the cmp parameter removed from sort/sorted in Python3.0?").
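
To illustrate the qsort connection, here is a sketch of a sort routine that takes a three-way comparison function as an argument, in the style of C's qsort callback. It uses a simple insertion sort rather than quicksort to stay short; the module, the abstract interface compare_fn and the routine insertion_sort are all hypothetical names.

```fortran
! Sketch only: hypothetical sort driven by a three-way comparison callback.
module sort_sketch
  implicit none
  private
  public :: compare, compare_fn, insertion_sort

  abstract interface
    !> Shape of a three-way comparison: negative, zero or positive result.
    pure function compare_fn(lhs, rhs) result(order)
      character(len=*), intent(in) :: lhs, rhs
      integer :: order
    end function compare_fn
  end interface

contains

  !> Three-way comparison in ASCII order: returns 1, 0 or -1.
  pure function compare(lhs, rhs) result(order)
    character(len=*), intent(in) :: lhs, rhs
    integer :: order
    order = merge(1, merge(-1, 0, llt(lhs, rhs)), lgt(lhs, rhs))
  end function compare

  !> In-place insertion sort; the ordering is defined entirely by cmp.
  subroutine insertion_sort(items, cmp)
    character(len=*), intent(inout) :: items(:)
    procedure(compare_fn) :: cmp
    character(len=len(items)) :: key
    integer :: i, j

    do i = 2, size(items)
      key = items(i)
      j = i - 1
      do while (j >= 1)
        ! Separate test avoids evaluating items(0): no short-circuiting.
        if (cmp(items(j), key) <= 0) exit
        items(j + 1) = items(j)
        j = j - 1
      end do
      items(j + 1) = key
    end do
  end subroutine insertion_sort

end module sort_sketch
```

Passing a different function with the same interface, for example a case-insensitive one, would change the sort order without touching the sort routine.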

@awvwgk added the label "topic: strings" and removed the label "topic: utilities" on Sep 18, 2021