Add type hints and type loop variables #946

Open: wants to merge 10 commits into base: master

Joao-Dionisio
Collaborator

Based on Dominik's comments, I decided to try to add types again.

Needs a thorough review before merging, as it's risky. All tests are passing, but of course something might have escaped.

The main benefit is not so much the speed from the looping variables, but rather the type checking in the arguments, which should improve error catching.

(For some very annoying reason we have a branch called main which is different from master)

@Joao-Dionisio Joao-Dionisio marked this pull request as draft January 16, 2025 15:00
@Joao-Dionisio Joao-Dionisio marked this pull request as ready for review January 16, 2025 15:12

codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 0% with 215 lines in your changes missing coverage. Please review.

Project coverage is 52.96%. Comparing base (1db095e) to head (ead3365).
Report is 45 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pyscipopt/scip.pxi | 0.00% | 174 Missing ⚠️ |
| src/pyscipopt/conshdlr.pxi | 0.00% | 18 Missing ⚠️ |
| src/pyscipopt/expr.pxi | 0.00% | 8 Missing ⚠️ |
| src/pyscipopt/lp.pxi | 0.00% | 7 Missing ⚠️ |
| src/pyscipopt/branchrule.pxi | 0.00% | 3 Missing ⚠️ |
| src/pyscipopt/cutsel.pxi | 0.00% | 1 Missing ⚠️ |
| src/pyscipopt/heuristic.pxi | 0.00% | 1 Missing ⚠️ |
| src/pyscipopt/nodesel.pxi | 0.00% | 1 Missing ⚠️ |
| src/pyscipopt/reader.pxi | 0.00% | 1 Missing ⚠️ |
| src/pyscipopt/sepa.pxi | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #946      +/-   ##
==========================================
+ Coverage   52.54%   52.96%   +0.41%     
==========================================
  Files          20       21       +1     
  Lines        4345     4473     +128     
==========================================
+ Hits         2283     2369      +86     
- Misses       2062     2104      +42     


@@ -23,16 +23,16 @@ cdef class Branchrule:
'''informs branching rule that the branch and bound process data is being freed'''
pass

-    def branchexeclp(self, allowaddcons):
+    def branchexeclp(self, SCIP_Bool allowaddcons):
Contributor

Why are these methods not defined with cdef? I would say that in the Python world we should use bool and in the C world SCIP_Bool. So where are we here?

Collaborator Author

As I said in the bigger comment, these functions should be accessible from the Python side, so we need to use def.

Contributor

But I guess to avoid confusion we should also use bool instead of SCIP_Bool and convert when entering the C world again.

Collaborator

@Opt-Mucca Opt-Mucca Jan 22, 2025

SCIP_Bool does not make sense here. The user should never have access to such a type and the type hint should simply be bool (like Dominik suggested).
Cython will handle the conversion or use a local declaration of SCIP_Bool inside of the function.

@@ -29,7 +29,7 @@ cdef class Conshdlr:
'''informs constraint handler that the branch and bound process is being started '''
pass

-    def consexitsol(self, constraints, restart):
+    def consexitsol(self, constraints, SCIP_Bool restart):
Contributor

What about constraints?

Collaborator Author

I wasn't sure whether a single constraint was allowed here, and I was too lazy to check.

Contributor

If this is the list of constraints, I would not allow it to be a single constraint. Can we declare it as an iterable?

Collaborator

You can declare something as an iterable in Python type hints. We might then need to import Iterable from the collections library (collections.abc).
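
For reference, a minimal plain-Python sketch of what such a hint could look like. `Constraint` and this `consexitsol` are hypothetical stand-ins, not the PySCIPOpt classes, and the explicit `isinstance` check is an assumption added here because the hint alone is not enforced:

```python
from collections.abc import Iterable


class Constraint:
    """Hypothetical stand-in for a constraint object."""

    def __init__(self, name: str) -> None:
        self.name = name


def consexitsol(constraints: Iterable[Constraint], restart: bool) -> list[str]:
    """Illustrative callback that accepts any iterable of constraints."""
    # The hint documents the intent; this check rejects a single constraint.
    if not isinstance(constraints, Iterable):
        raise TypeError("constraints must be an iterable of Constraint objects")
    return [cons.name for cons in constraints]


names = consexitsol([Constraint("c1"), Constraint("c2")], restart=False)
```

A list, tuple, or generator of constraints passes; a bare `Constraint` raises a `TypeError` with a readable message.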

@@ -23,7 +23,7 @@ cdef class Cutsel:
'''executed before the branch-and-bound process is freed'''
pass

-    def cutselselect(self, cuts, forcedcuts, root, maxnselectedcuts):
+    def cutselselect(self, list[SCIP_ROW] cuts, int forcedcuts, SCIP_Bool root, int maxnselectedcuts):
Contributor

I guess if we have def then the int means the Python int whereas SCIP_Bool is the C unsigned int, or am I wrong?

Collaborator Author

I'm not entirely sure how this works, because in things like addCons I type all the options as SCIP_Bools, but then assign them Python booleans. I guess Cython makes some kind of automatic conversion?

Contributor

@DominikKamp DominikKamp Jan 17, 2025

Maybe I am not aware of the Cython rules, but I find these type declarations confusing because

  1. list[SCIP_ROW] seems to be a Python-container of a C-type
  2. int is the C-int
  3. SCIP_Bool is the C-uint

and since this should be a Python-callback, I would prefer if it is consistently supplied with Python-objects.

Collaborator Author

I'm also not sure about Cython, but I think there must be a definite difference between this callback and a normal Python function, because using this sort of syntax (e.g. int forcedcuts) in plain Python leads to an invalid syntax error.

Type hints in Python would use the following syntax (and not be enforced, making them somewhat useless):

def cutselselect(self, cuts: list[SCIP_ROW], forcedcuts: int, root: bool, maxnselectedcuts: int):

Further, in these callbacks, bool is not recognized as a type identifier.
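
Both points can be checked in plain Python with a small sketch. The `cutselselect` below is a hypothetical free function, not the PySCIPOpt callback, and `list` replaces `list[SCIP_ROW]` because that C type does not exist on the Python side:

```python
def cutselselect(cuts: list, forcedcuts: int, root: bool, maxnselectedcuts: int) -> int:
    """Hypothetical stand-in; the hints are documentation, not runtime checks."""
    return len(cuts)


# Passing strings and None where int and bool are hinted raises nothing.
result = cutselselect(["row_a", "row_b"], "not an int", "not a bool", None)

# The C-style declaration is rejected by the plain Python parser.
cython_style = "def cutselselect(self, int forcedcuts):\n    pass\n"
try:
    compile(cython_style, "<example>", "exec")
    is_valid_python = True
except SyntaxError:
    is_valid_python = False
```

So the annotated form parses but checks nothing, while the declaration form only parses under Cython.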

Collaborator

I guess we're mixing type hints and variable type declarations here.
Regardless: list[SCIP_ROW] would be incorrect because the callback receives list[Row], where one in theory could then access row._scip_row, which has type SCIP_ROW.

@DominikKamp
Contributor

Could you maybe first explain to me the convention for def versus cdef and how it influences the interpretation of int?

In general, it would be good to not mix up types from C and Python in a function signature.

@Joao-Dionisio
Collaborator Author

Joao-Dionisio commented Jan 17, 2025

Hey @DominikKamp, thank you for being so quick! From my understanding, when you def a function, you're treating it as a Python function and arguments are interpreted as Python objects, and the opposite is true for cdef. The problem is that cdef functions aren't accessible from the Python side.

One crazy idea is to have an intermediate cdef function between the Python functions and the SCIP functions. Something like:

def getConss(self, bool transformed=True):
    cdef SCIP_Bool _transformed
    if transformed:
        _transformed = True
    else:
        _transformed = False
    return _getConss(_transformed)

cdef _getConss(SCIP_Bool _transformed):
    ...

We would get a slightly bigger speedup from doing this, buuut this does feel too crazy.

Not sure if relevant, but using cdef to type a variable inside a function defined by def can optimize the code. E.g., cdefing a looping variable tells Cython to treat it like a C loop.

EDIT: I guess I could replace the signature types with pure Python types. I'm just worried about replacing SCIP_Real with float because of what I tried to do back in the other PR. I'm also not sure how I feel about replacing SCIP_PROPTIMING, etc., with int, either.

@DominikKamp
Contributor

I like your suggestion as it is much cleaner. The only thing to avoid is combining cdef and float because this gives single precision. Combining def and float should be fine, since this results in a Python-float with double precision.

@Joao-Dionisio
Collaborator Author

When using float instead of SCIP_Real, I get two failing tests: test_pricer, test_gomory. When I convert isLT, isGE, etc. to use SCIP_Reals, while the other functions keep using float, then test_gomory passes. I don't really know what goes on under the hood, but maybe Cython is making some weird conversions.

I cannot type booleans as bool (I get a type not identified error). I believe there are three possibilities. Use the object type for a pure Python argument, use cint which ends up being the same but slightly faster, or use SCIP_Bool.

The crazy idea with the intermediate function wasn't too well received, as it would complicate the codebase for a very minor gain.

@DominikKamp
Contributor

It is difficult to understand what you have tested exactly. Could you maybe push your local version, even if it fails?

Using object probably does not apply any actual restriction, so it could just as well be omitted. What about keeping the def function arguments unconstrained and only changing cdef functions and local declarations?

@Joao-Dionisio
Collaborator Author

Joao-Dionisio commented Jan 18, 2025

@DominikKamp just pushed the version that fails.

My main desire with adding type hints was to make the codebase more robust and help users understand where their bugs might be coming from. In some cases, we get a Not implemented error or something equally uninformative because the functions are being called with arguments of an incorrect type. So, from this point of view, adding types to def functions would be the main goal.
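
To make the goal concrete, here is a rough plain-Python sketch of turning hints into informative errors. The `enforce_hints` decorator and `is_lt` are hypothetical names, not part of PySCIPOpt, and the check only handles hints that are plain classes:

```python
import functools
import inspect
from numbers import Real
from typing import get_type_hints


def enforce_hints(func):
    """Raise an informative TypeError when an argument does not match its hint.

    Only intended for hints that are plain classes (int, bool, Real, ...).
    """
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{func.__name__}(): '{name}' expects {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_hints
def is_lt(val1: Real, val2: Real) -> bool:
    """Hypothetical stand-in for a tolerance comparison such as isLT."""
    return val1 < val2 - 1e-9
```

With this, `is_lt(3, 4)` still works with Python ints, while `is_lt("3", 4)` fails with a message that names the offending argument.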

@DominikKamp
Contributor

Well, Python just does not support type declarations, see https://stackoverflow.com/a/3933210. This is only a Cython feature, so every type declared that way is a C-type. Hence, using float value always means declaring a single-precision C-float number, whereas saying value: float means to recommend assigning a double-precision Python-float object here (this is where the numerical errors come from).
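
The size of that error can be estimated without Cython. This sketch round-trips a Python float through a 4-byte C float with `struct`; the `EPSILON` of 1e-9 is an assumed comparison tolerance used only for illustration:

```python
import struct


def to_single_precision(value: float) -> float:
    """Round-trip a Python float (a C double) through a 4-byte C float."""
    return struct.unpack("f", struct.pack("f", value))[0]


EPSILON = 1e-9  # assumed comparison tolerance, for illustration only

exact = 0.1
truncated = to_single_precision(exact)
error = abs(exact - truncated)

# The single-precision rounding error is larger than the tolerance, so a
# tolerance-based equality check would treat the two values as different.
differs_beyond_tolerance = error > EPSILON
```

Values that are exactly representable in single precision, such as 0.5, survive the round trip unchanged, which may explain why only some tests fail.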

My desire here actually was only to revert 5e6a747 and use SCIP_Real instead of float there at first. The improvement seems to be blocked only by this bug. The more involved interface change can then still be done.

However, I would prefer to use only type hints in def functions, since otherwise the users will never be able to provide the intended Python-objects (because C-types would be required), which defeats the purpose of this interface. The C-conversions should be done internally.

@Joao-Dionisio
Collaborator Author

Joao-Dionisio commented Jan 19, 2025

My point was that these callbacks are not Python functions, precisely because they accept type declarations. I can call model.isLT(3,4) even if I declare the input to be SCIP_Real (which is defined using ctypedef double SCIP_Real).
After adding C-type declaration to the methods, the tests pass, meaning that the users can pass Python objects and Cython does some sort of conversion internally by itself. (EDIT 2: Yeah, also basically what you already said. Didn't sleep very well, sorry :) )

Regarding your middle point, I can create another PR just for the loop variables and take a breather on this one, but I would still like to see it merged one day. Would this be okay? (EDIT: Just reread, and it was basically what you said hehe I'll work on doing it today, then)

@DominikKamp
Contributor

Okay, then these functions can be defined by cdef, right?

@Joao-Dionisio
Collaborator Author

Joao-Dionisio commented Jan 19, 2025

> Okay, then these functions can be defined by cdef, right?

No, because functions defined with cdef can't be called from Python, unfortunately. But cpdef also exists, I'll take a look. Useful link: Cython Function Declarations

@DominikKamp
Contributor

Alright, I just mean that the documentation might become difficult when functions expect types Python is not aware of. Furthermore, there is no way to detect conversion issues if an implicit conversion is enforced. This is particularly troublesome for implicit Python-int -> C-int conversions because this will just overflow silently.
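
As an illustration of what fixed-width wraparound looks like, here is a sketch that uses `ctypes` as a stand-in for a 32-bit C int. It does not show what Cython itself does at the def boundary, which would need to be checked separately; `to_int32_checked` is a hypothetical helper showing one way to make the conversion explicit:

```python
import ctypes

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

too_large = INT32_MAX + 1

# ctypes does no overflow checking, so the value wraps around silently.
wrapped = ctypes.c_int32(too_large).value
fits = ctypes.c_int32(INT32_MAX).value


def to_int32_checked(value: int) -> int:
    """Convert to a 32-bit C int, raising instead of wrapping around."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit C int")
    return ctypes.c_int32(value).value
```

An explicit range check like this at the Python boundary keeps the error visible to the user instead of passing a wrapped value on to C.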
