Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use safe accessor to vector elements #1633

Merged
merged 6 commits into from
Jan 2, 2025
Merged

Conversation

Antonov548
Copy link
Contributor

@Antonov548 Antonov548 commented Dec 11, 2024

For #1631 .

Copy link
Contributor

aviator-app bot commented Dec 11, 2024

Current Aviator status

Aviator will automatically update this comment as the status of the PR changes.
Comment /aviator refresh to force Aviator to re-examine your PR (or learn about other /aviator commands).

This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.


See the real-time status of this PR on the Aviator webapp.
Use the Aviator Chrome Extension to see the status of your PR within GitHub.

@Antonov548 Antonov548 changed the title Using safety accessor to vector elements Using safe accessor to vector elements Dec 11, 2024
@szhorvat
Copy link
Member

szhorvat commented Dec 17, 2024

There are two things that bother me in the implementation:

  • We should avoid using int for indexing. The R API has a different type for this (I forget the name) that resolves to either a 32-bit or a 64-bit integer depending on configuration.
  • Naked calls to igraph_errorf() should be avoided. The behaviour depends on context (see how IGRAPH_R_CHECK sets global variables to control this). Normally, error checks are done through return codes using either IGRAPH_CHECK (usual) or IGRAPH_R_CHECK (top level). If we allow such naked calls, then we must keep in mind in what context functions like R_get_int_scalar are or aren't allowed to be called, which is a bit too much mental burden. It is always possible to have a function that takes a pointer to the variable it is going to set, and returns an error code.

I just wanted to note these potential issues. Proceed as you see fit.

src/rinterface_extra.c Outdated Show resolved Hide resolved
src/rinterface.h Outdated Show resolved Hide resolved
Comment on lines 105 to 106
igraph_errorf("Wrong index. Attempt to get element with index %" PRIuPTR " from vector of length %" PRIuPTR ".",
__FILE__, __LINE__, IGRAPH_EINVAL, (uintptr_t) index, (uintptr_t) Rf_xlength(sexp));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps:

Suggested change
igraph_errorf("Wrong index. Attempt to get element with index %" PRIuPTR " from vector of length %" PRIuPTR ".",
__FILE__, __LINE__, IGRAPH_EINVAL, (uintptr_t) index, (uintptr_t) Rf_xlength(sexp));
Rf_error("Wrong index. Attempt to get element with index %" PRIuPTR " from vector of length %" PRIuPTR ".",
file, line, IGRAPH_EINVAL, (uintptr_t) index, (uintptr_t) Rf_xlength(sexp));

with file and line passed from the invocation point (e.g., https://github.com/igraph/rigraph/pull/1633/files#diff-44d6b09dbc4b7f024f1b2df8dcc51a6e3d489408a6f82def8dd49ffb1bcd5dc4R4610) .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't properly unwind the "finally" stack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still better than undefined behavior.

I don't have a good answer here. Let's draft and iterate.

Copy link
Contributor Author

@Antonov548 Antonov548 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't properly unwind the "finally" stack.

Is it not only the way to properly unwind the finally stack is to call igraph_error() directly?

Copy link
Member

@szhorvat szhorvat Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the C core we use this approach:

igraph_error_t R_get_int_scalar(SEXP sexp, R_xlen_t index, igraph_integer_t *res) {
    if (Rf_xlength(sexp) <= index) {
        IGRAPH_ERRORF("Wrong index. Attempt to get element with index %" PRIuPTR " from vector of length %" PRIuPTR ".", IGRAPH_EINVAL, (uintptr_t) index, (uintptr_t) Rf_xlength(sexp));
    }
    *res = (igraph_integer_t)REAL(sexp)[index];
    return IGRAPH_SUCCESS;
}

Then R_get_int_scalar returns an error code as usual, and can be wrapped in IGRAPH_CHECK or IGRAPH_R_CHECK.

Would this work for you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the C core we use this approach:

igraph_error_t R_get_int_scalar(SEXP sexp, R_xlen_t index, igraph_integer_t *res) {
    if (Rf_xlength(sexp) <= index) {
        IGRAPH_ERRORF("Wrong index. Attempt to get element with index %" PRIuPTR " from vector of length %" PRIuPTR ".", IGRAPH_EINVAL, (uintptr_t) index, (uintptr_t) Rf_xlength(sexp));
    }
    *res = (igraph_integer_t)REAL(sexp)[index];
    return IGRAPH_SUCCESS;
}

Then R_get_int_scalar returns an error code as usual, and can be wrapped in IGRAPH_CHECK or IGRAPH_R_CHECK.

Would this work for you?

Yes, this is what I was thinking about. Thanks.
I'll try this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not only the way to properly unwind the finally stack is to call igraph_error() directly?

Yes in the sense that igraph_error() should be used, but normally it is called through IGRAPH_ERROR, which ensures returning from the function with an error code, which can then be processed by IGRAPH_CHECK.

In igraph 0.10, the unwinding process can potentially have multiple stages, each of which need to be triggered appropriately. This means that igraph_error() should not blindly longjmp. That might leave some stages untriggered.

This is why in the R interface we have IGRAPH_R_CHECK which must only be used at the top level. Only at this level is it safe to longjmp from the error handler, i.e. to call Rf_error(). IGRAPH_R_CHECK controls whether Rf_error() is called by igraph_error() by setting a global flag.

Now one might argue that R_get_int_scalar() is (or should be) only called at the top level anyway, so we can forget all about this. (Note that I am not actually sure whether this is true, there might be some non-top-level functions where things like REAL(sexp)[0] are still used!) If you decide to not worry about this, there's a risk that someone will use R_get_int_scalar() in an inappropriate context in the future, which causes a bug that's only triggered in an error condition, and will be very painful to track down. Thus if you go this route, I recommend clearly documenting what check function is allowed to be called in what context, and perhaps even naming these functions in such a way that any misuse is immediately apparent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thus if you go this route, I recommend clearly documenting what check function is allowed to be called in what context, and perhaps even naming these functions in such a way that any misuse is immediately apparent.

Not sure right now how to manage this, but I would think that functions which are have SEXP type as argument should be used only in top level functions. Which is also the case for R_get_int_scalar(). Maybe it make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that R_get_int_scalar() and friends should only be called at the top level and before any finalizers are set up. Also agree to document the purpose and the constraints.

@krlmlr krlmlr changed the title Using safe accessor to vector elements feat: Use safe accessor to vector elements Jan 2, 2025
@krlmlr krlmlr enabled auto-merge (squash) January 2, 2025 09:58
@krlmlr krlmlr merged commit 12f0694 into main Jan 2, 2025
6 checks passed
@krlmlr krlmlr deleted the f-1631-vector-index branch January 2, 2025 10:13
@krlmlr
Copy link
Contributor

krlmlr commented Jan 2, 2025

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants