
gh-127787: refactor helpers for PyUnicodeErrorObject internal interface #127789

Open
wants to merge 11 commits into
base: main

Conversation

picnixz
Contributor

@picnixz picnixz commented Dec 10, 2024

  • Unify get_unicode and get_string in a single function.

  • Allow retrieving the underlying object attribute, its size, and its start and end indices in a single call.

  • Use a common implementation for the following functions:

    • PyUnicode{Decode,Encode}Error_GetEncoding
    • PyUnicode{Decode,Encode,Translate}Error_GetObject
    • PyUnicode{Decode,Encode,Translate}Error_{Get,Set}Reason
    • PyUnicode{Decode,Encode,Translate}Error_{Get,Set}{Start,End}
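To illustrate the "common implementation" idea from the list above, here is a minimal standalone sketch. It does not use the CPython API: `toy_unicode_error`, `toy_kind`, and every `toy_*` function are hypothetical stand-ins, and the real helpers in this PR may well be shaped differently. The point is only that each public accessor becomes a kind check followed by a call into one shared implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three exception types; not CPython's layout. */
typedef enum { TOY_DECODE, TOY_ENCODE, TOY_TRANSLATE } toy_kind;

typedef struct {
    toy_kind kind;
    ptrdiff_t start;
    ptrdiff_t end;
} toy_unicode_error;

/* Returns 0 if `self` is of the expected kind, -1 otherwise. */
static int
toy_check_kind(const toy_unicode_error *self, toy_kind expected)
{
    return (self != NULL && self->kind == expected) ? 0 : -1;
}

/* Shared implementations: they assume the caller already checked the kind. */
static int
toy_get_start_impl(const toy_unicode_error *self, ptrdiff_t *start)
{
    assert(self != NULL && start != NULL);
    *start = self->start;
    return 0;
}

static int
toy_set_start_impl(toy_unicode_error *self, ptrdiff_t start)
{
    assert(self != NULL);
    self->start = start;
    return 0;
}

/* Public accessors: one kind check each, then delegate to the shared code. */
int
toy_encode_error_get_start(const toy_unicode_error *self, ptrdiff_t *start)
{
    return toy_check_kind(self, TOY_ENCODE) < 0
        ? -1 : toy_get_start_impl(self, start);
}

int
toy_decode_error_get_start(const toy_unicode_error *self, ptrdiff_t *start)
{
    return toy_check_kind(self, TOY_DECODE) < 0
        ? -1 : toy_get_start_impl(self, start);
}

int
toy_translate_error_set_start(toy_unicode_error *self, ptrdiff_t start)
{
    return toy_check_kind(self, TOY_TRANSLATE) < 0
        ? -1 : toy_set_start_impl(self, start);
}
```

With this shape, a behavior change to "get start" or "set start" is made once in the `*_impl` function rather than once per exception type.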

Note that there are some cosmetic changes here and there (in the naming of parameters), but these are essentially in anticipation of #127694, to reduce the conflicts I'll need to resolve (there will probably still be conflicts, but ideally I want them to be minimal).

I've moved all helpers before the public API. I could have placed them in between, but I felt it was cleaner this way (it also made it a bit easier to put double blank lines between functions).

@picnixz picnixz marked this pull request as ready for review December 10, 2024 12:27
Contributor Author

picnixz commented Dec 10, 2024

@encukou I've designed a _PyUnicodeError_GetParams helper which retrieves object, size, start and end in one call, and can also check whether start and end are consistent. This could help in the codecs handlers (but I still need to check whether I need < or <=).

NVM: I'm just removing that parameter. It's easier to do the start < end check outside the helper.
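As a rough illustration of that split, the sketch below models a "get everything in one call" helper and leaves the `start < end` check to the caller. It is standalone C, not the actual `_PyUnicodeError_GetParams` from the PR: the `toy_error` struct, the function names, the NULL-able out-parameters, and the use of a C string for `object` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a UnicodeError instance. */
typedef struct {
    const char *object;   /* the text being encoded/decoded, or NULL */
    ptrdiff_t start;
    ptrdiff_t end;
} toy_error;

/*
 * Fetch object, size, start and end in one round.
 * Any out-pointer may be NULL if the caller does not need that value.
 * Returns 0 on success, -1 if the `object` attribute is missing.
 * No consistency check between start and end is done here.
 */
int
toy_error_get_params(const toy_error *self,
                     const char **object, ptrdiff_t *size,
                     ptrdiff_t *start, ptrdiff_t *end)
{
    assert(self != NULL);
    if (self->object == NULL) {
        return -1;
    }
    if (object != NULL) {
        *object = self->object;
    }
    if (size != NULL) {
        *size = (ptrdiff_t)strlen(self->object);
    }
    if (start != NULL) {
        *start = self->start;
    }
    if (end != NULL) {
        *end = self->end;
    }
    return 0;
}

/*
 * Example caller (a stand-in for a codec error handler): it decides
 * itself what an inconsistent range means. Here an empty or reversed
 * range simply has length 0; -1 signals a missing object.
 */
ptrdiff_t
toy_faulty_range_length(const toy_error *self)
{
    ptrdiff_t start, end;
    if (toy_error_get_params(self, NULL, NULL, &start, &end) < 0) {
        return -1;
    }
    return start < end ? end - start : 0;
}
```

Keeping the comparison in the caller means each handler can pick `<` or `<=` as it needs, without the helper growing an extra flag.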

@picnixz picnixz marked this pull request as draft December 13, 2024 16:41
Contributor Author

picnixz commented Dec 13, 2024

@encukou A small implementation question. Do you think it's preferable to have

PyObject *
PyUnicodeEncodeError_GetEncoding(PyObject *self)
{
    int rc = check_unicode_error_type(self, "UnicodeEncodeError");
    return rc < 0 ? NULL : unicode_error_get_encoding_impl(self);
}

with unicode_error_get_encoding_impl working on generic UnicodeError objects (just assertion casts) or do you prefer unicode_error_get_encoding_impl to actually be the one performing the following check with an additional expect_type parameter:

int rc = check_unicode_error_type(self, expect_type);

Unless I use macros to generate the functions, I'll end up duplicating either the expect_type strings or the int rc = ... line. Personally, today I feel it reads better as it is now, but tomorrow I may prefer a "short" implementation of the public API itself.
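For reference, the macro-generating option could look roughly like the sketch below. It is standalone C with hypothetical names (`toy_exc`, `toy_check_type`, `toy_get_encoding_impl`, `TOY_DEFINE_GET_ENCODING`) and compares type names as plain strings, which is only an assumption made to keep the example self-contained; it is not how the PR or CPython performs the type check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an exception instance. */
typedef struct {
    const char *type_name;
    const char *encoding;
} toy_exc;

/* Returns 0 if `self` has the expected type name, -1 otherwise. */
static int
toy_check_type(const toy_exc *self, const char *expect_type)
{
    if (self == NULL || self->type_name == NULL) {
        return -1;
    }
    return strcmp(self->type_name, expect_type) == 0 ? 0 : -1;
}

/* Generic implementation: assumes the type check already happened. */
static const char *
toy_get_encoding_impl(const toy_exc *self)
{
    assert(self != NULL);
    return self->encoding;
}

/*
 * Generates one public getter. The `int rc = ...` line is written once,
 * here, and each expected type name appears once, at the expansion site.
 */
#define TOY_DEFINE_GET_ENCODING(FUNC, EXPECT_TYPE)              \
    const char *                                                \
    FUNC(const toy_exc *self)                                   \
    {                                                           \
        int rc = toy_check_type(self, EXPECT_TYPE);             \
        return rc < 0 ? NULL : toy_get_encoding_impl(self);     \
    }

TOY_DEFINE_GET_ENCODING(toy_encode_error_get_encoding, "UnicodeEncodeError")
TOY_DEFINE_GET_ENCODING(toy_decode_error_get_encoding, "UnicodeDecodeError")
```

The trade-off is the usual one for generated functions: less duplication, but the public functions no longer appear as plain definitions that are easy to grep for or step through in a debugger.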

@picnixz picnixz marked this pull request as ready for review December 13, 2024 16:48