Support "T" and "V" dtypes in from_dtype
#4226
It seems possible that
https://github.com/HypothesisWorks/hypothesis/pull/472/files#diff-8bf1182f853824fabcaef20e0c3fb9db9e2c512422633969813c00afcf0c3b69R36 suggests that I just didn't see a sensible semantics for it, and https://github.com/HypothesisWorks/hypothesis/pull/826/files#diff-f750b73af7d56838bbd646dd35542feafdec3b6f08243d1edd677d22965d8230R290-R292 suggests that support for record/compound/nested dtypes was a concern, plausibly obviated by the dtypes overhaul in NumPy 2.0. I'd like to see tests here which explicitly pass all of record dtypes, array dtypes, nested combinations, etc. If we provide a
I'm generally against this; there are just too many things that can be coerced to dtypes - and e.g.:
result = st.binary(
    **compat_kw(
        "min_size", max_size=None if dtype.itemsize == 0 else dtype.itemsize
    )
)
Seems like we should generate all `itemsize` bytes if that's not None, rather than variable-length?
totally missed that `Vn` pads to `n` with `\x00`, thanks
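The padding behaviour mentioned here can be checked directly in NumPy. This is a minimal sketch, assuming a 4-byte unstructured void dtype (`V4`) chosen purely for illustration; it is not code from the pull request.

```python
import numpy as np

# A one-element array of 4-byte unstructured void items.
arr = np.zeros(1, dtype="V4")

# Assign a value that is shorter than the item size.
arr[0] = b"ab"

# The stored item always occupies exactly itemsize bytes;
# the unused tail is filled with null bytes.
padded = arr.tobytes()
print(padded)
```

Because shorter inputs are padded up to the item size, a variable-length bytes strategy capped at `itemsize` still produces values that fit the dtype.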
# (except for void, which does not have a default)
cond = lambda _: True if typ is np.void else lambda x: x != type(x)()
x = find_any(from_type(typ), cond)
I'd just skip the `void` case here; as you comment there's no notion of a default value and so this test is not meaningful.
I left it in because this test doubles as a "can draw at all from from_dtype" - that's probably clearer if I split this into two (albeit mostly redundant) tests, and assume-away void in this one.
Let's just stick a top-level `if type is np.void:`/`else:` statement in this test; it'll save a bit of time and still be clear.
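One way to read this suggestion is sketched below. The helper name `make_condition` is hypothetical and only illustrates the branch structure; the real test in the pull request may look different. Note also that in the one-line form quoted earlier, the body of the first `lambda` extends over the whole conditional expression, so an explicit branch avoids that precedence trap.

```python
import numpy as np


def make_condition(typ):
    """Hypothetical helper: build the predicate for a 'non-default value' check."""
    if typ is np.void:
        # void has no default value, so any drawn value is acceptable.
        return lambda _: True
    else:
        # For other scalar types, require a value that differs from
        # the type's zero-argument default, e.g. np.int8() == 0.
        return lambda x: x != type(x)()


int8_cond = make_condition(np.int8)
void_cond = make_condition(np.void)
```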
# Also allow generating undetermined-width dtypes like "S" / "S0"? Possibly with
# a new parameter allow_undetermined?
Weakly against uncapped dtypes; they make sense as an aid to interactive use (eg notebooks) but aren't actually a dtype that an array can have.
On the flipside, they *are* a valid dtype, and I think we should avoid compromising completeness unless there's a reason to. Generating them seems harmless if you're passing directly to `np.array(dtype=)` (equivalent to the largest elem size), and if you're testing a function which accepts generic `np.dtype` then imo that's the place where you most want hypothesis to generate weird dtypes like this
Yeah, I see your point, but I can't think of an API that I like more than writing `| st.just(np.dtype("S"))`! Perhaps we can just document that trick, and explain why/when it's useful?
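A sketch of how that trick might look in a test, assuming the fixed-width dtypes come from `nps.byte_string_dtypes()`. The test name and the sample data are hypothetical.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra import numpy as nps

# Fixed-width byte-string dtypes, plus the undetermined-width "S" dtype.
bytes_dtypes = nps.byte_string_dtypes() | st.just(np.dtype("S"))


@settings(max_examples=25, database=None)
@given(bytes_dtypes)
def check_array_accepts_dtype(dtype):
    arr = np.array([b"a", b"abc"], dtype=dtype)
    # Whatever dtype was requested, the resulting array has a concrete width:
    # for "S" NumPy sizes items to fit the longest element.
    assert arr.dtype.kind == "S"
    assert arr.dtype.itemsize >= 1


check_array_accepts_dtype()
```

The unsized dtype itself reports an item size of zero, which is why an array can never actually carry it.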
(probably) resolves #4039. I did look around for what it would take to support user-defined dtypes, but found...little documentation on the topic in the numpy docs 😅. I found along the way that we don't support the void dtype, so this pull adds support for that too.
I have some local work on accepting dtype-coercible strings in `from_dtype` (`dtype: Union[np.dtype, str]`) - would that change be welcome as well? `nps.from_dtype(np.dtype("int8"))` feels a bit cumbersome, but I also understand wanting to keep the api surface tight.
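For comparison, the same convenience is available today with a small user-side wrapper. The name `from_dtype_like` is hypothetical and not part of Hypothesis; this only sketches what coercing a string before calling `from_dtype` would amount to.

```python
import numpy as np
from hypothesis import given, settings
from hypothesis.extra import numpy as nps


def from_dtype_like(dtype_like, **kwargs):
    """Hypothetical wrapper: coerce to np.dtype, then defer to nps.from_dtype."""
    return nps.from_dtype(np.dtype(dtype_like), **kwargs)


@settings(max_examples=25, database=None)
@given(from_dtype_like("int8"))
def check_int8_values(x):
    # Every generated value fits in a signed 8-bit integer.
    assert -128 <= int(x) <= 127


check_int8_values()
```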