-
-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid silent integer downcast in TimeDelta.deserialize #2654
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR ❤️ This seems like the right behavior.
Lgtm but I'll give @lafrech some time to take a look in case I'm not thinking of some case where end users might rely on the current behavior.
@@ -741,7 +748,7 @@ def test_timedelta_field_deserialization(self): | |||
assert isinstance(result, dt.timedelta) | |||
assert result.days == 0 | |||
assert result.seconds == 12 | |||
assert result.microseconds == 0 | |||
assert result.microseconds == 900000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 This was explicitly asserting that microseconds were lost. I'm not sure why, though
[I enabled issues again. The spam streak is over on flask-smorest. Let's hope it is here too.] I don't have a full understanding of the issue but in case it went unnoticed, here are related issues/PRs:
Also, the docstring should be updated if we merge. |
Hi @lafrech 👋 You're right, the docstring needed some work. Please see the last commit. I hope that also clarifies the bug that this PR fixes: needlessly using |
ea0509a
to
0755fe1
Compare
Docs update looks good to me. Thanks @ddelange . Thanks for digging up the original issue and PRs @lafrech. Looking back at those, I think allowing truncation to an integer was an oversight on my part rather than an explicit decision. Even the ISO8601 spec mentioned in #105 (comment) allows for fractional components:
@lafrech @deckar01 Do we consider this a breaking change? Even though this is a much smaller change than previous major version bumps, I'm not opposed to a major version bump with some explanatory text in the changelog that says that the change is very limited in scope and only affects users of the |
fwiw, this change only affects (helps) users that were serializing using I would say this does not warrant a major version bump, but rather a bugfix release. |
It would also affect users who didn't explicitly pass |
right. the good ol' rely on a bug motto. programming sucks... |
lol, yeah... i guess i'm not 100% sure it's a bug, given that there was an explicit test for the current behavior. i'm like 80% sure it was oversight, but it was long ago so want to give a bit of time for others to chime in before shipping the change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure.
One of our concerns when designing a field is how values round-trip.
Current behaviour lets the user decide whether the serialized version should be int
or float
, and considers it is being given serialized values accordingly. If values are serialized as int
, then the type that is expected as input when deserializing is int
.
The original name of the parameter was serde_type
, not very nice but showing it works both ways.
Before this PR:
# TimeDelta(serialization_type=int)
12 -> timedelta(seconds=12) -> 12
12.42 -> timedelta(seconds=12) -> 12
# TimeDelta(serialization_type=float)
12 -> timedelta(seconds=12) -> 12.0
12.42 -> timedelta(seconds=12, microseconds=420000) -> 12.42
After this PR:
# TimeDelta(serialization_type=int)
12 -> timedelta(seconds=12) -> 12
12.42 -> timedelta(seconds=12, microseconds=420000) -> 12
# TimeDelta(serialization_type=float)
12 -> timedelta(seconds=12) -> 12.0
12.42 -> timedelta(seconds=12, microseconds=420000) -> 12.42
From this perspective, I'm not sure the latter is less surprising behaviour.
The PR makes the field a bit more permissive on inputs, accepting floats when expecting integers. This is not in line with Integer
field which truncates floats.
If floats are to be expected, then serialization_type=float
should be used.
If the issue is that TimeDelta
defaults to int
then maybe the default could be changed in next major version. Until then, nothing prevents the user to specify float
, right?
I might be totally mistaken, in which case please ignore.
@@ -1526,7 +1528,7 @@ def _serialize(self, value, attr, obj, **kwargs): | |||
|
|||
def _deserialize(self, value, attr, data, **kwargs): | |||
try: | |||
value = self.serialization_type(value) | |||
value = float(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that when deserializing an int
this introduces an error due to floating point precision?
Like float(42)
-> 42.0000000000000000001
Answering to myself, no because this will be rounded in timedelta:
>>> field.deserialize(12.9000000001)
datetime.timedelta(seconds=12, microseconds=900000)
Allow (de)serialization to `float` through use of a new `serialization_type` parameter. | ||
`int` is the default to retain previous behaviour. | ||
Allow serialization to `float` through use of a new `serialization_type` parameter. | ||
Defaults to `int` for backwards compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a new versionchanged line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never truncate a `float` to `int` during deserialization.
too concise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe
Never truncate a `float` to an `int` before deserializing it into a :class:`datetime.timedelta`.
Thanks for your elaborate review!
To me the latter definitely looks 'right'. From another perspective: that float came from marshmallow (starting off with a timedelta). It got serialized and sent elsewhere for deserialization. When the receiving party deserializes, in the latter example he doesn't need prior knowledge about the sender's settings. Deserialization has now become robust. |
Is there anything wrong with I mean does the change in this PR allow a use case that can't be achieved with current parameters? Or is the only benefit to make it the default choice? |
re: anything wrong - my argument here is that serialization_type should be used only for serialization and not for deserialization. but that's about it :) |
So would it be fine if the parameter was named If it is just that, we could use a better naming and deprecate the old one. |
from my side there's no need for a breaking change here, this change just seemed like a nice robustness improvement to me (not considering this breaking). it only came up because a user was surprised that his float was being silently truncated when feeding into @sloria's environs package. since that package only ever deserializes (values from unknown sources), this seemed like the right way forward. but feel free to close / take over if you do consider this a 'bad' change, no hard feelings :) will look uglier than now, but environs could fix this without any PR om marshmallow by subclassing |
This PR allows propagating floats/strings with decimals to TimeDelta.deserialize without silently downcasting the input to int. Currently,
serialization_type
is not only being used for serialization, but also being used for deserialization, which adds no added value: a float cast can always safely be used for deserialization and is backwards compatible (but fixes this bug).originally reported here: sloria/environs#372
MRE:
After this PR:
I added one test for string input, and changed the existing test for float input.
There are already
math.isclose
tests when passing floats to microseconds (which will be rounded by the datetime.timedelta constructor).