-
-
Notifications
You must be signed in to change notification settings - Fork 891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Box casting and sampling edge-cases #774
Fix Box casting and sampling edge-cases #774
Conversation
@pseudo-rnd-thoughts Looks like casting |
@Jammf That seems pretty bad. Before we make a change, can we be precise on the current functionally about this.
If you can think of any more relevant questions please add, would like to not mess up the Box space for old users where there isn't an issue already. |
@Jammf how are you casting >>> import numpy as np
>>> int(np.inf)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: cannot convert float infinity to integer
>>> np.inf.astype(np.int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'astype' |
@Kallinteris-Andreas The relevant code is in Gymnasium/gymnasium/spaces/box.py Lines 322 to 329 in 8333df8
(There's also similar logic for casting+broadcasting scalar It looks like With just numpy's casting, since >>> import numpy as np
>>> a = np.array(np.inf)
>>> a.dtype
dtype('float64')
>>> a.astype(np.int16)
<stdin>:1: RuntimeWarning: invalid value encountered in cast
array(-1, dtype=int16)
>>> a.astype(np.int16, casting="safe")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Cannot cast scalar from dtype('float64') to dtype('int16') according to the rule 'safe' |
@pseudo-rnd-thoughts I spent some time looking into it the past few days, and it does seem pretty messy. I tried to condense the issues down to just the main points, but let me know anything doesn't make sense. Core issues:
Other related discussions are at
Current behaviorI made some charts to capture the current casting behavior in
Also, Exponential sampling is replaced with Normal if both high and low are unbounded, but I didn't include those cases. Normal sampling should lead to valid samples for floats and sints, and have a 50% chance of generating an invalid value for uints. Sampling from something as biased as a standard normal or exponential distribution doesn't seem ideal, but that's really a separate discussion. As for potential fixes, I have a few ideas ordered by how much it'd break existing code. There's probably other solutions I missed, or mixes of different solutions would work. And of course, there's always the option to just do nothing. Proposal 1We could change uints/sints to be like how inf sints are currently handled, add clipping to sampling to prevent invalid samples, and raise errors for clearly nonsensical values (having nans, or
Proposal 2We could change infs and out-of-range ints to be like they were before d1f35fe, making it so they're always bounded.
Proposal 3We could raise an error if an out-of-range uint/sint is passed to
Proposal 4We could make Box a continuous space only, plan to deprecate non-float Box, and redirect to the proper discrete spaces (MultiDiscrete and MultiBinary) instead. This would break existing pixel-observation envs that uses Box to encode image observations (and most of them probably do use Box), and would make a lot of tutorials be incorrect. |
@Jammf Thank you for your highly detailed comment, this is as complex as I feared I found the proposals a bit confusing so used your tables and some testing to propose what I think is the optimal solution that makes the most sense (and preserves backward compatibility) Proposed (optimal) solutionLooking at your proposals this is a bit different from all of yours with my primary change being to raise more errors in cases where sampling will most likely not make sense / work for users and not change anything anywhere else. What are your thoughts? Edit: the error raising would happen within
The exponential sampling doesn't make much sense as the exponential distribution values (with scale=1.0) are so small compared to the range of values. However, due to backward compatibility, I don't think we can change this. We could add clipping at the end to ensure that samples are within the bounds but this is very unlikely, only possible with extreme values of one bound being inf and the other being very close to the bound value. @Kallinteris-Andreas @jjshoots @RedTachyon do any of you have additional comments or thoughts? |
I like @pseudo-rnd-thoughts's approach, makes it very verbose and doesn't silently allow the user to do something that the Box space was not supposed to do. |
@jjshoots apologies I didn't put in my comment that the error raising would happen all within the |
Foremost, the most important purposes of the spaces if to inform the learning algorithm what its observation/action spaces are, and secondly to sample an action space.
[1] as @JJshots mentioned, this should be an error.
To address this, we could add an optional argument (example name |
I can't quite remember why I did min+2/max-2, but it was fairly important for some other part of the Box space. Perhaps we'll figure out why that's needed when we remove it and see if tests fail. |
I like the idea of raising errors for For For unsigned I also agree with @Kallinteris-Andreas that we should either have infs for both For sampling, I think that geometric is more correct conceptually than rounding an exponential sample. But it would also break backward compatibility. But if users need that precise of reproducibility, then they'll probably be pinning their gymnasium version anyway, so perhaps it's not that much of an issue? I also like the idea of having adding |
I suspect that the MAX-2 and MIN-2 was to account for I believe we are very close to an agreed change to
This adds the scale implicitly and removes unsigned integers in-range / inf and still uses exponential rather than geometric. For the geometric vs exponential, there seems to be minimal difference between the distribution (someone can correct but I believe that exponential is the continuous version of geometric distribution). So I'm not sure it makes sense to change Looking at the tests that will need changing, could we add more testing for non-standard distributions, I would call the scaling variable, |
What is the reason for this choice Also, I do not think (other than that I agree) |
I found relatively few usages of bool Box in a search of public GitHub repos, so it'd probably only have minor impact to remove it or redirect to MultiBinary. (Though, the link only has single-line usages, I wasn't able to figure out how to get a multi-line regex to work nicely) |
Crap thought I was looking at a different PR. My bad. |
I will try to finish this PR by the end of the week. As the technical details are complex, I've decided to start with the tests to outline our expectations of what the space should be. This doesn't cover the sampling currently but that hasn't changed. I found it easier to rewrite most of the tests to order them into different groupings: shape, dtype, low/high and infinite space I will work on implementing the changes this evening hopefully |
Over a month later, I've finished this PR by spending 5 hours on this and rewriting most of the Box init function. A large amount of the function is processing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine from a first pass, i will have to do a second (more detailed) review pass
gymnasium/spaces/box.py
Outdated
assert ( | ||
high.shape == shape | ||
), f"high.shape doesn't match provided shape, high.shape: {high.shape}, shape: {shape}" | ||
# most of the code between these two functions are copied with minor changes for < and > |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the comment for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next two function calls, I can remove it
gymnasium/spaces/box.py
Outdated
|
||
super().__init__(self.shape, self.dtype, seed) | ||
|
||
def _cast_low(self, low, dtype_min): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like a docstring here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Description
Changes Box to raise an error during init if either
low
orhigh
are outside the limits of the Box's dtype. Values that arenp.inf
,-np.inf
, ornp.nan
are ignored.Also changes Box's sampling to prevent an edge-case. Having a
low
value near a dtype's max int and an unboundedhigh
value (or vice versa) would lead to an integer overflow (underflow) and invalid sampled values. Now the values are clipped before casting to ensure they're within the limits of the Box's dtype. Fornp.int64
, it also clips again after casting, since the cast itself can cause the value to go outside the Box bounds for extreme values.I also had to change a unit test for the DtypeObservation wrapper, since CartPole's observation space now cannot be cast to
np.int32
due to it using large (but finite) bounds outside thenp.int32
range.Fixes #768
Type of change
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)I have made corresponding changes to the documentation