Add a OneOf space for exclusive unions #812

Merged Mar 11, 2024 (15 commits)


@RedTachyon (Member) commented Dec 4, 2023

See #594

This is more of an exclusive tuple, but effectively the same thing.

I opted for the name OneOf as opposed to Union, sacrificing some math-notation-niceness, but making it more immediately obvious what it is. Plus I think gRPC uses oneof similarly.

Some design choices might be controversial:

  • A sample in this space has to contain the index of the subspace, as well as the proper sample. This is to disambiguate spaces like OneOf(Discrete(5), Discrete(5))
    • This makes mathematical sense - elements of a disjoint union of sets have to contain the index, and this is the best mathematical model for what we're doing
    • Supporting repeated subspaces is imo very important, otherwise cases like OneOf(Discrete(n), Discrete(m)) need special treatment when n==m, and nobody likes that
  • It flattens to a box of size max_size + 1, where max_size is the maximum size of any subspace. The 1 is the index, and it comes at the beginning of the representation
  • Flattened samples are filled with nans if they don't use up all the space
    • Because of this, a Box can now be nullable, which is used when checking belonging to a space
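The flattening described in the bullets above can be sketched in plain Python. This is a toy model, not the code from this PR: `flatten_one_of`, `unflatten_one_of`, and the `flat_sizes` argument are hypothetical names, and it assumes a Discrete(5) flattens to a one-hot vector of length 5 and a Box(-1, 1, (2,)) to 2 floats.

```python
import math


def flatten_one_of(index, flat_sample, flat_sizes):
    """Toy flattening: [index, *flat_sample, nan padding].

    index: which subspace the sample came from
    flat_sample: the already-flattened sample of that subspace
    flat_sizes: flattened size of every subspace (hypothetical input)
    """
    if len(flat_sample) != flat_sizes[index]:
        raise ValueError("sample does not match the chosen subspace")
    max_size = max(flat_sizes)
    # Unused slots are filled with nan, as proposed in the PR description.
    padding = [math.nan] * (max_size - len(flat_sample))
    return [float(index)] + list(flat_sample) + padding


def unflatten_one_of(flat, flat_sizes):
    """Recover (index, flat_sample) by reading the index and dropping padding."""
    index = int(flat[0])
    return index, flat[1 : 1 + flat_sizes[index]]


# Stand-in for OneOf(Discrete(5), Box(-1, 1, (2,))): flattened sizes 5 and 2.
sizes = [5, 2]
flat_box = flatten_one_of(1, [0.3, -0.2], sizes)
flat_disc = flatten_one_of(0, [0.0, 0.0, 0.0, 1.0, 0.0], sizes)
```

Both results have length max_size + 1 = 6, with the index in the first slot, so samples from different subspaces share one fixed-size representation.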

Overall I think this is the best combination of questionable choices, but I'm not super attached to them if someone has better options.

I also think this is useful to actually include for two reasons:

  • Practical - it enables things like "move a joystick or press a key"
  • Mathematical - with this, we're kinda emulating type theory with product (Tuple) and sum (OneOf) types. Maybe we'll see a math nerd use it for some cool theoretical results.

@pseudo-rnd-thoughts (Member)

Spending some time thinking about this new space, I'm not sure about adding it.
This is where I would like us to have a gymnasium-contrib for random extra features that should not be core but could be helpful to users.
I say this as this space seems to me very rare in use, and for the environment it makes it difficult to use as it can be unclear what option was taken.

Therefore, I would currently be in favour of closing this PR and not merging, but if I have missed something, please say.

@RedTachyon (Member, Author)

for the environment it makes it difficult to use as it can be unclear what option was taken

I don't think that's true? By design, it includes the index of the subspace. So in a space OneOf(Discrete(5), Box(-1, 1, (2,)), Discrete(3)), samples from each respective subspace would be (0, 3), (1, [0.3, -0.2]), (2, 1), with no ambiguity.
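To make the "no ambiguity" point concrete, here is a minimal sketch of an index-tagged membership check. `ToyOneOf`, `discrete`, and `box` are hypothetical stand-ins written for this illustration; they are not the Gymnasium classes and only model the idea that the index selects which subspace validates the value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


def discrete(n):
    """Membership test standing in for Discrete(n)."""
    return lambda v: isinstance(v, int) and 0 <= v < n


def box(low, high, size):
    """Membership test standing in for a 1-D Box(low, high, (size,))."""
    return lambda v: (
        isinstance(v, (list, tuple))
        and len(v) == size
        and all(low <= x <= high for x in v)
    )


@dataclass(frozen=True)
class ToyOneOf:
    # One membership predicate per subspace, in order.
    members: Sequence[Callable[[Any], bool]]

    def contains(self, sample) -> bool:
        # A sample must be an (index, value) pair.
        if not (isinstance(sample, tuple) and len(sample) == 2):
            return False
        index, value = sample
        if not isinstance(index, int) or not 0 <= index < len(self.members):
            return False
        # Only the subspace named by the index gets to judge the value.
        return self.members[index](value)


space = ToyOneOf([discrete(5), box(-1.0, 1.0, 2), discrete(3)])
```

With this model, (0, 3) is valid while (2, 3) is not, even though the bare value 3 is the same in both: the index decides which subspace the value is checked against.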

As this space seems to me very rare in use

There are two things here. Is it currently rarely in use? Yes, because it doesn't exist. People will naturally gravitate to things that exist and are supported.

If we add it, will it still be rarely in use? Probably to some extent, same as the other "weird" spaces like graph or text. But I think it's likely that it will see some interesting use - certain video games (StarCraft), desktop environments, maybe even some industrial simulations. It's basically the generic version of "You can do A or B", with A or B having some internal structure.

Finally, it really makes sense in terms of theory. Ask any programming language theory nerd, they'll tell you that algebraic datatypes are the best thing ever. Every time you write TypeA | TypeB, i.e. a sum type, that's a OneOf space. Pattern matching, which was recently introduced in Python, is basically a utility for sum types (if it's a string, do this, and if it's a float, do that).

You could get more creative and use it for observations - resizeable observations, custom in-environment error handling, active perception (do you want to read some in-game text, or look around?).

My point with research tooling is always that it should be as flexible as reasonably possible. I don't think adding this space has significant downsides or that much maintenance overhead, but it adds a ton of flexibility, and allows expressing things that are very natural to think about.

A reasonable contrib-y middle ground might be a new experimental module? Then over time, we'd either let it chill there, fizzle out, or merge it in the main code if people use it. But tbh I'd still prefer to just add it as a "regular" supported space, it just makes too much sense.

@daniel-redder commented Jan 12, 2024

Hello, I am a member of The University of Georgia's THINC Lab, which focuses on RL, IRL, and Robotics. Our work (see article, AI Mag Article) discusses Open Task environments.

These are environments where groups of states (and associated actions) change dynamically in the environment. An example we reference is a RideShare domain:

[image: RideShare domain]

where each passenger constitutes a task with its own associated actions.

A use case for the OneOf space could be to handle making non-task-specific actions when handling the dynamic nature of an open-task environment. A good example of this type of global action is NOOP.

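from gymnasium.spaces import Dict, Discrete

# `tasks` and `B` are assumed to be defined elsewhere in the environment
d = Dict(
    {
        "taskID": Discrete(len(tasks)),
        "taskActions": Discrete(len(B)),
    }
)

c = Discrete(2)  # NOOP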
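The snippet above stops before showing how d and c would be combined. Assuming they end up as the two subspaces of a OneOf, with d first and c second, an environment might dispatch on the index roughly as follows. `apply_action`, the return tuples, and the passenger names are hypothetical and only illustrate the task-specific vs. global split.

```python
def apply_action(action, tasks):
    """Dispatch an (index, value) action; all names here are illustrative."""
    index, value = action
    if index == 0:
        # Task-specific branch: value mirrors the Dict space `d`.
        passenger = tasks[value["taskID"]]
        return ("task", passenger, value["taskActions"])
    if index == 1:
        # Global branch: value comes from `c` and is not tied to any task.
        return ("global", None, value)
    raise ValueError(f"unknown subspace index: {index}")


tasks = ["passenger_a", "passenger_b"]
task_step = apply_action((0, {"taskID": 1, "taskActions": 2}), tasks)
noop_step = apply_action((1, 0), tasks)
```

The global branch never has to name a task, which is the property the NOOP example relies on.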

@pseudo-rnd-thoughts (Member) left a comment


Reading through the code, I understand much more what is happening.
With a couple of changes, I think we can get this merged quickly.

The primary change I made was to flatten such that the fill value is the flat_sample's first value, removing the need for nan support. As a result, nullable can be removed from Box.

I added comments to the rest of the PR with proposed changes (along with removing nullable from Box).

Finally, you need to add the following to gymnasium/wrappers/utils:

@create_zero_array.register(OneOf)
def _create_one_of_zero_array(space: OneOf):
    return 0, create_zero_array(space.spaces[0])
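The `.register(OneOf)` decorator suggests a type-dispatched function, presumably in the style of `functools.singledispatch`. The sketch below shows that pattern with toy classes. `ToyDiscrete`, `ToyOneOf`, and `create_zero_value` are hypothetical names, and the zero value for a discrete space is assumed to be 0.

```python
from functools import singledispatch


class ToyDiscrete:
    def __init__(self, n):
        self.n = n


class ToyOneOf:
    def __init__(self, spaces):
        self.spaces = tuple(spaces)


@singledispatch
def create_zero_value(space):
    # Fallback for types that have no registered handler.
    raise TypeError(f"no zero value registered for {type(space).__name__}")


@create_zero_value.register(ToyDiscrete)
def _create_discrete_zero_value(space):
    return 0


@create_zero_value.register(ToyOneOf)
def _create_one_of_zero_value(space):
    # Pick subspace 0 and recurse, mirroring the snippet in the review.
    return 0, create_zero_value(space.spaces[0])


flat_zero = create_zero_value(ToyOneOf([ToyDiscrete(5), ToyDiscrete(3)]))
nested_zero = create_zero_value(ToyOneOf([ToyOneOf([ToyDiscrete(2)])]))
```

Always choosing index 0 gives a deterministic, valid sample without needing any notion of a "zero" subspace.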

gymnasium/spaces/oneof.py (outdated, resolved)
This method draws independent samples from the subspaces.

Args:
mask: An optional tuple of optional masks for each of the subspace's samples,


Should we support masking of the subspace index? This adds more complexity that might not be needed.

gymnasium/spaces/utils.py (outdated, resolved)
gymnasium/spaces/utils.py (outdated, resolved)
gymnasium/vector/utils/space_utils.py (outdated, resolved)
@@ -271,7 +271,7 @@ def _flatten_oneof(space: OneOf, x: tuple[int, Any]) -> NDArray[Any]:
     max_flatdim = flatdim(space) - 1  # Don't include the index
     if flat_sample.size < max_flatdim:
         padding = np.full(
-            max_flatdim - flat_sample.size, np.nan, dtype=flat_sample.dtype
+            max_flatdim - flat_sample.size, flat_sample[0], dtype=flat_sample.dtype
@RedTachyon (Member, Author)


I'm now wondering if it might be better to use np.empty instead if we're not using a dedicated placeholder value.

Either way, the array will be filled with some throwaway data, but with empty we're not pretending that this data has any actual meaning.


I ran a quick check and this fails test_flat_space_contains_flat_points, as the flattened version might not be contained within the flattened space.
But this is just a weird artifact of flatten rather than anything actually incorrect with the approach.
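The trade-off between the fill values discussed in this thread can be illustrated with a toy bounds check. This is a simplified model with a single scalar bound for every slot, not the actual test or the real flattened space; `in_box` and `pad` are hypothetical helpers, and the large constant merely stands in for whatever an uninitialised buffer might hold.

```python
import math


def in_box(flat, low, high):
    """Toy containment check: every entry must lie within [low, high]."""
    return all(low <= v <= high for v in flat)


def pad(flat_sample, size, fill):
    """Right-pad a flattened sample to a fixed size with a fill value."""
    return list(flat_sample) + [fill] * (size - len(flat_sample))


sample = [0.3, -0.2]

# nan fails every comparison, so a plain bounds check rejects it.
nan_padded = pad(sample, 5, math.nan)

# Reusing the sample's first value keeps the padding inside the bounds.
first_padded = pad(sample, 5, sample[0])

# Stand-in for uninitialised memory: an arbitrary out-of-range value.
garbage_padded = pad(sample, 5, 1e30)
```

In this model only the padding that reuses the first value passes the containment check, which matches the reasoning for dropping nan support, and arbitrary padding can fall outside the bounds in the way the failing test suggests.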

@pseudo-rnd-thoughts merged commit 2b2e853 into main Mar 11, 2024
16 checks passed
@pseudo-rnd-thoughts deleted the oneof-space branch March 11, 2024 18:26
3 participants