[Question] Possible ways to implement "exclusive dictionary" action spaces #594

Closed
SiyuanQi opened this issue Jul 9, 2023 · 10 comments
Labels
question Further information is requested

Comments

@SiyuanQi

SiyuanQi commented Jul 9, 2023

Question

In Gymnasium we can define an action space with two possible actions like below:

import gymnasium

box = gymnasium.spaces.Box(0, 1)
action_space = gymnasium.spaces.Dict({"a": box, "b": box})

where both actions a and b would be executed.

I wonder if there's a way in Gymnasium to implement an "exclusive dictionary" or "discrete dictionary", so that the agent can choose only one of a or b to perform?
In a more general case, a and b can be different composite spaces (and possibly "exclusive dictionaries"), which makes the space hierarchical.

@SiyuanQi SiyuanQi added the question Further information is requested label Jul 9, 2023
@pseudo-rnd-thoughts
Member

This is an interesting idea; we considered adding this, more for PettingZoo, where you could have a variable number of agents acting in a timestep.
This is easy to implement; I'm just uncertain whether it is common enough for us to add it.
@RedTachyon Do you have any thoughts?

@SiyuanQi
Author

> This is an interesting idea; we considered adding this, more for PettingZoo, where you could have a variable number of agents acting in a timestep. This is easy to implement; I'm just uncertain whether it is common enough for us to add it. @RedTachyon Do you have any thoughts?

I think it might be easy to add this space, but how to define an appropriate flattened version of it needs some thought.
I feel there are a lot of use cases for it: basically, whenever an agent can perform only one action at a time, but each possible action takes different parameters.

@RedTachyon
Member

I'm cautiously positive about this, and I feel like this might have a solid mathematical foundation somewhere behind it, much like algebraic data types. We already have product types (tuples and dicts), this would be a sum type. A practical-ish example of a sum space would be something like:

Union(
    Discrete(num_keys),
    Box(..., shape=(2,))
)

which roughly represents what you can do with a keyboard and mouse: press a button or move the mouse (ignoring some domain-specific nuances like the ability to press multiple buttons at once).
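A minimal pure-Python sketch of the sum-space idea described above. This is not Gymnasium API: `DiscreteSketch`, `BoxSketch`, and `UnionSketch` are hypothetical names made up for illustration, and the key count of 104 and the bounds of the mouse movement are arbitrary assumptions. A sample is a pair `(branch_index, value)`, so exactly one branch is active at a time.

```python
import random


class DiscreteSketch:
    """Hypothetical stand-in for a discrete space with n choices."""

    def __init__(self, n):
        self.n = n

    def sample(self, rng):
        return rng.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n


class BoxSketch:
    """Hypothetical stand-in for a bounded continuous space of fixed length."""

    def __init__(self, low, high, size):
        self.low, self.high, self.size = low, high, size

    def sample(self, rng):
        return [rng.uniform(self.low, self.high) for _ in range(self.size)]

    def contains(self, x):
        return (
            isinstance(x, list)
            and len(x) == self.size
            and all(self.low <= v <= self.high for v in x)
        )


class UnionSketch:
    """Sum type over spaces: a sample selects exactly one branch."""

    def __init__(self, *spaces):
        self.spaces = spaces

    def sample(self, rng):
        index = rng.randrange(len(self.spaces))
        return index, self.spaces[index].sample(rng)

    def contains(self, x):
        if not (isinstance(x, tuple) and len(x) == 2):
            return False
        index, value = x
        if not (isinstance(index, int) and 0 <= index < len(self.spaces)):
            return False
        return self.spaces[index].contains(value)


# Keyboard-or-mouse action: press one of 104 keys OR move the mouse by (dx, dy).
keyboard_or_mouse = UnionSketch(DiscreteSketch(104), BoxSketch(-1.0, 1.0, 2))
rng = random.Random(0)
samples = [keyboard_or_mouse.sample(rng) for _ in range(5)]
```

Because `UnionSketch` exposes the same `sample` and `contains` methods as its branches, it can itself be used as a branch of another `UnionSketch`, which covers the hierarchical case raised in the original question.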

To have it a bit more grounded, could you share your specific application? Like where would this actually be useful?

Flattening would probably be fine if handled the same way as for Tuple spaces; it can't be bijective anyway.

@SiyuanQi
Author

SiyuanQi commented Jul 10, 2023

I think the keyboard & mouse example is a good example, especially for different structures of exclusive actions.
As another specific example (that might be less heterogeneous), say we are controlling a unit in Starcraft. We can either "move" or "attack" at a certain time step, but we would likely choose different coordinates for different actions. We can include more actions if the unit has more capabilities, but the unit might be allowed to do only one action at a time.

@pseudo-rnd-thoughts
Member

pseudo-rnd-thoughts commented Jul 10, 2023

My only issue is the name: Union generally implies that only one of some options is selected, where I understood exclusive to mean a couple of the options can be used at the same time. I proposed Container previously, but I don't think it is a much better name.

@SiyuanQi Would you be interested in making a PR for this?

@RedTachyon
Member

> where I understood exclusive to mean a couple of the options can be used at the same time.

Wait, isn't that exactly the opposite of what exclusive means? My understanding was that the whole idea is that you can only choose one branch.

@pseudo-rnd-thoughts
Member

That is a good point; ignore tired Mark.

@SiyuanQi
Author

I am interested in making a PR for this, but I still feel the flattening is worth more discussion. Some questions I don't have a clear answer to yet: how do we distinguish it from an ordinary dictionary in the flattened version? How do we tell from the flattened vector whether an action was chosen or not? Maybe we can simply add a Discrete to the space to represent the choice, but it might not be the best solution.
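One way to picture the "add a Discrete" idea is a flattened vector whose first entry stores the chosen branch and whose remaining entries hold that branch's values, zero-padded to the width of the widest branch. The following is a sketch of one possible encoding under that assumption, not a description of what Gymnasium's `flatten` does; `flatten_choice`, `unflatten_choice`, and the branch widths 2 and 3 are hypothetical.

```python
def flatten_choice(index, values, widths):
    """Encode (branch index, branch values) as one fixed-length list.

    widths[i] is the flattened length of branch i (an assumption of this sketch).
    """
    if len(values) != widths[index]:
        raise ValueError("values do not match the selected branch")
    padding = [0.0] * (max(widths) - len(values))
    return [float(index)] + [float(v) for v in values] + padding


def unflatten_choice(flat, widths):
    """Recover (branch index, branch values), discarding the padding."""
    index = int(flat[0])
    return index, flat[1:1 + widths[index]]


# Hypothetical branches: branch 0 takes 2 numbers, branch 1 takes 3.
widths = [2, 3]
flat_first = flatten_choice(0, [0.25, 0.75], widths)
flat_second = flatten_choice(1, [0.1, 0.2, 0.3], widths)
```

With this encoding every valid action maps to a distinct vector of the same length, but not every vector of that length decodes to a valid action (for example, a first entry of 0.5), so the mapping is not bijective.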

@pseudo-rnd-thoughts
Member

I think flatten_space(Union) -> Union, i.e. it doesn't change the space. The problem I see is more with the batch_space(Union) function: how do we stack observations if the samples come from two different subspaces?
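To make the stacking problem concrete: samples from different branches can have different lengths, so they cannot be stacked into one rectangular array directly. One possible workaround, sketched here in plain Python and not taken from how `batch_space` is implemented, is to group the batch by selected branch and remember each sample's original position. `group_by_branch` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_branch(samples):
    """Split a batch of (index, values) samples into per-branch groups.

    Returns {branch_index: (positions, rows)}, where positions[i] is the
    position in the original batch of rows[i].
    """
    groups = defaultdict(lambda: ([], []))
    for position, (index, values) in enumerate(samples):
        positions, rows = groups[index]
        positions.append(position)
        rows.append(list(values))
    return dict(groups)


# A batch of three samples: two from branch 0, one from branch 1.
batch = [(0, [0.1, 0.2]), (1, [0.3, 0.4, 0.5]), (0, [0.6, 0.7])]
grouped = group_by_branch(batch)
```

Within each group all rows have the same length, so each branch can be stacked on its own, and the stored positions allow the results to be scattered back into batch order.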

@pseudo-rnd-thoughts
Member

The OneOf space, which is an exclusive dictionary space, was added in #812.
