Implement a multi-objective pistonball environment #10

Merged: 8 commits, Dec 8, 2023

@wilrop wilrop commented Nov 6, 2023

I implemented a multi-objective version of Pistonball. This essentially boils down to separating the three components of the original reward function and exposing these as a vector reward instead.
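
To illustrate the idea (this is not the code from the PR), here is a minimal sketch of exposing separate reward components as a vector. It assumes the component order [global, local, time] taken from the `reward_dim` comment in the reviewed diff. The function names, the example values, and the weights are hypothetical.

```python
from typing import List, Sequence


def vector_reward(global_r: float, local_r: float, time_r: float) -> List[float]:
    """Return the components as a vector in the order [global, local, time]."""
    return [global_r, local_r, time_r]


def scalarize(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a vector reward into a scalar with a weighted sum.

    The weights are an assumption for illustration; they are not the
    coefficients used by the original Pistonball reward function.
    """
    if len(reward) != len(weights):
        raise ValueError("reward and weights must have the same length")
    return sum(w * r for w, r in zip(weights, reward))


# Hypothetical component values for a single step of a single piston.
r = vector_reward(global_r=2.0, local_r=0.5, time_r=-0.25)

# A single-objective learner can still be fed a scalar if needed.
total = scalarize(r, weights=[1.0, 1.0, 1.0])
```

Keeping the components separate leaves the trade-off between objectives to the learning algorithm or the user rather than fixing it inside the environment.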

@wilrop wilrop requested review from ffelten and umutucak and removed request for ffelten November 6, 2023 16:10

@ffelten ffelten left a comment

Looks good, only one minor comment

)
self.reward_dim = 3 # [global, local, time]
self.reward_spaces = {
f"piston_{i}": Box(low=-np.inf, high=np.inf, shape=(self.reward_dim,), dtype=np.float32)

Isn't it possible to have better bounds on this?

wilrop commented Nov 24, 2023

I've added more informative reward bounds, with documentation of how I obtained them. It would be good if someone could briefly check that they make sense. I also ran random policies with 50 different seeds to verify that the rewards stay within the specified bounds, and everything seemed okay.
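
A check of this kind can be sketched as follows. This is a hedged illustration, not the script used for the PR: `fake_reward` is a stand-in for stepping the environment with a random policy, and the `LOW` and `HIGH` values are placeholders rather than the bounds added in this PR. Only the number of seeds (50) comes from the comment above.

```python
import random
from typing import Callable, List, Sequence, Tuple


def within_bounds(reward: Sequence[float], low: Sequence[float], high: Sequence[float]) -> bool:
    """True if every reward component lies inside its [low, high] interval."""
    return all(lo <= r <= hi for r, lo, hi in zip(reward, low, high))


def check_bounds(
    sample_reward: Callable[[random.Random], Sequence[float]],
    low: Sequence[float],
    high: Sequence[float],
    seeds: Sequence[int],
    steps_per_seed: int,
) -> List[Tuple[int, int, Tuple[float, ...]]]:
    """Collect (seed, step, reward) for every reward that falls outside the bounds."""
    violations = []
    for seed in seeds:
        rng = random.Random(seed)  # one independent generator per seed
        for step in range(steps_per_seed):
            reward = sample_reward(rng)
            if not within_bounds(reward, low, high):
                violations.append((seed, step, tuple(reward)))
    return violations


# Placeholder bounds in the order [global, local, time]; NOT the PR's values.
LOW = (-1.0, -1.0, -0.1)
HIGH = (1.0, 1.0, 0.0)


def fake_reward(rng: random.Random) -> Tuple[float, float, float]:
    """Hypothetical stand-in for one environment step under a random policy."""
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0), -0.1)


violations = check_bounds(fake_reward, LOW, HIGH, seeds=range(50), steps_per_seed=100)
```

An empty `violations` list means no sampled reward left the declared bounds. Random rollouts can only fail to find a counterexample, so a check like this complements the documented derivation of the bounds rather than replacing it.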

@wilrop wilrop merged commit 0b83951 into main Dec 8, 2023
5 checks passed
@wilrop wilrop deleted the mo-pistonball branch February 1, 2024 15:10
2 participants