In #72 we automated the observers, states, and dones, but not the actors, because the step function cannot easily be consolidated into a single, simple format. After we do #337, we should have a better idea of how to streamline the action processing.
```python
# Note: This is the theoretical approach we could take to stepping the simulation.
# The first four lines are boilerplate and good to have in this superclass. After
# that, the result is unique to the actor: the output may differ, and what the
# simulation should do with that output also differs. This could be streamlined
# after we do #337. For now, we leave it up to the subclass to implement the
# actors and the step function.
import warnings  # assumed at the top of the module

def step(self, action_dict, **kwargs):
    if not self._warning_issued:
        warnings.warn("It is best practice to implement your own step function.")
        self._warning_issued = True
    # Let every actor process every active agent's action.
    for actor in self._actors:
        for agent_id, action in action_dict.items():
            agent = self.agents[agent_id]
            if agent.active:
                result = actor.process_action(agent, action, **kwargs)
                if result:  # positive result
                    self.rewards[agent_id] += 1
                else:
                    self.rewards[agent_id] -= 0.1
    # Small per-step penalty for every agent that acted.
    for agent_id in action_dict:
        self.rewards[agent_id] -= 0.01
```
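For concreteness, here is a minimal sketch of what an actor subclass might look like under this scheme. Everything in it is hypothetical: the `MoveActor` name, the `process_action` signature with a boolean success contract, and the `agent.position` attribute are assumptions for illustration, not the current API.

```python
import numpy as np

class MoveActor:
    """Hypothetical grid-movement actor (a sketch, not the current API).

    process_action returns True on success so that the generic step
    function above can translate the result into a reward.
    """

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols

    def process_action(self, agent, action, **kwargs):
        # Assume the action is a length-2 array-like of (row, col) deltas
        # and the agent carries a position attribute.
        new_position = agent.position + np.asarray(action)
        if 0 <= new_position[0] < self.rows and 0 <= new_position[1] < self.cols:
            agent.position = new_position
            return True  # Move succeeded; the step function rewards this.
        return False  # Out of bounds; the step function penalizes this.
```

The only contract the superclass would rely on is the return value of `process_action`; what the output means and how the state changes stay actor-specific, which is why the step function is hard to consolidate before #337.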
To be done after #337