Why create_colocated_worker_cls and spawn #29

Open
eelxpeng opened this issue Nov 28, 2024 · 0 comments

It seems that there are two ways to make two worker groups use the same resource pool:

  1. Separate Workers on Same Resource Pool:
       # Creates two separate worker groups sharing resources
       actor_wg = RayWorkerGroup(resource_pool=resource_pool, ray_cls_with_init=actor_cls)
       critic_wg = RayWorkerGroup(resource_pool=resource_pool, ray_cls_with_init=critic_cls)
  2. Colocated Workers:
       # Creates a single worker group that implements both actor and critic
       cls_dict = {'actor': actor_cls, 'critic': critic_cls}
       ray_cls_with_init = create_colocated_worker_cls(cls_dict)
       wg_dict = RayWorkerGroup(resource_pool=resource_pool, ray_cls_with_init=ray_cls_with_init)

       # Spawns interfaces to access different functionalities
       spawn_wg = wg_dict.spawn(prefix_set=cls_dict.keys())
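
As a rough mental model of the second method, here is a toy sketch that does not use Ray at all. Every name in it (`make_fused_cls`, `PrefixView`, `spawn`, the `Actor` and `Critic` stubs) is hypothetical and only mirrors the shape of the snippet above; it is an assumption about the general idea, not the actual implementation of `create_colocated_worker_cls` or `RayWorkerGroup.spawn`.

```python
# Toy illustration only: all names are hypothetical, not the real library API.

class Actor:
    def update(self, batch):
        return f"actor updated on {batch}"


class Critic:
    def update(self, batch):
        return f"critic updated on {batch}"


def make_fused_cls(cls_dict):
    """Build one class that owns an instance of every role and exposes
    each public method under a '<role>_<method>' name."""
    class Fused:
        def __init__(self):
            self.roles = {name: cls() for name, cls in cls_dict.items()}

    for role, cls in cls_dict.items():
        for meth in dir(cls):
            if meth.startswith("_") or not callable(getattr(cls, meth)):
                continue

            # Keyword-only defaults pin the current role and method name.
            def forward(self, *args, _role=role, _meth=meth, **kwargs):
                return getattr(self.roles[_role], _meth)(*args, **kwargs)

            setattr(Fused, f"{role}_{meth}", forward)
    return Fused


class PrefixView:
    """Per-role handle: 'view.method' forwards to '<prefix>_method' on the fused object."""

    def __init__(self, fused, prefix):
        self._fused = fused
        self._prefix = prefix

    def __getattr__(self, name):
        return getattr(self._fused, f"{self._prefix}_{name}")


def spawn(fused, prefix_set):
    """Return one view per prefix, all backed by the same fused object."""
    return {prefix: PrefixView(fused, prefix) for prefix in prefix_set}


cls_dict = {"actor": Actor, "critic": Critic}
fused = make_fused_cls(cls_dict)()  # one object, standing in for one worker process
views = spawn(fused, cls_dict.keys())
actor_result = views["actor"].update("batch-0")
critic_result = views["critic"].update("batch-0")
```

In this sketch both views forward to the same underlying object, whereas the first method would correspond to constructing `Actor()` and `Critic()` as two independent objects that merely share hardware.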

With the latest main branch of Ray, where the fix ray-project/ray#48088 is merged, the first method should work without any problem and allow two different worker groups to reuse the same GPUs. The second method appears less straightforward, yet the PPO implementation uses it to put all worker groups in the global pool. Is there any specific reason for preferring the second method? Thanks.
