[Feature request]: consolidate "jobs" and "slots"/"chains" for simulation / projection #395

Open
pearsonca opened this issue Nov 11, 2024 · 3 comments
Assignees
Labels
batch Relating to batch processing. low priority Low priority.

Comments

@pearsonca
Contributor

Label

meta/workflow

Priority Label

low priority

Is your feature request related to a problem? Please describe.

User interface item. Currently, simulation / projection takes both a jobs argument (parallelization) and a slots/chains argument, and it is unclear in what sense the latter is used (samples? parallelization? something else?).

Is your feature request related to a new application, scenario round, pathogen? Please describe.

No response

Describe the solution you'd like

Eliminate (or otherwise clarify) the point of this argument.

@TimothyWillard
Contributor

TimothyWillard commented Nov 11, 2024

GH-394 is still a work in progress, but I believe the gempyor.batch.JobSize class in that PR addresses, or at least provides a start for, this issue. See:

@dataclass(frozen=True, slots=True)
class JobSize:
    """
    A batch submission job size.

    Attributes:
        jobs: The number of jobs to use.
        simulations: The number of simulations to run per a block.
        blocks: The number of sequential blocks to run per a job.

    Raises:
        ValueError: If any of the attributes are less than 1.
    """

    jobs: int
    simulations: int
    blocks: int

    def __post_init__(self) -> None:
        for p in self.__slots__:
            if (val := getattr(self, p)) < 1:
                raise ValueError(
                    (
                        f"The '{p}' attribute must be greater than 0, "
                        f"but instead was given '{val}'."
                    )
                )

    @classmethod
    def size_from_jobs_sims_blocks(
        cls,
        jobs: int | None,
        simulations: int | None,
        blocks: int | None,
        inference_method: Literal["emcee"] | None,
    ) -> "JobSize":
        """
        Infer a job size from several explicit and implicit parameters.

        Args:
            jobs: An explicit number of jobs.
            simulations: An explicit number of simulations per a block.
            blocks: An explicit number of blocks per a job.
            inference_method: The inference method being used as different methods have
                different restrictions.

        Returns:
            A job size instance with either the explicit or inferred job sizing.
        """
        if inference_method == "emcee":
            return cls(jobs=jobs, simulations=blocks * simulations, blocks=1)
        return cls(jobs=jobs, simulations=simulations, blocks=blocks)

@TimothyWillard TimothyWillard added batch Relating to batch processing. low priority Low priority. labels Nov 11, 2024
@pearsonca
Contributor Author

Seems likely - there will also need to be some adaptation at the dispatch stage, as these arguments get passed several layers deep from the simulate method.

Part of the problem seems to be that two considerations are being muddied together: how much parallelization to use, and how many independent inference processes to run. These are theoretically orthogonal but in practice almost always the same, and the resulting confusion has propagated to places where it doesn't actually apply (projection shouldn't know about inference chains; it probably needs to know about parallelization and maybe samples?).
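
One way to picture that separation is to give inference and projection different settings objects, so that the chain count never reaches projection. The sketch below is purely hypothetical: `Parallelism`, `InferenceSettings`, `ProjectionSettings`, and their fields are illustrative names and not part of gempyor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parallelism:
    """How much work may run at once (hypothetical name)."""

    workers: int


@dataclass(frozen=True)
class InferenceSettings:
    """Inference needs both: chains are a statistical choice, workers a compute one."""

    parallelism: Parallelism
    chains: int


@dataclass(frozen=True)
class ProjectionSettings:
    """Projection knows about parallelism and samples, but not about chains."""

    parallelism: Parallelism
    samples: int


# The same compute budget can be shared without sharing inference concepts.
shared = Parallelism(workers=8)
inference = InferenceSettings(parallelism=shared, chains=8)
projection = ProjectionSettings(parallelism=shared, samples=100)
```

Under this layout the common case (chains equal to workers) is still easy to express, but the two numbers are no longer forced to travel together.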

@TimothyWillard TimothyWillard self-assigned this Nov 18, 2024
@jcblemai
Collaborator

jcblemai commented Dec 3, 2024

which are theoretically orthogonal (how much parallelization, how many independent inference processes) but are practically pretty much always the same

Just wanted to add to that: with some methods, such as emcee, there is often a need for more chains than parallel jobs, depending on the number of parameters and the size of the compute boxes.
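
To make the "more chains than parallel jobs" case concrete, here is a minimal sketch of spreading chains over a smaller number of jobs. The helper `assign_chains` is hypothetical, and it treats chains as independent units of work, which is a simplification; how gempyor actually schedules emcee chains is not described in this thread.

```python
def assign_chains(chains: int, jobs: int) -> list[list[int]]:
    """Spread chain indices over jobs as evenly as possible (round-robin)."""
    if chains < 1 or jobs < 1:
        raise ValueError("Both 'chains' and 'jobs' must be at least 1.")
    # Never create more job slots than there are chains to fill them.
    assignment: list[list[int]] = [[] for _ in range(min(jobs, chains))]
    for chain in range(chains):
        assignment[chain % len(assignment)].append(chain)
    return assignment


plan = assign_chains(chains=10, jobs=4)
for job, chain_ids in enumerate(plan):
    print(f"job {job}: chains {chain_ids}")
# job 0: chains [0, 4, 8]
# job 1: chains [1, 5, 9]
# job 2: chains [2, 6]
# job 3: chains [3, 7]
```

With ten chains and four jobs, each job handles two or three chains, which shows why a single shared argument cannot describe both quantities.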
