Allow more variables to be replaced in the @mpi decorator #10

Open
kinow opened this issue Feb 22, 2024 · 1 comment

kinow commented Feb 22, 2024

Level

MINOR

Component

PYTHON BINDING

Environment

  • COMPSs version
  • Java / Python / C version
  • Operating System

Description

The @mpi decorator allows some parameters, such as working_dir, to be replaced at runtime:

@mpi(binary="srun", ...., working_dir="{{ working_dir_exe }}")
@task(
  working_dir_exe={Type: INOUT, Prefix: "#"},
  returns=int
)
def _esm_simulation(working_dir_exe):
  ...

But others (e.g. processes and processes_per_node) do not seem to work. As a result, values cannot easily be passed in as configuration and instead have to be exported as environment variables (I had some issues with that, though it may have been my environment or test setup) or hard-coded.

In the end, to avoid having to choose between these two options, I wrapped my PyCOMPSs task function in another function, so that the outer function acts as a builder/factory for the task and passes in the parameters (this way, the decorator only receives them once their values are ready). But that's not very user-friendly.

Thank you!

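from typing import Optional

from pycompss.api.mpi import mpi
from pycompss.api.on_failure import on_failure
from pycompss.api.parameter import (
    FILE_OUT, INOUT, STDOUT, Prefix, StdIOStream, Type
)
from pycompss.api.task import task


def esm_simulation(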
        log_file: str,
        working_dir: str,
        binary: str,
        runner: str,
        processes: str,
        processes_per_node: str
) -> int:
    # N.B.: We define and call the task inside this function, similar to
    #       a decorator factory, so that all the variable values are bound
    #       (doing a poor-man's variable hoisting?) and PyCOMPSs and srun
    #       are able to access them at runtime. Using export VAR=VALUE
    #       in a subprocess call did not work, and neither did setting
    #       os.environ[VAR] = VALUE.
    @on_failure(management='IGNORE')
    @mpi(binary=binary,
         runner=runner,
         processes=processes,
         processes_per_node=processes_per_node,
         working_dir="{{working_dir_exe}}",
         fail_by_exit_value=True,
         )
    @task(
        log_file={
            Type: FILE_OUT,
            StdIOStream: STDOUT
        },
        working_dir_exe={
            Type: INOUT,
            Prefix: "#"
        },
        returns=int)
    def _esm_simulation(
            log_file: str,
            working_dir_exe: str
    ) -> Optional[int]:  # type: ignore
        """PyCOMPSs task that executes the ``FESOM_EXE`` binary."""
        pass

    return _esm_simulation(log_file, working_dir)
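
The builder approach above can be illustrated without PyCOMPSs. The following is a minimal sketch in plain Python; `launcher` and `build_and_run` are hypothetical names, and `launcher` is only a stand-in for a decorator that needs concrete values when it is applied (it is not the real @mpi). Defining the decorated function inside an outer function delays decoration until the values are available.

```python
from functools import wraps


def launcher(processes: int, processes_per_node: int):
    """Hypothetical stand-in for a decorator that needs concrete values up front."""
    if not isinstance(processes, int) or not isinstance(processes_per_node, int):
        raise TypeError("launcher needs concrete integers at decoration time")

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Return the bound configuration together with the function result.
            return (processes, processes_per_node, func(*args, **kwargs))
        return wrapper
    return decorator


def build_and_run(working_dir: str, processes: int, processes_per_node: int):
    # The decorator is applied here, at call time, so it receives real values
    # instead of placeholders that would have to be resolved later.
    @launcher(processes=processes, processes_per_node=processes_per_node)
    def _run(working_dir_exe: str) -> str:
        return working_dir_exe

    return _run(working_dir)


result = build_and_run("/tmp/run", processes=4, processes_per_node=2)
```

The cost of this pattern is that the inner function is redefined and redecorated on every call, which is part of why it is not very user-friendly.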

Minimal example to reproduce

https://github.com/eflows4hpc/workflow-registry/blob/9b232cf8aa9c6d285a662168eea4ae9fef73a74b/Pillar_II/esm/src/fesom2/__init__.py

Exception

NA

Expected behaviour

Users can parametrize every parameter of @mpi, replacing it with a parameter value from the @task function.

Member

jorgee commented Feb 22, 2024

The problem with supporting task parameters in the proposed fields is that parameter values can depend on other tasks, so they are either not known at that point or would require serialization. The fields that are supported work because the variable is evaluated in the worker just before execution and is not used at the master. However, processes and processes_per_node are required at scheduling time, so their values need to be known at the master.

We will check whether we can support this just for parameters that are IN and used for the first time (no dependency on another task), and we will also clarify this in the documentation.
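
The distinction between worker-side and master-side evaluation can be sketched with a toy model. This is not the COMPSs runtime: `resolve`, `run_on_worker` and `schedule_on_master` are hypothetical functions, and the `{{ name }}` substitution is a simplified assumption about how placeholders are filled in.

```python
import re

# Matches placeholders of the form {{ name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(template: str, values: dict) -> str:
    """Replace each {{ name }} with values[name]; raise KeyError if it is unknown."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


def run_on_worker(working_dir: str, task_args: dict) -> str:
    """Toy worker step: all task arguments exist by now, so substitution works."""
    return resolve(working_dir, task_args)


def schedule_on_master(processes, known_at_master: dict) -> int:
    """Toy scheduling step: a concrete process count is needed to reserve resources."""
    if isinstance(processes, str) and PLACEHOLDER.search(processes):
        # Fails with KeyError if the value is still being produced by another task.
        processes = resolve(processes, known_at_master)
    return int(processes)


worker_dir = run_on_worker("{{ working_dir_exe }}", {"working_dir_exe": "/scratch/run1"})
```

In this model, `schedule_on_master("{{ n }}", {})` raises `KeyError`, because the master has no value for `n` yet, while the same placeholder would resolve without trouble on the worker.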
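
As a rough sketch of the condition described above, under an assumed data model: `Param`, its `producer_task` field and `resolvable_at_master` are hypothetical names for illustration and do not correspond to COMPSs internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Param:
    name: str
    direction: str                       # e.g. "IN", "INOUT", "OUT"
    producer_task: Optional[str] = None  # hypothetical: id of the task that writes this value


def resolvable_at_master(param: Param) -> bool:
    """True when the value is IN and does not depend on another task's output."""
    return param.direction == "IN" and param.producer_task is None


first_use = Param("processes", "IN")
from_other_task = Param("processes", "IN", producer_task="task-7")
in_out = Param("working_dir_exe", "INOUT")

checks = [resolvable_at_master(p) for p in (first_use, from_other_task, in_out)]
```

Only the first parameter qualifies here: the second depends on another task, and the third is not IN.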
