Add provenance information #11

Open
kreczko opened this issue Feb 21, 2024 · 1 comment

Comments


kreczko commented Feb 21, 2024

As #10 shows, we are starting to add information on top of the provided YAML file. These additions need to be documented and stored alongside the output before we move to a distributed system.

The provenance information should contain:

  • the YAML file used for this run
  • extra information added by fasthep-flow
  • a snapshot of software versions (e.g. use fasthep cli for this)
  • a hash for each workflow stage based on its values
  • ability to gather provenance info (execution node, software versions, date, etc.) PER TASK → this will require a mechanism to inject work into a task (something like a pre/post instruction)
  • ability to store all of the above in any format (implement HDF5 to start with)
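
Two of the items above (a hash per workflow stage, and per-task provenance via a pre/post step) could be sketched roughly as follows. This is only an illustration using the Python standard library: `stage_hash` and `with_provenance` are hypothetical names, not part of fasthep-flow, and the choice of SHA-256 over a canonical JSON dump is an assumption.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from functools import wraps


def stage_hash(stage_config: dict) -> str:
    """Hash a stage from its values (hypothetical helper).

    Keys are sorted so that the same values always give the same hash,
    regardless of the order they appear in the YAML file.
    """
    canonical = json.dumps(
        stage_config, sort_keys=True, separators=(",", ":"), default=str
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_provenance(records: list):
    """Hypothetical decorator that injects pre/post steps around a task.

    The pre step notes the node, Python version and start time; the post
    step adds the finish time and appends the record to `records`.
    """

    def decorator(task):
        @wraps(task)
        def wrapper(*args, **kwargs):
            # Pre instruction: capture where and when the task starts.
            record = {
                "task": task.__name__,
                "node": platform.node(),
                "python": sys.version.split()[0],
                "started": datetime.now(timezone.utc).isoformat(),
            }
            try:
                return task(*args, **kwargs)
            finally:
                # Post instruction: runs even if the task raises.
                record["finished"] = datetime.now(timezone.utc).isoformat()
                records.append(record)

        return wrapper

    return decorator
```

The records are plain dictionaries, so they could later be written to HDF5 or any other format without changing how they are collected.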

kreczko commented Oct 15, 2024

#34 has introduced

  • storing YAML config, workflow pkl, source snapshot of tasks in ~/.fasthep/flow/{workflow_name}/{date}/{config_hash}/
  • some metadata
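
For illustration, a directory of that shape could be built along these lines. This is a sketch, not the code from #34: the function names, the ISO date format, and the truncated SHA-256 of the YAML text used as `config_hash` are all assumptions.

```python
import hashlib
from datetime import date
from pathlib import Path


def config_hash(yaml_text: str, length: int = 12) -> str:
    """Short hash of the YAML config text (assumed scheme, hypothetical helper)."""
    return hashlib.sha256(yaml_text.encode("utf-8")).hexdigest()[:length]


def provenance_dir(workflow_name, yaml_text, run_date=None, base=None) -> Path:
    """Build {base}/{workflow_name}/{date}/{config_hash} without creating it.

    `base` defaults to ~/.fasthep/flow; the date is written as YYYY-MM-DD,
    which is an assumption about the layout.
    """
    base = Path.home() / ".fasthep" / "flow" if base is None else Path(base)
    run_date = run_date or date.today()
    return base / workflow_name / run_date.isoformat() / config_hash(yaml_text)
```

Calling `provenance_dir("my_workflow", yaml_text).mkdir(parents=True, exist_ok=True)` would then create the directory for a run.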
