WIP: Profile data mirroring #6723 (Draft)
GeigerJ2 wants to merge 60 commits into aiidateam:main from GeigerJ2:feature/verdi-profile-mirror
Conversation
Take it from here
Either in groups, or not associated with any group. Either sorted by groups, or in a top-level flat hierarchy. "De-duplication" works by symlinking calculations if they are part of a workflow. Next, check what happens if a workflow is part of two groups -> here, de-duplication should actually make more sense.
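The symlink-based de-duplication mentioned in this commit message could look roughly like the following sketch. This is not the PR's actual implementation; the helper name `place_calculation` and the flat/grouped directory layout are assumptions made for illustration. The idea: a calculation directory is dumped once under a canonical location, and any further occurrence (e.g., inside a workflow's directory) becomes a symlink to it instead of a duplicate copy.

```python
import os
from pathlib import Path


def place_calculation(calc_dir_name: str, workflow_dir: Path, flat_dir: Path) -> Path:
    """Dump a calculation once, then symlink further occurrences.

    Hypothetical helper: ``flat_dir`` holds the canonical, de-duplicated
    calculation directories; ``workflow_dir`` receives a symlink so the
    grouped hierarchy still shows the calculation in place.
    """
    canonical = flat_dir / calc_dir_name
    target = workflow_dir / calc_dir_name
    if not canonical.exists():
        # Stand-in for the actual dumping of the calculation's files.
        canonical.mkdir(parents=True)
    if not target.exists():
        os.symlink(canonical, target, target_is_directory=True)
    return target
```

A workflow appearing in two groups would then produce two symlinks pointing at the same canonical directory, which is where de-duplication pays off most.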
Add `BaseDumper`, `ProfileDumper` and `CollectionDumper` -> `GroupDumper`. Remove code related to data and rich dumping.
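The class names in this commit message suggest a small hierarchy in which shared dump configuration lives in a base class. A minimal sketch of what that factoring could look like follows; the fields and their defaults are assumptions for illustration, not the PR's actual attributes.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class BaseDumper:
    """Shared dump configuration (fields here are illustrative assumptions)."""
    dump_parent_path: Path = field(default_factory=Path.cwd)
    incremental: bool = True
    overwrite: bool = False


@dataclass
class ProfileDumper(BaseDumper):
    """Dumps a whole profile; could delegate each group to a GroupDumper."""
    organize_by_groups: bool = True


@dataclass
class GroupDumper(BaseDumper):
    """Dumps the processes of a single group."""
    group_label: str = ""
```

Passing one `BaseDumper`-style object around (rather than individual arguments to each `ProcessDumper`) matches the direction described in the commits below.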
- Use the `BaseDumper` instead of passing arguments to the `ProcessDumper`
- Append PKs to the test output paths and use `aiida_profile_clean` fixture for reproducible results
And back to `CollectionDumper`
Design questions:

- `mtime`? Possibly not. Other way to get this change, e.g., check collections since the last dump?
- `incremental=False, overwrite=False`: Will error out with `FileExistsError` if the directory exists; goes through if it doesn't exist or is empty.
- `incremental=True, overwrite=False`: Will keep the original main directory, but update subdirectories with new data.
- `incremental=False, overwrite=True`: Will clean the main directory and perform the dumping again from scratch.
- If both are `True`, `--overwrite` will take precedence, and a report message will be issued to the user. This is because `--incremental` is `True` by default, as it is the most sensible option and should not be required to always be specified. However, if `--overwrite` is also set, we don't raise an exception (as I had it initially implemented), as that would require the user to always pass `--overwrite --no-incremental`, which is annoying. Automatically setting `--incremental` to `False` if `--overwrite` is specified could be handled by a `click` callback, but for now I just change the options on the fly at a later stage in the code.
- `dump` method of each Dumper class, so that the path can be set accordingly via the Python API.
- `dump_parent_path` (absolute, defaults to CWD) and an `output_path` part (relative, either provided by the user or automatically generated), which, combined, yield the full top-level path where the files are dumped.

General notes:

- `--delete-missing` option? -> Possibly use `graph_traversal_rules` like for `verdi node delete` when updating directories after a node was deleted.
- `verdi profile mirror --delete-missing` is executed? Should also keep track of the groups in the `DumpLogger`, and delete the directory in that case.
- `dump_parent_path` is the CWD from which the dumping/mirroring command is called, while `dump` still provides an `output_path` parameter to denote the directory name of the profile, group, or process that will be dumped. This is optional and, if not provided by the user, will be auto-generated.
- Use `graph_traversal_rules` and add `get_nodes_dump` to `src/aiida/tools/graph/graph_traversers.py`, as well as `AiidaEntitySet` from `src/aiida/tools/graph/age_entities.py`, etc., to first obtain the nodes, and then run the dumping.

(Possible) future TODOs:

- `DumpLogger`?
- `CollectionDumper` and allow for mixed node types

Bugs:

- `README` for dumped processes in wrong (too high) directory
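The `incremental`/`overwrite` semantics and the `dump_parent_path` + `output_path` composition described above can be condensed into a single path-preparation helper. The following is a sketch of the described behavior under stated assumptions, not the PR's actual code: `prepare_dump_path` is an assumed name, and only the three flag combinations listed above are implemented.

```python
import shutil
from pathlib import Path


def prepare_dump_path(
    dump_parent_path: Path,
    output_path: Path,
    incremental: bool = True,
    overwrite: bool = False,
) -> Path:
    """Combine parent (absolute) and output (relative) paths, then apply:

    - overwrite=True wins over incremental: the directory is wiped and recreated
    - incremental=True keeps an existing directory for in-place updates
    - both False: an existing, non-empty directory raises FileExistsError
    """
    top_level = dump_parent_path / output_path
    if top_level.is_dir():
        if overwrite:
            # Perform the dumping again from scratch.
            shutil.rmtree(top_level)
        elif not incremental and any(top_level.iterdir()):
            raise FileExistsError(f'{top_level} exists and is not empty.')
    top_level.mkdir(parents=True, exist_ok=True)
    return top_level
```

Resolving the precedence inside one helper (rather than raising when both flags are set) matches the design choice above: `--incremental` stays `True` by default, and `--overwrite` simply wins when given.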