This is "inspired" by the problem originally reported in #438 (comment) with a proposed fix (closed without merge) in #451 to just let tar ignore disappearing files.
Although we still do not know the underlying trigger (a lingering cleanup process or the like), this specific behavior reminded me that in many cases we would like to provide a path to some location (on the remote resource) which pipelines could use as scratch space.
In https://github.com/ReproNim/reproman/pull/438/files#diff-5b4aa18b79cf44a38ba925fff658fd8cR129 I just added that work/ directory to .gitignore. That should in theory be sufficient (I will try next) if I use the datalad-pair orchestrator, which should datalad save remotely and use datalad update to fetch results.
In the case of datalad-pair-run, the content is first tar'ed on the remote side (hence the original "inspirational" issue of files disappearing in a work/ directory), including the not-so-needed work dir, which might be huge, so we should allow that to be avoided.
The easiest way is to specify some work directory outside of the dataset which gets datalad saved/transferred. But ideally:
- it cannot be a fixed name,
- it should be allowed to be not job specific (e.g. so it could be reused across jobs if I am to rerun some failed computation),
- it should be allowed to be job specific (to avoid any side effects).
So I guess we should:
- allow defining variables per resource (e.g. for smaug I could assign scratchdir = /mnt/btrfs/scrap/tmp),
- expose those, along with jobid and datalad_dataset_id (if datalad-pair*), as variables that can be used to format the command to be executed. So I would specify -w {scratchdir}/{jobid} for the case avoiding side effects, and something like -w {scratchdir}/myanal if I want it to be shared.
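To make the proposal concrete, here is a minimal sketch of how such command formatting could work, using plain Python `str.format`. Nothing here is existing reproman API: `render_command`, the `mypipeline` command, and the job id value are all hypothetical names chosen for illustration, and letting job-level values override resource-level ones is just one possible choice.

```python
def render_command(template, resource_vars, job_vars):
    """Fill a command template from resource- and job-level variables.

    Hypothetical helper, not part of reproman.
    """
    # Assumption: job-level values win over resource-level ones on a name clash.
    context = {**resource_vars, **job_vars}
    try:
        return template.format(**context)
    except KeyError as exc:
        # Fail early on a placeholder that no resource or job defines.
        raise ValueError(f"unknown variable in command template: {exc}") from None


# Variables that would be defined once for the resource (here: smaug).
smaug_vars = {"scratchdir": "/mnt/btrfs/scrap/tmp"}
# Variables that would be provided for each job (made-up job id).
job_vars = {"jobid": "20190912-123456-abcd"}

# Job-specific scratch space, avoiding side effects between jobs.
isolated = render_command("mypipeline -w {scratchdir}/{jobid}", smaug_vars, job_vars)
# Shared scratch space, reusable when rerunning a failed computation.
shared = render_command("mypipeline -w {scratchdir}/myanal", smaug_vars, job_vars)

print(isolated)
print(shared)
```

Raising on an unknown placeholder, rather than leaving it in the command, would surface a missing per-resource variable before anything is submitted to the remote side.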
Also relates to #467 ("cleanup") on what to do with such directories upon success/failure.