deploy: 8ab93c3
zingale committed Jan 30, 2024
1 parent d5e335e commit f3b32a7
Showing 4 changed files with 44 additions and 30 deletions.
8 changes: 4 additions & 4 deletions _downloads/f1505febfbe5937d242281ec790ebc2d/process.xrb
@@ -17,7 +17,7 @@ work_dir=`pwd`
HPSS_DIR=`basename $work_dir`

# set HTAR command
-HTAR=/usr/bin/htar
+HTAR=htar

# path to the ftime executable -- used for making a simple ftime.out file
# listing the name of the plotfile and its simulation time
@@ -229,12 +229,12 @@ function process_files
datestr=$(date +"%Y%m%d_%H%M_%S")
ftime_files=$(find . -maxdepth 1 -name "ftime.out" -print)
inputs_files=$(find . -maxdepth 1 -name "inputs*" -print)
probin_files=$(find . -maxdepth 1 -name "probin*" -print)
diag_files=$(find . -maxdepth 1 -name "*diag.out" -print)
model_files=$(find . -maxdepth 1 -name "*.hse.*" -print)
-slurm_files=$(find . -maxdepth 1 -name "*.slurm" -print)
+job_files=$(find . -maxdepth 1 -name "*.slurm" -print) $(find . -maxdepth 1 -name "*.submit" -print)
process_files=$(find . -maxdepth 1 -name "process*" -print)

-${HTAR} -cvf ${HPSS_DIR}/diag_files_${datestr}.tar ${model_files} ${ftime_files} ${inputs_files} ${probin_files} ${slurm_files} ${process_files} >> /dev/null
+${HTAR} -cvf ${HPSS_DIR}/diag_files_${datestr}.tar ${model_files} ${ftime_files} ${inputs_files} ${probin_files} ${job_files} ${process_files} >> /dev/null


# Loop, waiting for plt and chk directories to appear.
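
Review note on the `job_files=` line: in bash, a second `$(...)` written after a space is not appended to the variable. It is parsed as a command word, so if any `*.submit` file exists the shell would try to execute it rather than archive it. A minimal sketch of one way to gather both patterns in a single `find` call follows; the helper name `list_job_files` is hypothetical and not part of `process.xrb`.

```shell
# Hypothetical helper (not in process.xrb): print every *.slurm or *.submit
# file found directly inside the given directory (default: current directory).
list_job_files() {
    find "${1:-.}" -maxdepth 1 \( -name "*.slurm" -o -name "*.submit" \) -print
}

# Both patterns end up in one variable, ready to pass to the htar command.
job_files=$(list_job_files .)
```

Quoting the two substitutions together, as in `job_files="$(find ...) $(find ...)"`, would be an equally small change.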
34 changes: 20 additions & 14 deletions _sources/nersc-hpss.rst.txt
@@ -67,21 +67,27 @@ The following describes how to use the scripts:
overwriting the stored copy, especially if a purge took place. The
same is done with checkpoint files.

+Some additional notes:

-Additionally, if the ``ftime`` executable is in your path
-(``ftime.cpp`` lives in ``amrex/Tools/Plotfile/``), then
-the script will create a file called ``ftime.out`` that lists the name
-of the plotfile and the corresponding simulation time.
-
-Finally, right when the job is submitted, the script will tar up all
-of the diagnostic files, ``ftime.out``, submission script, inputs and
-probin, and archive them on HPSS. The .tar file is given a name that
-contains the date-string to allow multiple archives to co-exist. When
-``process.xrb`` is running, it creates a lockfile (called
-``process.pid``) that ensures that only one instance of the script is
-running at any one time. Sometimes if the machine crashes, the
-``process.pid`` file will be left behind, in which case, the script
-aborts. Just delete that if you know the script is not running.
+* If the ``ftime`` executable is in your path (``ftime.cpp`` lives in
+``amrex/Tools/Plotfile/``), then the script will create a file
+called ``ftime.out`` that lists the name of the plotfile and the
+corresponding simulation time.
+
+* Right when the job is run, the script will tar up all of the
+diagnostic files, ``ftime.out``, submission script, and inputs and
+archive them on HPSS. The ``.tar`` file is given a name that contains
+the date-string to allow multiple archives to co-exist.
+
+* When ``process.xrb`` is running, it creates a lockfile (called
+``process.pid``) that ensures that only one instance of the script
+is running at any one time.
+
+.. warning::
+
+Sometimes if the job is not terminated normally, the
+``process.pid`` file will be left behind, in which case, the script
+aborts. Just delete that if you know the script is not running.
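
Review note on the stale-lockfile warning: deciding whether `process.pid` is safe to delete can be partly automated. The sketch below assumes the lockfile holds the PID of the running script, which the file name suggests but this change does not confirm, and that the check runs on the same node as the script. The helper name `lock_is_stale` is hypothetical.

```shell
# Hypothetical helper (not in process.xrb): succeed (return 0) only when the
# lockfile exists, contains a numeric PID, and no such process can be signalled.
# Assumption: the lockfile stores the PID of the script that created it.
lock_is_stale() {
    lock_path=${1:-process.pid}
    [ -f "$lock_path" ] || return 1            # no lockfile, so nothing is stale
    lock_pid=$(cat "$lock_path")
    case "$lock_pid" in
        ''|*[!0-9]*) return 1 ;;               # not a plain PID: leave it for a human
    esac
    # kill -0 sends no signal; it only tests whether we could signal the PID.
    # It also fails for a live process owned by another user, so this check is
    # only meaningful for lockfiles written by your own jobs.
    if kill -0 "$lock_pid" 2>/dev/null; then
        return 1                               # process is alive: lock is in use
    fi
    return 0                                   # no such process: lock is stale
}

# Remove the lockfile only when the recorded process is gone.
if lock_is_stale process.pid; then
    rm -f process.pid
fi
```

A PID is only valid on the machine that issued it, so on a cluster this check tells you nothing if the script ran on a different node.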

Jobs in the xfer queue start up quickly. The best approach is to start
one as you start your main job (or make it dependent on the main
30 changes: 19 additions & 11 deletions nersc-hpss.html
@@ -144,19 +144,27 @@ <h1>Archiving Data to HPSS<a class="headerlink" href="#archiving-data-to-hpss" t
same is done with checkpoint files.</p>
</li>
</ol>
-<p>Additionally, if the <code class="docutils literal notranslate"><span class="pre">ftime</span></code> executable is in your path
-(<code class="docutils literal notranslate"><span class="pre">ftime.cpp</span></code> lives in <code class="docutils literal notranslate"><span class="pre">amrex/Tools/Plotfile/</span></code>), then
-the script will create a file called <code class="docutils literal notranslate"><span class="pre">ftime.out</span></code> that lists the name
-of the plotfile and the corresponding simulation time.</p>
-<p>Finally, right when the job is submitted, the script will tar up all
-of the diagnostic files, <code class="docutils literal notranslate"><span class="pre">ftime.out</span></code>, submission script, inputs and
-probin, and archive them on HPSS. The .tar file is given a name that
-contains the date-string to allow multiple archives to co-exist. When
-<code class="docutils literal notranslate"><span class="pre">process.xrb</span></code> is running, it creates a lockfile (called
-<code class="docutils literal notranslate"><span class="pre">process.pid</span></code>) that ensures that only one instance of the script is
-running at any one time. Sometimes if the machine crashes, the
+<p>Some additional notes:</p>
+<ul>
+<li><p>If the <code class="docutils literal notranslate"><span class="pre">ftime</span></code> executable is in your path (<code class="docutils literal notranslate"><span class="pre">ftime.cpp</span></code> lives in
+<code class="docutils literal notranslate"><span class="pre">amrex/Tools/Plotfile/</span></code>), then the script will create a file
+called <code class="docutils literal notranslate"><span class="pre">ftime.out</span></code> that lists the name of the plotfile and the
+corresponding simulation time.</p></li>
+<li><p>Right when the job is run, the script will tar up all of the
+diagnostic files, <code class="docutils literal notranslate"><span class="pre">ftime.out</span></code>, submission script, and inputs and
+archive them on HPSS. The <code class="docutils literal notranslate"><span class="pre">.tar</span></code> file is given a name that contains
+the date-string to allow multiple archives to co-exist.</p></li>
+<li><p>When <code class="docutils literal notranslate"><span class="pre">process.xrb</span></code> is running, it creates a lockfile (called
+<code class="docutils literal notranslate"><span class="pre">process.pid</span></code>) that ensures that only one instance of the script
+is running at any one time.</p>
+<div class="admonition warning">
+<p class="admonition-title">Warning</p>
+<p>Sometimes if the job is not terminated normally, the
<code class="docutils literal notranslate"><span class="pre">process.pid</span></code> file will be left behind, in which case, the script
aborts. Just delete that if you know the script is not running.</p>
</div>
</li>
</ul>
<p>Jobs in the xfer queue start up quickly. The best approach is to start
one as you start your main job (or make it dependent on the main
job). The sample <code class="docutils literal notranslate"><span class="pre">process.xrb</span></code> script will wait for output and then