deploy: 4976ae5

DmitryLitvintsev · May 16, 2024 · 8074871 · 8074871
commit 8074871

Showing 74 changed files with 7,669 additions and 0 deletions.
diff --git a/.buildinfo b/.buildinfo
@@ -0,0 +1,4 @@
+# Sphinx build info version 1
+# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: ac28587959edc37a54325ca3490f1441
+tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/.doctrees/cta_schema.doctree b/.doctrees/cta_schema.doctree
diff --git a/.doctrees/dcache_locations.doctree b/.doctrees/dcache_locations.doctree
diff --git a/.doctrees/dcache_setup.doctree b/.doctrees/dcache_setup.doctree
diff --git a/.doctrees/dcache_sfa.doctree b/.doctrees/dcache_sfa.doctree
diff --git a/.doctrees/enstore2cta_config.doctree b/.doctrees/enstore2cta_config.doctree
diff --git a/.doctrees/enstore2cta_mapping.doctree b/.doctrees/enstore2cta_mapping.doctree
diff --git a/.doctrees/enstore2cta_script.doctree b/.doctrees/enstore2cta_script.doctree
diff --git a/.doctrees/enstore_schema.doctree b/.doctrees/enstore_schema.doctree
diff --git a/.doctrees/environment.pickle b/.doctrees/environment.pickle
diff --git a/.doctrees/example_migration.doctree b/.doctrees/example_migration.doctree
diff --git a/.doctrees/index.doctree b/.doctrees/index.doctree
diff --git a/.nojekyll b/.nojekyll
diff --git a/_images/cta.relationships.real.compact.png b/_images/cta.relationships.real.compact.png
diff --git a/_images/enstoredb.relationships.real.compact.png b/_images/enstoredb.relationships.real.compact.png
diff --git a/_sources/cta_schema.rst.txt b/_sources/cta_schema.rst.txt
@@ -0,0 +1,27 @@
+CTA Schema
+==========
+
+Here is the CTA schema:
+
+.. image:: images/cta.relationships.real.compact.png
+
+Unlike the Enstore schema, the CTA DB schema is more normalized: it
+separates the concepts of logical library, virtual organization and storage class into their own tables. Therefore, the ``virtual_organization``, ``logical_library`` and ``storage_class`` names have to be defined by an admin before any file can be written.
+
+Additionally, CTA has a concept of ``tape_pool`` that represents a logical grouping
+of tapes. Each tape belongs to exactly one tape pool. Tape pools are used to keep data belonging to different VOs and storage classes separate (via ``archive_route``).
+
+The ``archive_route`` table connects ``storage_class`` to ``tape_pool`` and specifies how many copies a file must have.
+
+File table
+----------
+
+CTA separates the concept of `file` into an abstract ``archive_file`` that may
+have multiple corresponding ``tape_file`` entries. The ``archive_file`` table stores file size; adler32 checksum; disk instance name; ``disk_file_id`` - an inode number on storage front end; unique file id (``archive_file_id``); user UID/GID and a deleted flag.
+
+A ``tape_file`` references an ``archive_file`` and contains the information that ties it to a tape: the volume id (``vid``), the location on the tape and the copy number.
+
+Storage class table
+-------------------
+
+The storage class concept is somewhat similar to the file family concept of Enstore. Besides a unique name, ``storage_class_name``, it specifies how many copies a file must have, and it has a reference to ``virtual_organization``.
diff --git a/_sources/dcache_locations.rst.txt b/_sources/dcache_locations.rst.txt
@@ -0,0 +1,9 @@
+dCache file location
+====================
+
+dCache needs to know the location of files in CTA. The location is expressed as a URI of the form::
+
+ cta://cta/<pnfsid>?archive_id=<archive_file_id>
+
+During migration, a location of this form is back-filled into the existing
+Chimera database for each file in ``archive_file``.
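+
+A minimal sketch of how such a location string could be assembled. The
+helper name ``cta_location`` and the sample values are hypothetical; they
+only illustrate the URI layout shown above and are not taken from the
+migration script:
+
+.. code-block:: python
+
+    def cta_location(pnfsid, archive_file_id):
+        """Build a CTA location URI for one file (illustrative helper)."""
+        return "cta://cta/{}?archive_id={}".format(pnfsid, archive_file_id)
+
+    # made-up PNFSID and archive file id
+    print(cta_location("0000A1B2C3D4", 42))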
diff --git a/_sources/dcache_setup.rst.txt b/_sources/dcache_setup.rst.txt
@@ -0,0 +1,46 @@
+dCache setup with CTA
+=====================
+
+
+Pool
+----
+
+Deploy the dCache-CTA driver on the pool node::
+
+ wget https://download.dcache.org/nexus/repository/dcache-cta/dcache-cta-0.8.0-1.noarch.rpm
+ rpm -Uvh --force dcache-cta-0.8.0-1.noarch.rpm
+
+
+Define the HSM on the pool::
+
+ hsm create cta cta dcache-cta -cta-user=adm -cta-group=eosusers -cta-instance-name=eosdev -cta-frontend-addr=ctahost:17017 -io-port=1094
+
+Each pool on the pool node has to have its own dedicated port.
+
+Define the queue on the pool::
+
+ queue define class cta * -pending=100 -total=1 -expire=7200 -open=true
+
+CTA
+---
+
+On the CTA side, define the storage class and the archive route::
+
+ cta-admin sc add -n test.cta@cta -c 1 --vo vo -m dcachetest
+ cta-admin ar add -s test.cta@cta -c 1 -t ctasystest -m dcachetest
+
+PoolManager
+-----------
+
+In PoolManager, define a dedicated CTA pool group, for example::
+
+ psu create unit -store test.cta@cta
+ psu create ugroup CtaSelGrp
+ psu addto ugroup CtaSelGrp test.cta@cta
+
+ psu create pgroup CtaPoolGroup
+ psu addto pgroup CtaPoolGroup rw-stkendca28a-1
+
+ psu create link CtaLink CtaSelGrp any-protocol world-net
+ psu set link CtaLink -readpref=10 -writepref=10 -cachepref=10 -section=default
+ psu addto link CtaLink CtaPoolGroup
diff --git a/_sources/dcache_sfa.rst.txt b/_sources/dcache_sfa.rst.txt
@@ -0,0 +1,29 @@
+SFA files
+=========
+
+One identified issue is that CTA has no functionality corresponding
+to Enstore SFA (Small File Aggregation).
+In a nutshell, SFA is an extension of Enstore that manages
+intermediate disk storage on the side (between
+dCache and Enstore). Depending on policies based on ``file_family``,
+``storage_group``, ``library`` and file size, Enstore directs files
+to the intermediate storage for periodic packaging: the small files
+are tarred into large package files, which are then written to
+tape more efficiently.
+
+The child/parent relation is captured in the same ``file`` table by
+setting the child's ``file.package_id`` to the BFID of the package file.
+
+To read SFA files in a dCache/CTA setup, this relation has to be
+represented in Chimera.
+
+A solution exists: the one used by dCache's SAPPHIRE system, which is similar to SFA.
+We need to translate::
+
+ child_pnfsid, package_pnfsid ->
+    -> dcache://dcache/?store=vo&group=file_family&bfid=child_pnfsid:package_pnfsid
+
+That is, the child/package relation is stored as a location in the Chimera
+``t_locationinfo`` table. As long as these locations exist, dCache can read these files from CTA using an HSM script; the SAPPHIRE system itself is not needed for reading SFA files.
+
+These locations can be populated out of band.
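+
+The translation above could be sketched as follows. The function name
+``sfa_location`` and the sample arguments are hypothetical; only the
+URI layout is taken from the text:
+
+.. code-block:: python
+
+    def sfa_location(vo, file_family, child_pnfsid, package_pnfsid):
+        """Build the dCache location that ties an SFA child to its package."""
+        return "dcache://dcache/?store={}&group={}&bfid={}:{}".format(
+            vo, file_family, child_pnfsid, package_pnfsid
+        )
+
+    # made-up VO, file family and PNFSIDs
+    print(sfa_location("cms", "raw", "0000AAAA", "0000BBBB"))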
diff --git a/_sources/enstore2cta_config.rst.txt b/_sources/enstore2cta_config.rst.txt
@@ -0,0 +1,7 @@
+
+Configuration
+--------------
+
+The script expects a configuration file ``enstore2cta.yaml`` in the current directory, or at the path given by the environment variable ``MIGRATION_CONFIG``. The YAML file must have ``0600`` permission bits and must define the following parameters:
+
+.. literalinclude:: ../etc/enstore2cta.yaml
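+
+The lookup and permission rules described above could be checked along
+these lines. This is a sketch under stated assumptions, not the script's
+actual code: the function name ``find_config`` and the exception types
+are hypothetical.
+
+.. code-block:: python
+
+    import os
+    import stat
+
+    def find_config(default="enstore2cta.yaml"):
+        """Return the config path, enforcing the 0600 permission rule."""
+        # MIGRATION_CONFIG wins; otherwise look in the current directory
+        path = os.environ.get("MIGRATION_CONFIG", default)
+        if not os.path.isfile(path):
+            raise IOError("configuration file {} not found".format(path))
+        mode = stat.S_IMODE(os.stat(path).st_mode)
+        if mode != 0o600:
+            raise ValueError("{} must have 0600 permissions".format(path))
+        return path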
diff --git a/_sources/enstore2cta_mapping.rst.txt b/_sources/enstore2cta_mapping.rst.txt
@@ -0,0 +1,64 @@
+Enstore to CTA mapping
+======================
+
+.. list-table:: Enstore to CTA mapping
+   :header-rows: 1
+
+   * - Enstore
+     - CTA
+     - Comment
+   * - ``volume.label``
+     - ``vid = volume.label[:6]``
+     -
+   * - ``volume.storage_group``
+     - ``virtual_organization_name``
+     -
+   * - | ``volume.storage_group``
+       | ``volume.file_family``
+     - ``storage_class.storage_class_name=volume.storage_group+"."+volume.file_family+"@cta"``
+     - | This is needed so that dCache can
+       | communicate to CTA and still use ``storage_class``
+       | for data steering within dCache.
+   * - ``volume.library``
+     - ``logical_library_name``
+     -
+   * - ``file.bfid``
+     - ``archive_file.archive_file_id``
+     - Sequence in CTA
+   * - ``file.pnfs_id``
+     - ``archive_file.disk_file_id``
+     -
+   * - | if bfid has entry in
+       | ``file_copies_map``
+     - | ``storage_class.nb_copies``
+       | ``archive_route.copy_nb``
+       | And extra entries in ``tape_file``
+     - | The ``storage_class.nb_copies`` is set to 2
+       | if ``volume.file_family ~ '.*_copy_1'``
+       | and for each entry in ``file_copies_map``
+       | an extra entry is made in ``tape_file``
+       | corresponding to the file copy
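+
+The name-based rows of the table can be expressed as small helpers. This
+is an illustrative sketch: the function names and sample values are
+hypothetical, and returning 1 copy when the file family does not match is
+an assumption (the table only states when the value is 2).
+
+.. code-block:: python
+
+    import re
+
+    def cta_vid(label):
+        """CTA vid is the first six characters of the Enstore label."""
+        return label[:6]
+
+    def cta_storage_class(storage_group, file_family):
+        """Compose the CTA storage class name."""
+        return storage_group + "." + file_family + "@cta"
+
+    def nb_copies(file_family):
+        """2 if the file family matches '.*_copy_1', else 1 (assumed)."""
+        return 2 if re.match(r".*_copy_1", file_family) else 1
+
+    # made-up Enstore values
+    print(cta_vid("VR1234L8"), cta_storage_class("cms", "raw"), nb_copies("raw"))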
+
+The script ``enstore2cta.py``, when run with the ``--all`` option, performs the following steps:
+
+1. creates a ``disk_instance`` with the name given by the ``"disk_instance_name"`` key in the configuration
+   file ``enstore2cta.yaml``;
+2. selects distinct ``volume.storage_group`` names and creates the corresponding ``virtual_organization`` entries;
+3. selects distinct ``volume.library`` names and creates CTA ``logical_library`` entries
+   with the same names;
+4. selects distinct ``volume.storage_group||'.'||volume.file_family||'@cta'`` values and creates the corresponding
+   entries in the ``storage_class`` table. If ``volume.file_family ~ '.*_copy_1'``, ``nb_copies`` is set to 2;
+5. creates a ``tape_pool`` entry for each VO;
+6. creates an ``archive_route`` entry for each storage class and its corresponding ``tape_pool`` (matched by VO);
+7. selects all Enstore volumes that do not have the ``"_copy_1"`` suffix and puts them on the Queue;
+8. spawns a number of processes (by default, the number of cores); each process takes volumes to process from the Queue;
+9. each process:
+
+   1. inserts the volume into the ``tape`` table;
+   2. selects all active direct files, together with all their copies (if any),
+      from the join of ``file``, ``volume`` and ``file_copies_map``
+      and loops over them, inserting entries into ``archive_file`` and ``tape_file``. For each
+      copy, it also inserts the copy volume into ``tape`` (only once per
+      new copy volume) and the file copy into ``tape_file``;
+   3. calculates the CTA file location and inserts it into the Chimera ``t_locationinfo`` table;
+
+10. when the Queue is empty, the processes shut down and a single bootstrap query is run to
+    update the copy counts on all entries in the ``tape`` table.
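+
+The queue-and-worker pattern of the last steps could be sketched as
+follows. This is not the script's code: ``migrate_volume`` is a
+placeholder for the per-volume database work, and the ``None`` sentinel
+used to stop the workers is an assumption made for this example.
+
+.. code-block:: python
+
+    import multiprocessing
+
+    def migrate_volume(label):
+        """Placeholder for the per-volume inserts and location updates."""
+        return "migrated " + label
+
+    def worker(queue, results):
+        """Take labels from the queue until a None sentinel arrives."""
+        while True:
+            label = queue.get()
+            if label is None:
+                break
+            results.put(migrate_volume(label))
+
+    def run(labels, cpu_count=None):
+        """Process all labels with cpu_count worker processes."""
+        labels = list(labels)
+        cpu_count = cpu_count or multiprocessing.cpu_count()
+        queue = multiprocessing.Queue()
+        results = multiprocessing.Queue()
+        for label in labels:
+            queue.put(label)
+        for _ in range(cpu_count):
+            queue.put(None)  # one sentinel per worker
+        procs = [
+            multiprocessing.Process(target=worker, args=(queue, results))
+            for _ in range(cpu_count)
+        ]
+        for proc in procs:
+            proc.start()
+        # drain results before joining so workers never block on a full pipe
+        done = [results.get() for _ in labels]
+        for proc in procs:
+            proc.join()
+        return sorted(done)
+
+On platforms that start processes with ``spawn``, ``run`` should be called
+from under an ``if __name__ == "__main__":`` guard.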
diff --git a/_sources/enstore2cta_script.rst.txt b/_sources/enstore2cta_script.rst.txt
@@ -0,0 +1,62 @@
+enstore2cta - Enstore to CTA migration script
+=============================================
+
+The script ``enstore2cta.py``, located in ``enstore2cta/scripts``, implements
+database migration from Enstore DB to CTA DB. Both databases must be
+`PostgreSQL` databases. The script has various steering options (see below).
+It spawns multiple processes, each process processing a unique Enstore volume.
+
+
+Requirements
+------------
+
+
+The script works with both Python 2 and Python 3 and requires the ``psycopg2`` module to be installed (using ``pip`` or ``yum install python-psycopg2``).
+
+
+Invocation
+----------
+To run the script, a config file ``enstore2cta.yaml`` *must* exist in
+the current directory or be pointed to by the ``MIGRATION_CONFIG`` environment variable.
+An example can be found in ``enstore2cta/etc``. The file must have ``0600`` permissions (to protect database passwords, if any).
+
+::
+
+ $ python enstore2cta.py
+ usage: enstore2cta.py [-h] [--label LABEL] [--all] [--skip_locations] [--add]
+                       [--storage_class STORAGE_CLASS] [--vo VO]
+                       [--cpu_count CPU_COUNT]
+
+ This script converts Enstore metadata to CTA metadata. It looks for YAML
+ configuration file pointed to by MIGRATION_CONFIG environment variable or, if
+ it is not defined, it looks for file enstore2cta.yaml in current directory.
+ Script will quit if configuration YAML is not found.
+
+ optional arguments:
+   -h, --help            show this help message and exit
+   --label LABEL         comma separated list of labels (default: None)
+   --all                 do all labels (default: False)
+   --skip_locations      skip filling chimera locations (good for testing)
+                         (default: False)
+   --add                 add volume(s) to existing system, do not create vos,
+                         pools, archive_routes etc. These need to pre-exist in
+                         CTA db (default: False)
+   --storage_class STORAGE_CLASS
+                         Add storage class corresponding to volume. Needed when
+                         adding single volume to existing system using --add
+                         option (default: None)
+   --vo VO               vo corresponding to storage_class. Needed when adding
+                         single volume to existing system using --add option
+                         (default: None)
+   --cpu_count CPU_COUNT
+                         override cpu count - number of simultaneously processed
+                         labels (default: 8)
+
+
+(The default ``cpu_count`` is equal to ``multiprocessing.cpu_count()``.)
+
+The script can work on individual labels, passed as comma-separated values to the ``--label`` option, or it can be invoked with the ``--all`` switch to migrate all labels. Migration is done label by label.
+
+Additionally, on an existing CTA system, the
+``--add`` option adds a volume; its ``--storage_class`` (e.g. "cms.foo") and ``--vo`` (e.g. "cms") must also be specified.
diff --git a/_sources/enstore_schema.rst.txt b/_sources/enstore_schema.rst.txt
@@ -0,0 +1,48 @@
+Enstore Schema
+==============
+
+Here is the Enstore DB schema, omitting "unattached"
+tables such as ``media_capacity``:
+
+.. image:: images/enstoredb.relationships.real.compact.png
+
+There are two main tables, ``file`` and ``volume``, and a number of ancillary
+tables, of which only ``file_copies_map`` matters for the Enstore -> CTA transition. This table maps the `primary` file copy to its secondary copies. In practice
+Enstore uses at most one extra copy of a file, and not all files have extra copies.
+
+File table
+----------
+
+Each file copy in Enstore is uniquely identified by a BFID (bit file id),
+which is a string built from a three letter `brand` (the same for all files in a given Enstore instance), the Unix epoch time multiplied by 100000, and a counter reserved for resolving collisions. The BFID is generated in the code base and inserted into the ``file`` table, where it has a unique constraint. If the insert fails, the counter is incremented and the insert is retried until it succeeds.
+
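+.. code-block:: python
+
+    import time
+
+    # brand followed by the Unix epoch scaled by 100000, truncated to an integer
+    bfid = "CDMS" + str(int(time.time() * 100000))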
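+
+The insert-and-retry behaviour can be illustrated with a self-contained
+sketch. Several things here are assumptions made for the example:
+``sqlite3`` stands in for the PostgreSQL ``file`` table, the helper name
+``insert_with_new_bfid`` is hypothetical, and the counter is simply added
+to the scaled timestamp.
+
+.. code-block:: python
+
+    import sqlite3
+    import time
+
+    def insert_with_new_bfid(conn, brand="CDMS", now=None):
+        """Insert a row, bumping the counter until the BFID is unique."""
+        base = int((time.time() if now is None else now) * 100000)
+        counter = 0
+        while True:
+            bfid = brand + str(base + counter)
+            try:
+                with conn:  # commits on success, rolls back on error
+                    conn.execute("INSERT INTO file (bfid) VALUES (?)", (bfid,))
+                return bfid
+            except sqlite3.IntegrityError:
+                counter += 1  # collision on the unique constraint, retry
+
+    conn = sqlite3.connect(":memory:")
+    conn.execute("CREATE TABLE file (bfid TEXT UNIQUE)")
+    first = insert_with_new_bfid(conn, now=1700000000.0)
+    second = insert_with_new_bfid(conn, now=1700000000.0)  # same timestamp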
+
+Each file record contains the PNFSID (dCache inode identifier) that ties it back to
+the front end storage system; an adler32 checksum; for small files in SFA (Small File Aggregation), a reference to the file package, equal to the BFID of the package, or ``null`` for `direct` files; the file size; the original file name; the UID/GID of the user who created the file; the tape location; and a ``deleted`` flag that indicates whether or not the file
+has been removed from the namespace.
+
+Volume table
+------------
+
+Every tape in Enstore has a record in the ``volume`` table.
+The many-to-one relation from ``file`` to ``volume`` uses the integer primary key ``volume.id``, referenced by the ``file.volume`` foreign key.
+
+Each volume record tracks how many active/deleted/total files and bytes exist
+on the volume (via a DB trigger on insert/update/delete). It has a volume label; total/remaining bytes; number of mounts; number of read and write accesses; and several status fields that allow tapes to be classified (e.g. ``full``, ``NOACCESS``, ``NOTALLOWED``, ``migrated``, ``migrating``). The values of the status fields are arbitrary strings.
+
+The Enstore system has a concept of a virtual library, the so-called library manager (LM). The LM
+manages a set of movers that have SCSI tape drives attached. LMs (and movers) are Enstore servers; they are defined in the Enstore instance configuration and are not captured in the database schema. Each LM has a unique name and draws on the tapes allocated to it. This relation is captured in the ``volume.library`` field.
+
+Since Enstore LMs map to actual physical tape libraries, volumes have to be pre-allocated to specific LMs.
+
+Accounting and data steering in Enstore use three fields: ``volume.storage_group`` (usually corresponding to a VO name); ``volume.file_family``, a string that tells Enstore to write data having this attribute to the same set of tapes; and ``volume.file_family_width``, an integer that specifies how many tape drives can be used simultaneously to write data with a given ``file_family``.
+
+Enstore does not have pre-defined ``library``, ``storage_group`` and ``file_family`` concepts. When a file is written, Enstore receives the values to use (``library``, ``file_family``, ``file_family_width``) from the Enstore command line client ``encp``. When invoked, ``encp`` takes these parameters from the directory tags of the destination directory, or they can be passed as options to ``encp``. The file family value is arbitrary and user defined. Specifying a ``library`` string that does not match a running LM results in a failed write.
+
+File copies
+-----------
+
+If ``encp`` is passed a comma-separated list of libraries, via directory tag or command line option, Enstore makes one copy of the file per library, on volumes belonging to those libraries. In practice the Public Enstore system uses at most 2 file copies, and only for a subset of data. The relation between the `primary` and `secondary` copies is captured in the ``file_copies_map`` table, which uses the ``bfid`` and ``alt_bfid`` columns to express it.