Update docs for more info about relative path
stuartmcalpine committed Oct 22, 2024
1 parent 8736c5f commit 43ac2fd
Showing 1 changed file with 22 additions and 3 deletions.
25 changes: 22 additions & 3 deletions docs/source/tutorial_notebooks/datasets_deeper_look.ipynb
@@ -180,7 +180,9 @@
"\n",
"The files and directories of registered datasets are stored under a path relative to the root directory (`root_dir`), which, by default, is a shared space at NERSC.\n",
"\n",
"By default, the `relative_path` is constructed from the `name` and `version`, in the format `relative_path=<name>/<version>. However, one can also manually select the relative_path during registration, for example"
"By default, when not manually specified, the `relative_path` is constructed from the `name` and `version`, in the format `relative_path=.gen_paths/<name>_<version>/`. \n",
"\n",
"One can manually select the `relative_path` during registration if they explicitly care about where the data is located relative to the `root_dir`, for example"
]
},
{
@@ -208,9 +210,26 @@
"source": [
"will register a dataset under the `relative_path` of `nersc_tutorial/my_desc_dataset`.\n",
"\n",
"For those interested, the eventual full path for the dataset will be `<root_dir>/<schema>/<owner_type>/<owner>/<relative_path>`. Naturally, the `relative_path` you select cannot already be taken by another dataset (an error will be raised in this case).\n",
"If the registered dataset was a single file, the `relative_path` will be the explicit (relative) pathname to that file, e.g., `.gen_paths/mydataset_1.0.0/myfile.txt` or `my/manual/path/myfile.txt`. If the registered dataset was a directory, the `relative_path` is the pathname to the directory containing the dataset contents.\n",
"\n",
"For those interested, the eventual full path for the dataset will be `<root_dir>/<schema>/<owner_type>/<owner>/<relative_path>`. Naturally, the `relative_path` you select cannot already be taken by another dataset (an error will be raised in this case), and any manually specified `relative_path` cannot start with `.gen_paths` as this directory is reserved for autogenerated `relative_path`s.\n",
"\n",
"When you overwrite a previous dataset entry using the `replace()` function, the original `relative_path` at registration (automatically generated or manual) will be used.\n",
"\n",
"One can construct the full absolute path to a dataregistry file using the `get_dataset_absolute_path()` helper function, e.g.,"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f193290d-892a-47b6-8e91-8e94a10d506f",
"metadata": {},
"outputs": [],
"source": [
"# Find the full absolute path to a dataset using the dataset id\n",
"absolute_path = datareg.Query.get_dataset_absolute_path(dataset_id)\n",
"\n",
"When you overwrite a previous dataset entry using the `replace()` function, the original `relative_path` at registration (automatically generated or manual) will be used."
"print(f\"The absolute path for {dataset_id} is {absolute_path}\""
]
},
{

0 comments on commit 43ac2fd
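
The path rules stated in the updated notebook text can be illustrated with a small standalone sketch. It does not call `dataregistry`. The helper names (`default_relative_path`, `check_manual_relative_path`, `full_dataset_path`) and the example root directory, schema, and owner values are hypothetical, and testing only the first path component against `.gen_paths` is an assumed reading of "cannot start with `.gen_paths`".

```python
from pathlib import PurePosixPath

# Directory the docs describe as reserved for autogenerated relative paths.
GEN_PATHS_DIR = ".gen_paths"


def default_relative_path(name: str, version: str) -> str:
    """Autogenerated form given in the docs: .gen_paths/<name>_<version>/"""
    return f"{GEN_PATHS_DIR}/{name}_{version}/"


def check_manual_relative_path(relative_path: str) -> None:
    """Raise ValueError if a manual relative_path uses the reserved directory.

    Assumption: "starts with" is interpreted as "first path component".
    """
    parts = PurePosixPath(relative_path).parts
    if parts and parts[0] == GEN_PATHS_DIR:
        raise ValueError(
            f"relative_path cannot start with {GEN_PATHS_DIR!r}: {relative_path!r}"
        )


def full_dataset_path(
    root_dir: str, schema: str, owner_type: str, owner: str, relative_path: str
) -> str:
    """Layout from the docs: <root_dir>/<schema>/<owner_type>/<owner>/<relative_path>"""
    return str(PurePosixPath(root_dir, schema, owner_type, owner, relative_path))


# Hypothetical values, for illustration only.
manual_path = "nersc_tutorial/my_desc_dataset"
check_manual_relative_path(manual_path)  # passes silently
print(full_dataset_path("/root", "myschema", "user", "alice", manual_path))
print(default_relative_path("mydataset", "1.0.0"))
```

For a registered dataset, the `get_dataset_absolute_path()` helper shown in the diff is the way to obtain this path from a dataset id; the sketch only mirrors the layout described in the text.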
