YDA-5829: add documentation for publication troubleshooting tool
Co-authored-by: Lazlo Westerhof <[email protected]> Co-authored-by: claravox <[email protected]>
1 parent a90635b, commit b4c3d36
Showing 2 changed files with 100 additions and 0 deletions.
docs/administration/troubleshooting-published-data-packages.md (99 additions, 0 deletions)
@@ -0,0 +1,99 @@
---
parent: Administration Tasks
title: Troubleshooting published data packages
nav_order: 21
---
# How to troubleshoot published data packages

This documentation explains how to diagnose issues with published data packages using the troubleshooting tool. The tool runs a series of checks to verify the integrity and compliance of each data package. Its scope covers both data packages that were published successfully and those that started the publication process but did not succeed. Specifically, it targets data packages whose Attribute-Value Units (AVUs) include an `org_publication_status` of `OK`, `Retry`, `Unrecoverable`, or `Unknown`. Note that the `org_` prefix is defined by the constant `UUORGMETADATAPREFIX` in `constants.py` in the ruleset.

Alternatively, the tool can diagnose a specific data package when provided with its name.

**Requirements:**
- Python 3 or later
- Yoda version 1.10 or later
- Script must be run as rodsadmin user

## Check Steps

The tool performs the following checks:

### Metadata Schema Conformance

This step verifies that the metadata of the data package conforms to the associated schema.

### System AVUs Verification

This step checks whether the data package has the expected system Attribute-Value Units (AVUs). It compares the AVUs whose names start with `org_publication` against the expected AVU keys (the ground truth). Any missing or unexpected AVUs are reported in the terminal and in the log file.
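
The comparison described above amounts to two set differences. The following is a minimal sketch, not the tool's actual code: the function name `compare_avus` and the two `exampleKey` entries are hypothetical placeholders, and only `org_publication_status` is taken from this documentation.

```python
# Hypothetical ground truth; the real list of expected keys lives in the tool.
EXPECTED_KEYS = {
    "org_publication_status",       # named in this documentation
    "org_publication_exampleKeyA",  # placeholder, not a real Yoda AVU
    "org_publication_exampleKeyB",  # placeholder, not a real Yoda AVU
}


def compare_avus(actual_keys, expected_keys=EXPECTED_KEYS, prefix="org_publication"):
    """Return (missing, unexpected) AVU keys among those starting with prefix."""
    relevant = {key for key in actual_keys if key.startswith(prefix)}
    missing = set(expected_keys) - relevant
    unexpected = relevant - set(expected_keys)
    return missing, unexpected


missing, unexpected = compare_avus([
    "org_publication_status",
    "org_publication_exampleKeyA",
    "org_publication_extra",  # starts with the prefix but is not expected
    "org_other",              # ignored: does not start with the prefix
])
print("Missing:", sorted(missing))
print("Unexpected:", sorted(unexpected))
```

A package passes this kind of check only when both sets are empty.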

### DOI Registration Status

This step checks the registration status of both `versionDOI` (if available) and `baseDOI`. It retrieves the DOIs from the package's metadata AVUs and sends requests to the DataCite API to verify that these DOIs are registered.
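
As an illustration of this kind of lookup, the sketch below asks DataCite whether a DOI resolves. It assumes the public DataCite REST endpoint `https://api.datacite.org/dois/` and treats HTTP 200 as registered and 404 as not registered; the tool itself may use a different endpoint or credentials. The function names are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: public DataCite REST API; the tool may be configured differently.
DATACITE_API = "https://api.datacite.org/dois/"


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def doi_is_registered(doi, get_status=http_status):
    """Return True if the lookup answers 200, False if it answers 404."""
    status = get_status(DATACITE_API + urllib.parse.quote(doi))
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError("Unexpected status {} for DOI {}".format(status, doi))


# Example call (needs network access; the DOI is a made-up placeholder):
# doi_is_registered("10.1234/example-doi")
```

Passing `get_status` as a parameter keeps the decision logic testable without a network connection.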

### Landing Page Integrity

This step compares the contents of the local landing page file with the remote landing page to ensure they match. It downloads the HTML of the data package's landing page and compares it with the local HTML file. If there is no internet connection, enable `offline` mode. In offline mode, this step only checks that the local landing page file exists; it does not verify that its content is correct.
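
The online and offline behaviour can be sketched as follows. This is an illustration under stated assumptions: a regular file stands in for the iRODS data object that holds the landing page, `fetch_remote` is a hypothetical callable that returns the remote page as bytes, and `check_landing_page` is not the tool's real function name.

```python
import os


def check_landing_page(local_path, fetch_remote, offline=False):
    """Return True if the landing page check passes.

    Offline: only the existence of the local file is checked.
    Online: the local bytes must equal the bytes returned by fetch_remote().
    """
    if not os.path.isfile(local_path):
        return False
    if offline:
        return True  # content is deliberately not verified in offline mode
    with open(local_path, "rb") as handle:
        local_content = handle.read()
    return local_content == fetch_remote()
```

Note that a pass in offline mode is weaker than a pass in online mode, because a stale or corrupted local file would go unnoticed.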

### Combined JSON Integrity

This step checks the integrity of the combined JSON file. Online, it verifies that the metadata JSON sent to the OAI-PMH server can be found in the OAI-PMH repository. In offline mode, it only checks whether the package's `-combi.json` file exists locally.

## Commands Execution Guide

The tool can be used with various options, as detailed below. Make sure you are logged in as a rodsadmin user, so that you have the necessary permissions, and navigate to the `yoda-ruleset/tools` directory before running any commands, for example:

```bash
cd /etc/irods/yoda-ruleset/tools
```

### 1. General Check

To perform checks on all published data packages:

```bash
python3 troubleshoot-published-data.py
```

### 2. Specific Package Check

To inspect a single data package:

```bash
python3 troubleshoot-published-data.py -p <package-name>
```

The package can be specified either by its short name (the name of the folder that you see in the vault), for example `research-core-0[1722266819]`, or by the path to the package, for example `vault-core-0/research-core-0[1722266819]`. If the package short name contains spaces, the package name must be enclosed in quotes.
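
When the command is assembled by another script, quoting can be left to the standard library. The sketch below is only an illustration; `build_command` is a hypothetical helper, and `shlex.join` requires Python 3.8 or later. Passing the argument list directly to `subprocess.run` avoids shell quoting altogether, while `shlex.join` shows the equivalent command line for a shell.

```python
import shlex


def build_command(package, script="troubleshoot-published-data.py"):
    """Return the argument list for checking a single package."""
    return ["python3", script, "-p", package]


args = build_command("research-core-0[1722266819]")
# shlex.join quotes any argument containing characters outside its safe set,
# which includes spaces and square brackets.
print(shlex.join(args))
# To run it without involving a shell: subprocess.run(args, check=True)
```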

### 3. Log results and offline mode

By default, the results are printed to the terminal (stdout). To also save the detailed output to a log file, run:

```bash
python3 troubleshoot-published-data.py -l
```

- The `-l` option enables logging mode, which saves the log to `/var/lib/irods/log/troubleshoot_publications.log`.
- The `-o` option enables offline mode, which skips several tests that connect to remote servers but does not skip the DataCite test. This is useful when testing in a local development environment.
- The `-n` option enables no-DataCite mode, which skips the DataCite checks. This is also useful when testing in a local development environment.
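
How the options combine can be summarised in a small sketch. The short flags `-p`, `-l`, `-o` and `-n` come from this documentation; the parser below, the `dest` names, and the `planned_checks` helper are hypothetical and do not reproduce the tool's own argument handling.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented short options."""
    parser = argparse.ArgumentParser(description="Troubleshoot published data packages")
    parser.add_argument("-p", dest="package", metavar="PACKAGE", default=None,
                        help="check a single package instead of all packages")
    parser.add_argument("-l", dest="log_file", action="store_true",
                        help="also write results to the log file")
    parser.add_argument("-o", dest="offline", action="store_true",
                        help="offline mode: remote comparisons become local checks")
    parser.add_argument("-n", dest="no_datacite", action="store_true",
                        help="skip the DataCite checks")
    return parser


def planned_checks(args):
    """Map each check to the mode it would run in for the given options."""
    remote_mode = "local existence only" if args.offline else "full"
    return {
        "schema": "full",
        "avus": "full",
        "doi": "skipped" if args.no_datacite else "full",  # unaffected by -o
        "landing_page": remote_mode,
        "combined_json": remote_mode,
    }


args = build_parser().parse_args(["-o", "-n"])
for check, mode in planned_checks(args).items():
    print(check, "->", mode)
```

The sketch makes one point explicit: offline mode alone still contacts DataCite, so a fully local run needs both `-o` and `-n`.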

## Example output

When a single data package is checked, the terminal output lists the successful and failed checks as follows:

```
Troubleshooting data package: /tempZone/home/vault-core-0/research-core-0[1722266819]
compare_local_remote_landingpage: File contents at irods path </tempZone/yoda/publication/JCY2C2.html> and remote landing page <https://public.yoda.test/allinone/UU01/JCY2C2.html> do not match.
Results for: /tempZone/home/vault-core-0/research-core-0[1722266819]
Package FAILED one or more tests:
Schema matches: True
All expected AVUs exist: True
No unexpected AVUs: True
Version DOI matches: False
Base DOI matches: False
Landing page matches: False
Combined JSON matches: True
```

When multiple data packages are checked, the results for each package are printed consecutively in the terminal, which gives an overview of the results across packages.
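
When many packages are checked, it can help to extract the failed checks from the output programmatically. The sketch below assumes the `<check name>: True|False` line format shown in the example output; `parse_results` is a hypothetical helper and not part of the tool.

```python
def parse_results(output):
    """Collect '<check name>: True|False' lines into a dict of booleans."""
    results = {}
    for line in output.splitlines():
        name, separator, value = line.rpartition(": ")
        value = value.strip()
        # Lines without a boolean value (headers, error messages) are ignored.
        if separator and value in ("True", "False"):
            results[name.strip()] = (value == "True")
    return results


sample = """Results for: /tempZone/home/vault-core-0/research-core-0[1722266819]
Package FAILED one or more tests:
Schema matches: True
All expected AVUs exist: True
No unexpected AVUs: True
Version DOI matches: False
Base DOI matches: False
Landing page matches: False
Combined JSON matches: True
"""

failed = [name for name, passed in parse_results(sample).items() if not passed]
print(failed)
```

For the sample above, the failed checks are the two DOI checks and the landing page check.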