Metadata data store improvements to customise folder naming #7711

Open
wants to merge 19 commits into
base: main

josegar74
Member

@josegar74 josegar74 commented Feb 7, 2024

By default, GeoNetwork uses the following folder structure to store the metadata files:

datadir
  |-{{sequence_folder}}
  |    |-{{metadata_id}}
  |    |    |-private
  |    |    |-public
  |    |        |--doc.pdf

This pull request makes the naming of the folders that store the metadata files customisable. For example, instead of the internal metadata id, folders can be named after the metadata uuid or the metadata resource identifier.

datadir
  |-{{metadata_resource_identifier}}
  |    |--doc.pdf

A custom folder structure lets the catalogue connect to an externally managed data store that has to follow its own naming convention. Quite often data are already well organised on the file system, so this eases interaction between the catalogue and the external data store and avoids duplicating files. Externally managed data stores also generally support uploading large data files.

Use case

For example, at EEA, all datasets are identified by a unique resource identifier, e.g. eea_v_4258_100_k_msfd-marine-regions_p_2010-2017_v01_r00, following this convention. All data files, reports and documentation about those datasets are published in a datastore with an internal and a public area (https://sdi.eea.europa.eu/webdav/datastore/public/). Connecting the catalogue to that storage makes it easy to reference files in metadata records.
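To make the use case concrete, here is a minimal sketch of how a link to a file in the public area could be derived from a resource identifier. It assumes that each dataset folder sits directly under the public base URL and is named after the identifier; that layout and the function name `public_file_url` are illustrative assumptions, not something this pull request defines.

```python
from urllib.parse import quote


def public_file_url(base_url: str, resource_identifier: str, relative_path: str) -> str:
    """Build a link to a file stored under <base_url>/<resource_identifier>/.

    Assumption: the dataset folder is named after the resource identifier
    and lives directly below the public base URL.
    """
    segments = [resource_identifier, *relative_path.split("/")]
    # Percent-encode each segment separately so "/" stays a path separator.
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return base_url.rstrip("/") + "/" + encoded


url = public_file_url(
    "https://sdi.eea.europa.eu/webdav/datastore/public/",
    "eea_v_4258_100_k_msfd-marine-regions_p_2010-2017_v01_r00",
    "doc.pdf",
)
print(url)
```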

Configuration in config.properties

  • datastore.folderStructureType. Values: DEFAULT (default value), CUSTOM

    • DEFAULT uses the default structure to store the metadata files
    • CUSTOM allows customising the naming of the folders that store the metadata files, for example using the metadata uuid or the metadata resource identifier instead of the internal metadata id.
  • datastore.folderPrivilegesStrategy. Values: DEFAULT (default value), NONE

    • DEFAULT uses the PRIVATE / PUBLIC subfolder structure
    • NONE stores all the files in the same metadata folder
  • datastore.folderStructure: folder structure to store published metadata. Example to use the metadata resource identifier: datastore/$.resourceIdentifier[0].code/

  • datastore.folderStructureFallback: folder structure to store published metadata when the criteria for the previous property don't match (in the previous example, when the metadata has no resource identifier, use the metadata uuid instead: datastore/$.uuid/)

  • datastore.folderStructureNonPublic: folder structure to store non-published metadata (if you don't want to store those files in the same folder structure as datastore.folderStructure)

  • datastore.folderStructureFallbackNonPublic: analogous to datastore.folderStructureFallback, for non-published metadata
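As a rough illustration, the properties above could be combined as follows in config.properties. The published-metadata values are the examples given in the list; the two NonPublic values (and the `datastore_internal` base folder) are hypothetical and only show the shape of the setting.

```properties
# Name folders from index fields instead of the internal metadata id
datastore.folderStructureType=CUSTOM
# Keep all files directly in the metadata folder (no private/public subfolders)
datastore.folderPrivilegesStrategy=NONE
# Published metadata: use the first resource identifier code ...
datastore.folderStructure=datastore/$.resourceIdentifier[0].code/
# ... and fall back to the metadata uuid when there is none
datastore.folderStructureFallback=datastore/$.uuid/
# Non-published metadata (hypothetical values, for illustration only)
datastore.folderStructureNonPublic=datastore_internal/$.resourceIdentifier[0].code/
datastore.folderStructureFallbackNonPublic=datastore_internal/$.uuid/
```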

The datastore.folderStructure... values support JSON Path to refer to Elasticsearch index properties (for example, the resource identifier, the metadata uuid, etc.).
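The resolution with fallback can be sketched as below. This is not the code from this pull request: it assumes that a folder structure is split on "/" and that any segment starting with "$." is a path expression, and it only handles a tiny JSONPath subset (dotted keys and numeric indexes). The function names and the sample index document are hypothetical.

```python
import re
from typing import Any, Optional

# Matches either a property name or a numeric index such as [0].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def eval_path(doc: Any, expr: str) -> Optional[str]:
    """Evaluate e.g. '$.resourceIdentifier[0].code'; None if anything is missing."""
    current = doc
    for key, index in _TOKEN.findall(expr[2:]):  # drop the leading "$."
        try:
            current = current[key] if key else current[int(index)]
        except (KeyError, IndexError, TypeError):
            return None
    if isinstance(current, (str, int)) and current != "":
        return str(current)
    return None


def resolve_template(template: str, doc: dict) -> Optional[str]:
    """Replace every '$.' segment by its value; None if one cannot be resolved."""
    parts = []
    for segment in template.strip("/").split("/"):
        if segment.startswith("$."):
            value = eval_path(doc, segment)
            if value is None:
                return None
            parts.append(value)
        else:
            parts.append(segment)
    return "/".join(parts)


def metadata_folder(doc: dict, structure: str, fallback: str) -> str:
    """Try the main folder structure first, then the fallback."""
    folder = resolve_template(structure, doc)
    if folder is None:
        folder = resolve_template(fallback, doc)
    if folder is None:
        raise ValueError("neither folder structure could be resolved")
    return folder


STRUCTURE = "datastore/$.resourceIdentifier[0].code/"
FALLBACK = "datastore/$.uuid/"

with_identifier = {"uuid": "abc-123", "resourceIdentifier": [{"code": "eea_v_4258"}]}
without_identifier = {"uuid": "abc-123"}

print(metadata_folder(with_identifier, STRUCTURE, FALLBACK))
print(metadata_folder(without_identifier, STRUCTURE, FALLBACK))
```

A real implementation would also need to reject values containing "/" or ".." before using them as folder names; the sketch leaves that out for brevity.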

Testing

Tested only with the Filesystem store.

Changes

  • Subfolder browsing: If the content of the datastore is also externally managed, editors can select files in subfolders:

image

  • No private/public icon when using a custom folder layout

image
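The two changes above can be pictured with a small sketch that lists the files an editor could pick from a metadata folder. It is an illustration under assumptions, not the store implementation from this pull request: with the DEFAULT privileges strategy it looks inside the public and private subfolders, with NONE it walks the whole metadata folder, subfolders included, and reports no visibility. The function name `list_resources` is hypothetical.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def list_resources(metadata_dir: Path, strategy: str = "DEFAULT") -> List[Tuple[str, Optional[str]]]:
    """Return (relative path, visibility) pairs for the files of one metadata record."""
    if strategy == "DEFAULT":
        # Files live under public/ and private/; the subfolder gives the visibility.
        roots = [(metadata_dir / name, name) for name in ("public", "private")]
    elif strategy == "NONE":
        # Single shared folder: there is no visibility to report.
        roots = [(metadata_dir, None)]
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    found = []
    for root, visibility in roots:
        if not root.is_dir():
            continue
        # rglob("*") also descends into subfolders of an externally managed store.
        for path in sorted(root.rglob("*")):
            if path.is_file():
                found.append((path.relative_to(root).as_posix(), visibility))
    return found
```

With the NONE strategy, a file stored as reports/2017/summary.pdf inside the metadata folder is returned with that relative path and no visibility, which matches the missing private/public icon.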

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

Funded by EEA

…names - support renaming of metadata folders and refactor code to calculate folder names
…names - configuration folder for draft metadata
…names - allow to use different base folders for public / non-public metadata
@josegar74 josegar74 added this to the 4.4.3 milestone Feb 7, 2024
@josegar74 josegar74 force-pushed the 44-datastoreimprovements branch from ed2b441 to 37ed44c Compare February 7, 2024 10:10
@josegar74 josegar74 force-pushed the 44-datastoreimprovements branch from 37ed44c to a47494f Compare February 7, 2024 10:14
@josegar74 josegar74 modified the milestones: 4.4.3, 4.4.4 Mar 4, 2024
@fxprunayre fxprunayre marked this pull request as ready for review March 7, 2024 11:59

sonarcloud bot commented Mar 7, 2024

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud


@fxprunayre fxprunayre removed this from the 4.4.4 milestone Apr 16, 2024
@fxprunayre fxprunayre added this to the 4.4.5 milestone Apr 16, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.5, 4.4.6 Jun 3, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.6, 4.4.7 Oct 15, 2024

CLAassistant commented Dec 8, 2024

CLA assistant check
All committers have signed the CLA.


3 participants