zylklab/alfresco-export-scripts

Alfresco Export Scripts

Shell scripts for extracting users, groups, sites, data and metadata from an Alfresco repository. Extracting metadata requires deploying a webscript in the Alfresco repository.

Installation

Running the shell scripts requires the curl, wget, sed and jq utilities on the command line. To use metadata extraction, deploy all webscript files under /Data Dictionary/Web Scripts/net/zylk and then refresh the webscripts from the /alfresco/service/index page.
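A quick way to confirm the prerequisites before running anything is a small check like the one below. This is only a sketch and not part of the repository; the function name missing_tools is hypothetical.

```shell
#!/bin/bash
# Print every tool from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of the repository.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report which of the utilities named above are absent, if any.
missing=$(missing_tools curl wget sed jq)
if [ -n "$missing" ]; then
  echo "Missing utilities:" $missing >&2
fi
```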

Environment vars

Originally, each shell script received its parameters on the command line (the -e, -u and -p options). To make execution easier, the exportENVARS.sh script holds these settings and can be adapted to your environment. Every script invokes it.

$ cat exportENVARS.sh

#! /bin/bash
export ALFURL=http://localhost:8080/alfresco
export MYUSER=admin
export MYPASS=secret
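Because every script depends on these three variables, an empty or missing value tends to surface only later as a failed request. A minimal guard, assuming the variable names shown above (the function name check_envars is hypothetical), might look like this:

```shell
#!/bin/bash
# Return non-zero and name the first variable that is unset or empty.
# check_envars is a hypothetical helper; only the variable names come
# from exportENVARS.sh.
check_envars() {
  for name in ALFURL MYUSER MYPASS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "exportENVARS.sh: $name is not set" >&2
      return 1
    fi
  done
  return 0
}

# Possible use at the top of a script:
#   . ./exportENVARS.sh && check_envars || exit 1
```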

Bulk Export Scripts

The following two scripts (downloadSite.sh and getMetadata.sh) extract Alfresco documents and their corresponding metadata from the repository. For getMetadata.sh to run properly, the export-bulk-metadata webscript must be deployed on the Alfresco server.

Note: The Alfresco Bulk Export Module is probably a better approach, but it only works with Alfresco 4.2 and above (JDK 7 needed).

downloadSite.sh

It downloads a site (-s) or a given repository folder (-f) with wget over WebDAV:

$ ./downloadSite.sh -h
Usage: ./downloadAlfrescoSite.sh [-s <site-shortname>] | [-f <folder>]

For downloading the example site in Alfresco (Web Site Design Project):

$ ./downloadSite.sh -s swsdp
├── webdav
│   └── Sitios
│       └── swsdp
│           └── documentLibrary
│               ├── Agency Files
│               │   ├── Contracts
│               │   │   └── Project Contract.pdf
│               │   ├── Images
│               │   │   ├── coins.JPG
│               │   │   ├── graph.JPG
│               │   │   ├── grass.jpg
│               │   │   ├── header.png
│               │   │   ├── low consumption bulb.png
│               │   │   ├── money.JPG
│               │   │   ├── plugs.jpg
│               │   │   ├── turbine.JPG
│               │   │   ├── windmill.png
│               │   │   ├── wind turbine.JPG
│               │   │   └── wires.JPG
│               │   ├── Logo Files
│               │   │   ├── GE Logo.png
│               │   │   └── logo.png
│               │   ├── Mock-Ups
│               │   │   ├── sample 1.png
│               │   │   ├── sample 2.png
│               │   │   └── sample 3.png
│               │   └── Video Files
│               │       └── WebSiteReview.mp4
│               ├── Budget Files
│               │   ├── budget.xls
│               │   └── Invoices
│               │       ├── inv I200-109.png
│               │       └── inv I200-189.png
│               ├── Meeting Notes
│               │   ├── Meeting Notes 2011-01-27.doc
│               │   ├── Meeting Notes 2011-02-03.doc
│               │   └── Meeting Notes 2011-02-10.doc
│               └── Presentations
│                   ├── Project Objectives.ppt
│                   └── Project Overview.ppt
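The layout above can be reproduced by hand with a recursive wget call against the WebDAV endpoint. The sketch below is an illustration under assumptions, not the code of downloadSite.sh: it assumes the /webdav path under ALFURL and a Sites folder whose name depends on the server locale ("Sitios" in the listing). The function names are hypothetical.

```shell
#!/bin/bash
# Build the WebDAV URL of a site's document library.
# $1 = site short name, $2 = name of the Sites folder (default: Sites).
# Assumption: WebDAV is served at $ALFURL/webdav.
site_webdav_url() {
  printf '%s/webdav/%s/%s/documentLibrary/\n' "${ALFURL%/}" "${2:-Sites}" "$1"
}

# Mirror the document library into ./webdav/... (defined but not run here).
# --cut-dirs=1 drops the leading "alfresco" path component and
# --level=inf lifts wget's default recursion depth of 5.
mirror_site() {
  wget --recursive --level=inf --no-parent \
       --no-host-directories --cut-dirs=1 \
       --reject 'index.html*' \
       --user="$MYUSER" --password="$MYPASS" \
       "$(site_webdav_url "$1" "$2")"
}
```

With the environment from exportENVARS.sh loaded, `mirror_site swsdp Sitios` would be the hand-written counterpart of `./downloadSite.sh -s swsdp`.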

getMetadata.sh

It generates the metadata files (needed for a bulk import) for a previously downloaded site or folder.

$ ./getMetadata.sh -h
Usage: ./getMetadata.sh [-f <local-webdav-folder>]

$ ./getMetadata.sh -f webdav

This generates the corresponding metadata.properties.xml file for each document and folder:

├── webdav
│   ├── Sitios
│   │   ├── swsdp
│   │   │   ├── documentLibrary
│   │   │   │   ├── Agency Files
│   │   │   │   │   ├── Contracts
│   │   │   │   │   │   ├── Project Contract.pdf
│   │   │   │   │   │   └── Project Contract.pdf.metadata.properties.xml
│   │   │   │   │   ├── Contracts.metadata.properties.xml
│   │   │   │   │   ├── Images
│   │   │   │   │   │   ├── coins.JPG
│   │   │   │   │   │   ├── coins.JPG.metadata.properties.xml
│   │   │   │   │   │   ├── graph.JPG
│   │   │   │   │   │   ├── graph.JPG.metadata.properties.xml
│   │   │   │   │   │   ├── grass.jpg
│   │   │   │   │   │   ├── grass.jpg.metadata.properties.xml
│   │   │   │   │   │   ├── header.png
│   │   │   │   │   │   ├── header.png.metadata.properties.xml
│   │   │   │   │   │   ├── low consumption bulb.png
│   │   │   │   │   │   ├── low consumption bulb.png.metadata.properties.xml
│   │   │   │   │   │   ├── money.JPG
│   │   │   │   │   │   ├── money.JPG.metadata.properties.xml
│   │   │   │   │   │   ├── plugs.jpg
│   │   │   │   │   │   ├── plugs.jpg.metadata.properties.xml
│   │   │   │   │   │   ├── turbine.JPG
│   │   │   │   │   │   ├── turbine.JPG.metadata.properties.xml
│   │   │   │   │   │   ├── windmill.png
│   │   │   │   │   │   ├── windmill.png.metadata.properties.xml
│   │   │   │   │   │   ├── wind turbine.JPG
│   │   │   │   │   │   ├── wind turbine.JPG.metadata.properties.xml
│   │   │   │   │   │   ├── wires.JPG
│   │   │   │   │   │   └── wires.JPG.metadata.properties.xml
│   │   │   │   │   ├── Images.metadata.properties.xml
│   │   │   │   │   ├── Logo Files
│   │   │   │   │   │   ├── GE Logo.png
│   │   │   │   │   │   ├── GE Logo.png.metadata.properties.xml
│   │   │   │   │   │   ├── logo.png
│   │   │   │   │   │   └── logo.png.metadata.properties.xml
│   │   │   │   │   ├── Logo Files.metadata.properties.xml
│   │   │   │   │   ├── Mock-Ups
│   │   │   │   │   │   ├── sample 1.png
│   │   │   │   │   │   ├── sample 1.png.metadata.properties.xml
│   │   │   │   │   │   ├── sample 2.png
│   │   │   │   │   │   ├── sample 2.png.metadata.properties.xml
│   │   │   │   │   │   ├── sample 3.png
│   │   │   │   │   │   └── sample 3.png.metadata.properties.xml
│   │   │   │   │   ├── Mock-Ups.metadata.properties.xml
│   │   │   │   │   ├── Video Files
│   │   │   │   │   │   ├── WebSiteReview.mp4
│   │   │   │   │   │   └── WebSiteReview.mp4.metadata.properties.xml
│   │   │   │   │   └── Video Files.metadata.properties.xml
│   │   │   │   ├── Agency Files.metadata.properties.xml
│   │   │   │   ├── Budget Files
│   │   │   │   │   ├── budget.xls
│   │   │   │   │   ├── budget.xls.metadata.properties.xml
│   │   │   │   │   ├── Invoices
│   │   │   │   │   │   ├── inv I200-109.png
│   │   │   │   │   │   ├── inv I200-109.png.metadata.properties.xml
│   │   │   │   │   │   ├── inv I200-189.png
│   │   │   │   │   │   └── inv I200-189.png.metadata.properties.xml
│   │   │   │   │   └── Invoices.metadata.properties.xml
│   │   │   │   ├── Budget Files.metadata.properties.xml
│   │   │   │   ├── Meeting Notes
│   │   │   │   │   ├── Meeting Notes 2011-01-27.doc
│   │   │   │   │   ├── Meeting Notes 2011-01-27.doc.metadata.properties.xml
│   │   │   │   │   ├── Meeting Notes 2011-02-03.doc
│   │   │   │   │   ├── Meeting Notes 2011-02-03.doc.metadata.properties.xml
│   │   │   │   │   ├── Meeting Notes 2011-02-10.doc
│   │   │   │   │   └── Meeting Notes 2011-02-10.doc.metadata.properties.xml
│   │   │   │   ├── Meeting Notes.metadata.properties.xml
│   │   │   │   ├── Presentations
│   │   │   │   │   ├── Project Objectives.ppt
│   │   │   │   │   ├── Project Objectives.ppt.metadata.properties.xml
│   │   │   │   │   ├── Project Overview.ppt
│   │   │   │   │   └── Project Overview.ppt.metadata.properties.xml
│   │   │   │   └── Presentations.metadata.properties.xml
│   │   │   └── documentLibrary.metadata.properties.xml
│   │   └── swsdp.metadata.properties.xml
│   └── Sitios.metadata.properties.xml
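The naming convention in the listing (each file or folder gets a sibling named with the .metadata.properties.xml suffix) makes it easy to check that a run was complete. The following sketch only inspects the local tree and does not contact Alfresco; the function names are hypothetical.

```shell
#!/bin/bash
# List the metadata file expected for every file and folder below a local
# download root, ignoring metadata files that are already there.
expected_sidecars() {
  find "$1" -mindepth 1 ! -name '*.metadata.properties.xml' |
    while IFS= read -r path; do
      printf '%s.metadata.properties.xml\n' "$path"
    done
}

# Print the expected metadata files that do not exist yet.
missing_sidecars() {
  expected_sidecars "$1" |
    while IFS= read -r sidecar; do
      [ -e "$sidecar" ] || printf '%s\n' "$sidecar"
    done
}
```

Running `missing_sidecars webdav` after `./getMetadata.sh -f webdav` should print nothing if every document and folder received its metadata file.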

The following webscript files must be deployed in Alfresco under /Data Dictionary/Web Scripts/net/zylk:

  • export-bulk-metadata.get.desc.xml
  • export-bulk-metadata.get.js
  • export-bulk-metadata.get.text.ftl

Other helper scripts

These helper scripts are examples based on the blog post Alfresco REST API examples using curl and jq.

Note: Alfresco Shell Tools takes a similar approach.

getPeople.sh

It provides a complete list of the users in the Alfresco repository. With the -f option it adds first name, surname and email.

$ ./getPeople.sh -h
Usage: ./getPeople.sh [-f]

$ ./getPeople.sh
guest
admin
abeecher
mjackson

$ ./getPeople.sh -f
guest,Guest,,
admin,Administrator,,[email protected]
abeecher,Alice,Beecher,[email protected]
mjackson,Mike,Jackson,[email protected]
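The curl and jq combination behind this kind of output can be sketched as follows. This is an illustration rather than the code of getPeople.sh: it assumes the JSON has a top-level "people" array with userName, firstName, lastName and email fields, as returned by the repository's /service/api/people webscript. The function name people_csv is hypothetical.

```shell
#!/bin/bash
# Turn a people listing (JSON on stdin) into
# userName,firstName,lastName,email lines; null fields become empty.
# Assumption: a top-level "people" array with these field names.
people_csv() {
  jq -r '.people[]
         | [.userName, .firstName, .lastName, .email]
         | map(. // "")
         | join(",")'
}

# Possible use, with the variables from exportENVARS.sh:
#   curl -s -u "$MYUSER:$MYPASS" "$ALFURL/service/api/people" | people_csv
```

The fields are joined as-is, so a value that itself contains a comma would need quoting before the output is treated as CSV.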

getGroups.sh

It gives the list of repository groups. With the -f option you obtain additional information.

$ ./getGroups.sh -h
Usage: ./getGroups.sh [-f]

$ ./getGroups.sh
ALFRESCO_ADMINISTRATORS
ALFRESCO_MODEL_ADMINISTRATORS
ALFRESCO_SEARCH_ADMINISTRATORS
EMAIL_CONTRIBUTORS
SITE_ADMINISTRATORS
site_swsdp
site_swsdp_SiteCollaborator
site_swsdp_SiteConsumer
site_swsdp_SiteContributor
site_swsdp_SiteManager

getSites.sh

It lists the short names of the sites. With the -f option you also get the visibility and the title of each site.

$ ./getSites.sh -h
Usage: ./getSites.sh [-f]

$ ./getSites.sh
swsdp

$ ./getSites.sh -f
swsdp,PUBLIC,Sample: Web Site Design Project
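Since the plain output is one short name per line, it can drive other scripts that take a site as an argument. The helper below is a hypothetical sketch of that chaining, not something shipped with the repository.

```shell
#!/bin/bash
# Run the given command once per non-empty line read from stdin,
# appending the line as the last argument.
for_each_line() {
  while IFS= read -r item; do
    if [ -n "$item" ]; then
      # Redirect stdin so the command cannot consume the remaining lines.
      "$@" "$item" </dev/null
    fi
  done
}

# Possible use: download every site in the repository.
#   ./getSites.sh | for_each_line ./downloadSite.sh -s
```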

getSiteMemberships.sh

It provides the list of users and roles of a given site (-s <site>).

$ ./getSiteMemberships.sh -h
Usage: ./getAlfrescoSiteMemberships.sh [-f | -s <site>]

$ ./getSiteMemberships.sh -s swsdp
swsdp,mjackson,SiteManager
swsdp,admin,SiteManager
swsdp,abeecher,SiteCollaborator

With the -f option you obtain the full list of users and roles for every site in the Alfresco repository.

getUserGroups.sh

$ ./getUserGroups.sh -h
Usage: ./getUserGroups.sh [-f] [-a user]

It provides the groups of a given user (-a user).

$ ./getUserGroups.sh -a admin
GROUP_ALFRESCO_ADMINISTRATORS
GROUP_ALFRESCO_MODEL_ADMINISTRATORS
GROUP_ALFRESCO_SEARCH_ADMINISTRATORS
GROUP_EMAIL_CONTRIBUTORS
GROUP_SITE_ADMINISTRATORS

getAuthority.sh

It provides the users and groups of a given group (-g <group>). With the -f option you obtain further details.

$ ./getAuthority.sh -h
Usage: ./getAuthority.sh [-f] [-g <group>]

$ ./getAuthority.sh -g ALFRESCO_ADMINISTRATORS
admin

$ ./getAuthority.sh -g ALFRESCO_ADMINISTRATORS -f
admin,Administrator,Administrator,USER

More Download Scripts

downloadDoc.sh

It downloads a document given its Alfresco uuid and a filename.

$ ./downloadDoc.sh -h
Usage: ./downloadDoc.sh [-d uuid] [-n name]

downloadList.sh

It downloads the files listed in the result set of a webscript.

$ ./downloadList.sh

A second webscript must be deployed in Alfresco under /Data Dictionary/Web Scripts/net/zylk:

  • get-download-list.get.desc.xml
  • get-download-list.get.js
  • get-download-list.get.text.ftl

The webscript returns the list of files tagged "critical", but it may be customized for any alfresco-fts query:

workspace://SpacesStore/5515d3e1-bb2a-42ed-833c-52802a367033;Sitios/swsdp/documentLibrary/Presentations;Project Objectives.ppt
workspace://SpacesStore/99cb2789-f67e-41ff-bea9-505c138a6b23;Sitios/swsdp/documentLibrary/Presentations;Project Overview.ppt
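Each line of that result has three ';'-separated fields: noderef, folder path and file name. A sketch of how such a list could be turned into the arguments of downloadDoc.sh follows; it assumes the uuid is the last segment of the noderef and is not taken from downloadList.sh itself. The function name is hypothetical.

```shell
#!/bin/bash
# Read "noderef;path;name" lines on stdin and print "uuid<TAB>name" pairs.
# Assumption: the uuid is the last path segment of the noderef.
list_to_uuid_name() {
  while IFS=';' read -r noderef path name; do
    [ -n "$noderef" ] || continue
    printf '%s\t%s\n' "${noderef##*/}" "$name"
  done
}

# Possible use, feeding each pair to downloadDoc.sh:
#   list_to_uuid_name < list.txt |
#     while IFS="$(printf '\t')" read -r uuid name; do
#       ./downloadDoc.sh -d "$uuid" -n "$name" </dev/null
#     done
```

Here list.txt stands for a saved copy of the webscript output; the file name is only an example.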

Permission Scripts

getPerms.sh

It provides the local permission template under a root path (given by a noderef). If no user is given, it provides a full list of local permissions.

$ ./getPerms.sh <noderef> [user]

$ ./getPerms.sh workspace://SpacesStore/8f2105b4-daaf-4874-9e8a-2152569d109b

   /Company Home/Sites/swsdp/documentLibrary/Budget Files;mjackson;SiteManager;workspace://SpacesStore/8ab12916-4897-47fb-94eb-1ab699822ecb
   /Company Home/Sites/swsdp/documentLibrary/Budget Files;abeecher;SiteCollaborator;workspace://SpacesStore/8ab12916-4897-47fb-94eb-1ab699822ecb
   /Company Home/Sites/swsdp/documentLibrary/Agency Files;mjackson;SiteManager;workspace://SpacesStore/8bb36efb-c26d-4d2b-9199-ab6922f53c28
   /Company Home/Sites/swsdp/documentLibrary/Agency Files;abeecher;SiteCollaborator;workspace://SpacesStore/8bb36efb-c26d-4d2b-9199-ab6922f53c28
   /Company Home/Sites/swsdp/documentLibrary/Meeting Notes;mjackson;SiteManager;workspace://SpacesStore/a211774d-ba6d-4a35-b97f-dacfaac7bde3
   /Company Home/Sites/swsdp/documentLibrary/Presentations;abeecher;SiteCollaborator;workspace://SpacesStore/38745585-816a-403f-8005-0a55c0aec813

The aim of this script is to obtain a permissions template for a new user, based on an existing user. For example, to give user johndoe exactly the same permissions as abeecher, take the following output and replace abeecher with johndoe to generate the input for the setPerms.sh script.

$ ./getPerms.sh workspace://SpacesStore/8f2105b4-daaf-4874-9e8a-2152569d109b abeecher
   /Company Home/Sites/swsdp/documentLibrary/Budget Files;abeecher;SiteCollaborator;workspace://SpacesStore/8ab12916-4897-47fb-94eb-1ab699822ecb
   /Company Home/Sites/swsdp/documentLibrary/Agency Files;abeecher;SiteCollaborator;workspace://SpacesStore/8bb36efb-c26d-4d2b-9199-ab6922f53c28
   /Company Home/Sites/swsdp/documentLibrary/Presentations;abeecher;SiteCollaborator;workspace://SpacesStore/38745585-816a-403f-8005-0a55c0aec813

This script may be useful for complex (local) permissions maps.
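The user substitution described above can be done on the second field of the ';'-separated template, so that paths or noderefs containing the same text are left alone. The helper below is a hypothetical sketch, not part of the repository.

```shell
#!/bin/bash
# Rewrite the user column (field 2 of the ';'-separated template) from one
# user to another, leaving path, role and noderef untouched.
# $1 = user to replace, $2 = new user.
retarget_perms() {
  awk -F';' -v OFS=';' -v from="$1" -v to="$2" \
      '$2 == from { $2 = to } { print }'
}

# Possible use (johndoe.perms is an example file name):
#   ./getPerms.sh <noderef> abeecher | retarget_perms abeecher johndoe > johndoe.perms
#   ./setPerms.sh johndoe.perms
```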

getFolders.sh

It provides a folder permission template under a root path (given by a noderef): a complete list of the folders under the given node, normally used when permissions are being set.

$ ./getFolders.sh <noderef> [user] [role]

$ ./getFolders.sh workspace://SpacesStore/8f2105b4-daaf-4874-9e8a-2152569d109b

   /Company Home/Sites/swsdp/documentLibrary/Budget Files;johndoe;SiteManager;workspace://SpacesStore/8ab12916-4897-47fb-94eb-1ab699822ecb
   /Company Home/Sites/swsdp/documentLibrary/Agency Files;johndoe;SiteManager;workspace://SpacesStore/8bb36efb-c26d-4d2b-9199-ab6922f53c28
   /Company Home/Sites/swsdp/documentLibrary/Meeting Notes;johndoe;SiteManager;workspace://SpacesStore/a211774d-ba6d-4a35-b97f-dacfaac7bde3
   /Company Home/Sites/swsdp/documentLibrary/Presentations;johndoe;SiteManager;workspace://SpacesStore/38745585-816a-403f-8005-0a55c0aec813
   /Company Home/Sites/swsdp/documentLibrary/Agency Files/Contracts;johndoe;SiteManager;workspace://SpacesStore/e0856836-ed5e-4eee-b8e5-bd7e8fb9384c
   /Company Home/Sites/swsdp/documentLibrary/Agency Files/Images;johndoe;SiteManager;workspace://SpacesStore/880a0f47-31b1-4101-b20b-4d325e54e8b1
   /Company Home/Sites/swsdp/documentLibrary/Agency Files/Logo Files;johndoe;SiteManager;workspace://SpacesStore/b1a98357-4f7a-470d-bf4c-327501158689
   /Company Home/Sites/swsdp/documentLibrary/Agency Files/Mock-Ups;johndoe;SiteManager;workspace://SpacesStore/610771be-4d82-479a-a2d7-796adf498084
   /Company Home/Sites/swsdp/documentLibrary/Agency Files/Video Files;johndoe;SiteManager;workspace://SpacesStore/1d26e465-dea3-42f3-b415-faa8364b9692
   /Company Home/Sites/swsdp/documentLibrary/Budget Files/Invoices;johndoe;SiteManager;workspace://SpacesStore/d56afdc3-0174-4f8c-bce8-977cafd712ab

If a user or role is not provided, johndoe and SiteManager are used.

setPerms.sh

It sets local permissions from a permission template file. The template file is the one produced by the getPerms.sh or getFolders.sh scripts.

$ ./setPerms.sh <permissions-file>

Three new webscripts must be deployed in Alfresco under /Data Dictionary/Web Scripts/net/zylk:

  • get-folder-perms.get.desc.xml
  • get-folder-perms.get.js
  • get-folder-perms.get.text.ftl
  • get-perms.get.desc.xml
  • get-perms.get.js
  • get-perms.get.text.ftl
  • set-perm.get.desc.xml
  • set-perm.get.js
  • set-perm.get.text.ftl

Tested on

  • Alfresco Enterprise 4.1.1
  • Alfresco Enterprise 5.2.3
  • Alfresco Community 201707GA
  • Alfresco Community 201911GA

Known Limitations

  • Not able to download versions of documents via downloadSite.sh script.
  • Not able to download documents via webdav when Kerberos or NTLM SSO is enabled.
  • When dealing with self-signed SSL certificates, use the -k option in curl commands or --no-check-certificate in wget commands.
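For the self-signed certificate case, one way to avoid editing each command is a single switch that yields the right option per tool. The sketch below assumes a hypothetical INSECURE variable, which the scripts themselves do not define.

```shell
#!/bin/bash
# Print the option that disables certificate verification for the given
# tool, but only when INSECURE=1 (a hypothetical switch).
tls_flag() {
  [ "${INSECURE:-0}" = "1" ] || return 0
  case "$1" in
    curl) printf '%s\n' '-k' ;;
    wget) printf '%s\n' '--no-check-certificate' ;;
  esac
}

# Possible use:
#   curl -s $(tls_flag curl) -u "$MYUSER:$MYPASS" "$ALFURL/service/api/people"
```

Disabling verification is only reasonable against a server you control, since it removes the protection that the certificate check provides.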

History

  • 202212 - Permissions scripts and webscripts.
  • 202209 - Maxlevel option for crawling and several encoding adjustments. Thanks to Romain Brochot.
  • 202206 - Fixing encoding functions for solving special character path issues
  • 202201 - Added download tagged doc list feature via webscript
  • 201808 - Initial release
