DP.LA Service Hub

This project provides a lightweight DP.LA aggregation feed and a command-line interface for ingesting bibliographic and metadata vocabularies such as MODS, Dublin Core, and MARC into an [RDF triplestore][BL] as BIBFRAME 2.0 linked data. This project is based on KnowledgeLinks.io's Catalog Pull Platform, using the RDF Framework and BIBCAT.

This project started as a pilot for the Colorado/Wyoming DP.LA service hub.

Setup

Clone or fork the project repository:

```
git clone https://github.com/KnowledgeLinks/dpla-service-hub.git
```

Initialize and update submodules:

```
cd dpla-service-hub/
git submodule init
git submodule update
```

Create an instance directory for configuration and custom RDF rules:
```
mkdir instance
cd instance/
touch config.py
```

Config.py options

To configure dpla-service-hub, add at least these variables to your config.py file:

SECRET_KEY - Random string of characters for seeding Flask
BASE_URL - Base URL to use for IRI minting, defaults to http://bibcat.org/
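
As a rough illustration of the two options above, a minimal `instance/config.py` might look like the sketch below. Both values are placeholders: the secret key shown is not a real key, and the `BASE_URL` value simply repeats the documented default.

```python
# instance/config.py (illustrative sketch; both values are placeholders)

# Random string used to seed Flask. Replace this with your own value, for
# example one generated once with:
#   python -c "import secrets; print(secrets.token_hex(32))"
SECRET_KEY = "replace-with-a-long-random-string"

# Base URL used when minting IRIs. This repeats the documented default.
BASE_URL = "http://bibcat.org/"
```

Keep the secret key stable between restarts and out of version control.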

Ingestion

Right now, the way to ingest records into the triplestore is to open an interactive Python 3 session. Here is an example of setting up your Python environment to use the different source ingesters:

```
import sys
sys.path.append("/dpla-service-hub/bibcat")
from ingesters.ingester import NS_MGR, new_graph
```
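
If your checkout does not live at `/dpla-service-hub`, the path can be built from wherever you cloned the project. The helper below is a hypothetical convenience, not part of this project; it assumes only that the bibcat submodule sits in a `bibcat` directory under the project root, as in the path above.

```python
import sys
from pathlib import Path


def add_bibcat_to_path(project_root):
    """Append <project_root>/bibcat to sys.path once and return that path.

    Hypothetical helper; the only assumption is that the bibcat submodule
    lives in a "bibcat" directory directly under the project root.
    """
    bibcat_dir = Path(project_root).expanduser().resolve() / "bibcat"
    path_str = str(bibcat_dir)
    if path_str not in sys.path:
        # Avoid adding duplicate entries when called more than once.
        sys.path.append(path_str)
    return bibcat_dir


# Example: point this at your own checkout location.
bibcat_dir = add_bibcat_to_path("/dpla-service-hub")
```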

Customizing

To customize the field mappings, add common properties, or add other information to the triplestore, add Turtle RDF files to the custom directory. When you then create an ingester, pass the file name of the Turtle file (the examples in this README use the `rules_ttl` parameter) so that your custom rules are applied during ingestion.

MARC 21

Create a MARC 21 ingester using a custom RDF rules graph for Colorado College along with a sample of Colorado College's MARC 21 records:

```
import pymarc
import ingesters.marc as marc2bf
marc_ingester = marc2bf.MARCIngester(rules_ttl=['cc-marc-bf-.ttl'])
with open("dpla-service-hub/tmp/cc-marc.mrc", "rb") as fo:
    reader = pymarc.MARCReader(fo, to_unicode=True)
    # Iterate inside the with block so the file is still open while reading
    for record in reader:
        marc_ingester.transform(record=record)
```
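
With a large MARC file, one bad record can stop the whole loop. The sketch below shows one way to keep going and collect failures instead. `ingest_all` is a hypothetical helper, not part of this project; it assumes only that the transform callable accepts a `record` keyword, as `marc_ingester.transform` does above, and that the reader may yield `None` for a record it cannot parse (as recent pymarc versions do).

```python
def ingest_all(records, transform):
    """Run transform(record=...) on every record and keep going on errors.

    Returns (succeeded, failures), where failures is a list of
    (position, reason) tuples. Hypothetical helper for illustration.
    """
    succeeded = 0
    failures = []
    for position, record in enumerate(records):
        if record is None:
            # Assumption: the reader yields None for an unparseable record.
            failures.append((position, "unreadable record"))
            continue
        try:
            transform(record=record)
            succeeded += 1
        except Exception as error:
            # Record the failure and move on to the next record.
            failures.append((position, repr(error)))
    return succeeded, failures
```

Inside the `with` block above, this could be called as `ingest_all(reader, marc_ingester.transform)`.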

MODS XML

```
import requests
import xml.etree.ElementTree as etree
import ingesters.mods as mods
```

Request the MODS XML datastream for a single Fedora object from Colorado College's Islandora repository, then create the ingester and transform the record:

```
mods_result = requests.get("https://digitalcc.coloradocollege.edu/islandora/object/coccc:26262/datastream/MODS/view")
mods_xml = etree.XML(mods_result.text)
# Create the ingester only after mods_xml has been parsed
mods_ingester = mods.MODSIngester(xml=mods_xml, rules_ttl=["cc-mods-bf.ttl"])
mods_ingester.transform(source=mods_xml)
```
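
To fetch MODS for more than one object, the datastream URL can be built from the object's PID. This is a sketch that assumes other objects in the repository follow the same URL pattern as the example above; `mods_datastream_url` and `ISLANDORA_BASE` are hypothetical names, not part of this project.

```python
from urllib.parse import quote

# Base taken from the example URL above.
ISLANDORA_BASE = "https://digitalcc.coloradocollege.edu/islandora/object"


def mods_datastream_url(pid, base=ISLANDORA_BASE):
    """Build the MODS datastream URL for a Fedora object PID.

    Assumes the <base>/<pid>/datastream/MODS/view pattern shown above.
    """
    # Keep the colon in PIDs such as "coccc:26262" unescaped, as in the example.
    safe_pid = quote(pid, safe=":")
    return "{}/{}/datastream/MODS/view".format(base.rstrip("/"), safe_pid)


url = mods_datastream_url("coccc:26262")
```

The resulting URL can be passed to `requests.get` in the same way as the literal URL above.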

Dublin Core XML

To test a random collection of Dublin Core RDF XML records from Denver Public Library:

```
import pickle
import xml.etree.ElementTree as etree
import ingesters.dc as dc
dc_ingester = dc.DCIngester(rules_ttl=['dpl-dc.ttl'])
with open("dpla-service-hub/tmp/sample_recs.pickle", "rb") as fo:
    sample_recs = pickle.load(fo)
# Transform each record and add the result to the triplestore
for rdf_record in sample_recs:
    dc_ingester.transform(xml=etree.tostring(rdf_record))
    dc_ingester.add_to_triplestore()
```

Dublin Core CSV

Deploying with Docker and Docker-Compose

This project now supports Docker and Docker Compose. To run the DPLA Service Hub stack, run `docker-compose up` from the base directory. It will build a bibcat image using the instance/config.py file you created.
