
Digitized Newspapers

Eben English edited this page Sep 23, 2019 · 5 revisions

title: Digitized Newspapers

author: Eben English

date: 2017-09-08

profile:

  organization: Samvera Newspapers Interest Group
  project: Samvera Newspapers Interest Group


Introduction

This model describes digitized newspaper content, and is intended to inform the development of RDF-based models for all types of newspaper content objects (titles, containers, issues, pages, articles, files), such as would be used in Samvera- or Islandora-based digital asset management applications.

This model was greatly informed by earlier efforts from the National Library of Wales and the University of Maryland, as well as discussions within the Samvera Newspapers Interest Group. It is essentially an attempt to reconcile these efforts and express them as a formal PCDM profile.

NOTES:

  • This model uses "upward-pointing" child-to-parent relationships wherever possible (i.e. <child> pcdm:memberOf <parent> is preferred over <parent> pcdm:hasMember <child>), because many objects in this model are likely to have numerous children. Upward-pointing modeling has been found to provide better application performance.
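To illustrate the upward-pointing pattern, the following minimal Python sketch stores only child-to-parent pcdm:memberOf triples and derives a parent's children by inverting them on demand. The tuple representation, the `children_of` helper, and the second issue URI are hypothetical; a real application would query its own RDF store or index instead.

```python
MEMBER_OF = "pcdm:memberOf"

# Each child records its own parent, so adding a child never rewrites the parent.
triples = [
    ("https://example.org/newspaper_issues/1", MEMBER_OF, "https://example.org/newspaper_titles/1"),
    # Hypothetical second issue, added only for illustration.
    ("https://example.org/newspaper_issues/2", MEMBER_OF, "https://example.org/newspaper_titles/1"),
    ("https://example.org/newspaper_pages/1", MEMBER_OF, "https://example.org/newspaper_issues/1"),
]


def children_of(parent, triples):
    """Return the subjects that point at `parent` via pcdm:memberOf."""
    return [s for s, p, o in triples if p == MEMBER_OF and o == parent]
```

With this layout, a title with thousands of issues remains a small resource, and the list of its issues is computed by lookup rather than stored on the title itself.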

Model

TK

NewspaperTitle

This is a pcdm:Object serving as an abstract intellectual container for one or more issues of a specific newspaper title (e.g. The Bloom Picayune). This object serves as a parent for NewspaperIssue and NewspaperContainer objects.

This object corresponds to a Collection in the IIIF Presentation API model.

# example RDF graph
<https://example.org/newspaper_titles/1> a pcdm:Object ;
    dcterms:title "Bloom Picayune" ;
    dcterms:type <http://id.loc.gov/vocabulary/marcgt/new> ;
    dcterms:language <http://id.loc.gov/vocabulary/iso639-2/epo> ;
    identifiers:issn "0000-1111" ;
    identifiers:lccn "sn98765432" .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Title | dcterms:title | MUST | rdfs:Literal | {1,1} |
| Alternative Title | dcterms:alternative | | rdfs:Literal | {0,n} |
| Type | dcterms:type | SHOULD | rdfs:Resource | {0,1} |
| Publication date start | schema:startDate | | rdfs:Literal | {0,1} |
| Publication date end | schema:endDate | | rdfs:Literal | {0,1} |
| Publisher | dc:publisher | | rdfs:Literal or rdfs:Resource | {0,n} |
| Place of publication | relators:pup | | rdfs:Resource | {0,n} |
| Edition | bf:editionStatement | | rdfs:Literal | {0,1} |
| Frequency | rdau:P60538 | | rdfs:Literal or rdfs:Resource | {0,n} |
| Language | dc:language | SHOULD | rdfs:Literal or rdfs:Resource | {0,n} |
| ISSN | identifiers:issn | SHOULD | rdfs:Literal | {0,1} |
| LCCN | identifiers:lccn | SHOULD | rdfs:Literal | {0,1} |
| OCLC # | bibo:oclcnum | | rdfs:Literal | {0,1} |
| Rights | edm:rights | | rdfs:Resource | {0,n} |
| License | dcterms:rights | | rdfs:Resource | {0,n} |
| Holding location | bf:heldBy | SHOULD | rdfs:Literal or rdfs:Resource | {0,1} |
| Preceded by | rdau:P60261 | | rdfs:Literal or rdfs:Resource | {0,n} |
| Succeeded by | rdau:P60278 | | rdfs:Literal or rdfs:Resource | {0,n} |
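As an illustration of how the Obligation column might be applied, the sketch below checks a record against a few rows of the NewspaperTitle table. It assumes that {min,max} gives the minimum and maximum number of values, with n meaning unbounded; that reading, the dictionary-based record format, and the `parse_obligation` and `check_record` helpers are assumptions made for this example, not part of the profile.

```python
import re


def parse_obligation(text):
    """Parse an obligation such as "{0,n}" into (minimum, maximum).

    The maximum is None when the table gives "n" (assumed to mean unbounded).
    """
    match = re.fullmatch(r"\{(\d+),(\d+|n)\}", text)
    if match is None:
        raise ValueError(f"unrecognized obligation: {text!r}")
    low, high = match.groups()
    return int(low), (None if high == "n" else int(high))


# A subset of the NewspaperTitle table: predicate -> obligation.
TITLE_OBLIGATIONS = {
    "dcterms:title": "{1,1}",
    "dcterms:alternative": "{0,n}",
    "identifiers:issn": "{0,1}",
}


def check_record(record, obligations):
    """Return messages for predicates whose number of values is out of range."""
    problems = []
    for predicate, obligation in obligations.items():
        low, high = parse_obligation(obligation)
        count = len(record.get(predicate, []))
        if count < low or (high is not None and count > high):
            problems.append(f"{predicate}: found {count}, expected {obligation}")
    return problems
```

A record is represented here as a mapping from predicate to a list of values, for example `{"dcterms:title": ["Bloom Picayune"]}`. The same check could be reused for the other object types by supplying their tables.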

NewspaperContainer

This is a pcdm:Object representing a physical container that may have held bound, microfilmed, or otherwise collected newspaper issues prior to digitization. It provides a way to maintain provenance information about which microfilm reel or bound volume a particular digitized issue or page originated from. This is an optional construct, since not all implementations will be working from a microfilmed or bound container.

This object is a child of NewspaperTitle, and serves as a parent for NewspaperPage objects.

This object type has no corresponding object in the IIIF Presentation API model.

NOTES:

  • A NewspaperContainer SHOULD be a member of at least one NewspaperTitle object.
  • A NewspaperContainer MAY have associated binary files representing target frames or images sometimes included at the head of a microfilm reel; these pcdm:File objects should be related to the NewspaperContainer object using an intermediary pcdm:FileSet object.
# example RDF graph
<https://example.org/newspaper_containers/1> a pcdm:Object ;
    pcdm:memberOf <https://example.org/newspaper_titles/1> ;
    dcterms:title "New York Herald Tribune" ;
    dcterms:type <http://id.loc.gov/vocabulary/graphicMaterials/tgm006499> ;
    bf:heldBy <http://id.loc.gov/rwo/agents/n79117036> .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | SHOULD | pcdm:Object | {1,1} |
| Title | dcterms:title | MUST | rdfs:Literal | {1,1} |
| Alternative Title | dcterms:alternative | | rdfs:Literal | {0,n} |
| Type | dcterms:type | SHOULD | rdfs:Resource | {0,1} |
| Publication date start | schema:startDate | | rdfs:Literal | {0,1} |
| Publication date end | schema:endDate | | rdfs:Literal | {0,1} |
| Publisher | dc:publisher | | rdfs:Literal or rdfs:Resource | {0,n} |
| Place of publication | relators:pup | | rdfs:Resource | {0,n} |
| Language | dc:language | SHOULD | rdfs:Literal or rdfs:Resource | {0,n} |
| ISSN | identifiers:issn | SHOULD | rdfs:Literal | {0,1} |
| LCCN | identifiers:lccn | SHOULD | rdfs:Literal | {0,1} |
| OCLC # | bibo:oclcnum | | rdfs:Literal | {0,1} |
| Extent | dcterms:extent | | rdfs:Literal | {0,1} |
| Rights | edm:rights | | rdfs:Resource | {0,n} |
| License | dcterms:rights | | rdfs:Resource | {0,n} |
| Holding location | bf:heldBy | SHOULD | rdfs:Literal or rdfs:Resource | {0,1} |

NewspaperIssue

A pcdm:Object representing a single issue of a newspaper title. This object is a child of NewspaperTitle, and serves as a parent for NewspaperPage objects representing individual pages and NewspaperArticle objects representing articles.

This object corresponds to a Manifest in the IIIF Presentation API model.

NOTES:

  • A NewspaperIssue SHOULD be a member of one and only one NewspaperTitle object.
  • A NewspaperIssue MAY have associated binary files representing the full issue (such as a PDF or TXT file); these pcdm:File objects should be related to the NewspaperIssue object using a single intermediary pcdm:FileSet object.
# example RDF graph
<https://example.org/newspaper_issues/1> a pcdm:Object ;
    pcdm:memberOf <https://example.org/newspaper_titles/1> ;
    dcterms:title "Daily Planet: April 1, 1963" ;
    dcterms:issued "1963-04-01" .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | SHOULD | pcdm:Object | {1,1} |
| Title | dcterms:title | MUST | rdfs:Literal | {1,1} |
| Alternative Title | dcterms:alternative | | rdfs:Literal | {0,n} |
| Type | dcterms:type | SHOULD | rdfs:Resource | {0,1} |
| Publication date | dcterms:issued | SHOULD | rdfs:Literal | {0,1} |
| Publisher | dc:publisher | | rdfs:Literal or rdfs:Resource | {0,n} |
| Place of publication | relators:pup | | rdfs:Resource | {0,n} |
| Volume | bibo:volume | | rdfs:Literal | {0,1} |
| Edition name | bf:editionStatement | | rdfs:Literal | {0,1} |
| Edition number | bf:editionEnumeration | | rdfs:Literal | {0,1} |
| Issue | bibo:issue | | rdfs:Literal | {0,1} |
| Language | dc:language | SHOULD | rdfs:Literal or rdfs:Resource | {0,n} |
| ISSN | identifiers:issn | SHOULD | rdfs:Literal | {0,1} |
| LCCN | identifiers:lccn | SHOULD | rdfs:Literal | {0,1} |
| OCLC # | bibo:oclcnum | | rdfs:Literal | {0,1} |
| Extent | dcterms:extent | | rdfs:Literal | {0,1} |
| Rights | edm:rights | | rdfs:Resource | {0,n} |
| License | dcterms:rights | | rdfs:Resource | {0,n} |
| Holding location | bf:heldBy | SHOULD | rdfs:Literal or rdfs:Resource | {0,1} |

NewspaperPage

A pcdm:Object representing a single page of a newspaper issue. This object may be a child of a NewspaperIssue, NewspaperContainer, or NewspaperArticle, and serves as a parent for a pcdm:FileSet object representing image and/or text files.

This object corresponds to a Canvas in the IIIF Presentation API model.

NOTES:

  • A NewspaperPage SHOULD be a member of one and only one NewspaperIssue object.
  • A NewspaperPage MAY be a member of one and only one NewspaperContainer object.
  • A NewspaperPage MAY be a member of multiple NewspaperArticle objects.
  • A NewspaperPage SHOULD have a corresponding ore:Proxy object, to support ordering within a parent NewspaperContainer object.
  • A NewspaperPage SHOULD have a corresponding ore:Proxy object, to support ordering within a parent NewspaperIssue object.
  • A NewspaperPage SHOULD have a corresponding ore:Proxy object, to support ordering within a parent NewspaperArticle object.
  • If proxies and ordering are being used, a NewspaperPage MUST have a separate corresponding ore:Proxy object for each parent.
  • A NewspaperPage SHOULD have associated binary files representing the page image and/or text; these pcdm:File objects should be related to the NewspaperPage object using a single intermediary pcdm:FileSet object.
# example RDF graph
<https://example.org/newspaper_pages/1> a pcdm:Object ;
    pcdm:memberOf <https://example.org/newspaper_containers/1> ;
    pcdm:memberOf <https://example.org/newspaper_issues/1> ;
    pcdm:memberOf <https://example.org/newspaper_articles/1> ;
    dcterms:title "Daily Prophet: December 31, 1999: Page 2" ;
    oa:textDirection <http://www.w3.org/ns/oa#ltrDirection> .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | SHOULD | pcdm:Object | {1,1} |
| Label | dcterms:title | MUST | rdfs:Literal | {1,1} |
| Text direction | oa:textDirection | | rdfs:Resource | {0,n} |
| Page number | schema:pagination | | rdfs:Literal | {0,1} |
| Section | bibo:section | | rdfs:Literal | {0,1} |
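The NewspaperPage notes call for a separate ore:Proxy per parent when ordering is used. The sketch below generates proxy descriptions for an ordered list of pages within one parent. It assumes the predicates of the PCDM ordering extension (ore:proxyFor, ore:proxyIn, iana:next, iana:prev), which this profile does not itself name; the `build_proxies` function, the proxy URI pattern, and the second page URI are hypothetical.

```python
def build_proxies(parent, pages):
    """Return one proxy description per page, linked in order within `parent`."""
    proxies = []
    for index, page in enumerate(pages):
        proxies.append({
            "id": f"{parent}#proxy{index}",  # hypothetical URI pattern
            "ore:proxyFor": page,            # the page this proxy stands in for
            "ore:proxyIn": parent,           # the parent whose order it records
        })
    # Link neighbours so a consumer can walk the sequence in either direction.
    for previous, following in zip(proxies, proxies[1:]):
        previous["iana:next"] = following["id"]
        following["iana:prev"] = previous["id"]
    return proxies


issue = "https://example.org/newspaper_issues/1"
article = "https://example.org/newspaper_articles/1"
pages = [
    "https://example.org/newspaper_pages/1",
    "https://example.org/newspaper_pages/2",  # hypothetical second page
]

issue_proxies = build_proxies(issue, pages)
article_proxies = build_proxies(article, pages[:1])
```

Calling the function once per parent gives the same page a distinct proxy in the issue and in the article, so each parent can order its pages independently.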

NewspaperPageFileSet

A pcdm:FileSet object, corresponding to image or text content representing a newspaper page. This object is a child of NewspaperPage, and serves as a parent for pcdm:File binary objects representing image or text files.

This object corresponds to a Canvas in the IIIF Presentation API model.

NOTES:

  • A NewspaperPageFileSet MUST be a member of one and only one NewspaperPage object.
  • A NewspaperPageFileSet SHOULD have associated binary files representing the page image and/or text; these pcdm:File objects should be related to the NewspaperPageFileSet object using pcdm:fileOf.
# example RDF graph
<https://example.org/newspaper_page_filesets/1> a pcdm:FileSet ;
    pcdm:memberOf <https://example.org/newspaper_pages/1> ;
    dbpedia-owl:height "28872" ;
    dbpedia-owl:width "18200" .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | MUST | pcdm:Object | {1,1} |
| Height | dbpedia-owl:height | MUST | rdfs:Literal | {0,1} |
| Width | dbpedia-owl:width | | rdfs:Literal | {0,1} |

NewspaperArticle

A pcdm:Range object representing a single article within a newspaper issue. This object is a child of NewspaperIssue, and serves as a parent for one or more NewspaperPage objects. (Articles may appear on multiple pages.)

This object corresponds to a Range in the IIIF Presentation API model.

NOTES:

  • A NewspaperArticle SHOULD be a member of one and only one NewspaperIssue object.
  • A NewspaperArticle MAY have associated binary files representing a portion of the page image and/or text; these pcdm:File objects should be related to the NewspaperArticle object using a single intermediary pcdm:FileSet object.
# example RDF graph
<https://example.org/newspaper_articles/1> a pcdm:Range ;
    pcdm:memberOf <https://example.org/newspaper_issues/1> ;
    dcterms:title "Wall Street Mystery Man: Hud Head New to Halls of Power" ;
    relators:aut "Archer, Amy" .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | SHOULD | pcdm:Object | {1,1} |
| Title | dcterms:title | MUST | rdfs:Literal | {1,1} |
| Subtitle | dcterms:alternative | | rdfs:Literal | {0,n} |
| Description | dcterms:description | | rdfs:Literal | {0,n} |
| Author | relators:aut | | rdfs:Literal | {0,n} |
| Photographer | relators:pht | | rdfs:Literal | {0,n} |
| Creator | dcterms:creator | | rdfs:Literal | {0,n} |
| Contributor | dcterms:contributor | | rdfs:Literal | {0,n} |
| Type | dcterms:type | SHOULD | rdfs:Resource | {0,1} |
| Genre | edm:hasType | | rdfs:Resource | {0,1} |
| Publication date | dcterms:issued | SHOULD | rdfs:Literal | {0,1} |
| Publisher | dc:publisher | | rdfs:Literal or rdfs:Resource | {0,n} |
| Place of publication | relators:pup | | rdfs:Resource | {0,n} |
| Volume | bibo:volume | | rdfs:Literal | {0,1} |
| Edition name | bf:editionStatement | | rdfs:Literal | {0,1} |
| Edition number | bf:editionEnumeration | | rdfs:Literal | {0,1} |
| Issue | bibo:issue | | rdfs:Literal | {0,1} |
| Page number | schema:pagination | | rdfs:Literal | {0,1} |
| Section | bibo:section | | rdfs:Literal | {0,1} |
| Subject | dc:subject | | rdfs:Resource | {0,n} |
| Geographic coverage | dcterms:spatial | | rdfs:Resource | {0,n} |
| Language | dc:language | SHOULD | rdfs:Literal or rdfs:Resource | {0,n} |
| ISSN | identifiers:issn | SHOULD | rdfs:Literal | {0,1} |
| LCCN | identifiers:lccn | SHOULD | rdfs:Literal | {0,1} |
| OCLC # | bibo:oclcnum | | rdfs:Literal | {0,1} |
| Extent | dcterms:extent | | rdfs:Literal | {0,1} |
| Rights | edm:rights | | rdfs:Resource | {0,n} |
| License | dcterms:rights | | rdfs:Resource | {0,n} |
| Holding location | bf:heldBy | SHOULD | rdfs:Literal or rdfs:Resource | {0,1} |

NewspaperArticleFileSet

A pcdm:FileSet object, corresponding to a single piece of image or text content representing part or all of a newspaper article. This object is a child of NewspaperArticle, and serves as a parent for pcdm:File binary objects representing image or text files.

This object corresponds to a Canvas in the IIIF Presentation API model.

NOTES:

  • A NewspaperArticleFileSet SHOULD be a member of one and only one NewspaperArticle object.
  • A NewspaperArticleFileSet MAY have a corresponding ore:Proxy object, to support ordering within a parent NewspaperArticle object, if maintaining article image order is needed.
  • A NewspaperArticleFileSet SHOULD have associated binary files representing the article image and/or text; these pcdm:File objects should be related to the NewspaperArticleFileSet object using pcdm:fileOf.
# example RDF graph
<https://example.org/newspaper_article_filesets/1> a pcdm:FileSet ;
    pcdm:memberOf <https://example.org/newspaper_articles/1> .
| Field | Predicate | Recommendation | Expected Value | Obligation |
| --- | --- | --- | --- | --- |
| Member of | pcdm:memberOf | MUST | pcdm:Object | {1,1} |
| Height | dbpedia-owl:height | | rdfs:Literal | {0,1} |
| Width | dbpedia-owl:width | | rdfs:Literal | {0,1} |
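The membership rules stated in the NOTES for each object type can be gathered into a single lookup. The sketch below is one possible summary, following the NOTES and the example graphs (for instance, NewspaperContainer as a member of NewspaperTitle); the `ALLOWED_PARENTS` mapping and the `invalid_memberships` helper are hypothetical names, and the profile does not prescribe any such check.

```python
# Child object type -> parent types it may point at with pcdm:memberOf.
ALLOWED_PARENTS = {
    "NewspaperContainer": {"NewspaperTitle"},
    "NewspaperIssue": {"NewspaperTitle"},
    "NewspaperPage": {"NewspaperIssue", "NewspaperContainer", "NewspaperArticle"},
    "NewspaperPageFileSet": {"NewspaperPage"},
    "NewspaperArticle": {"NewspaperIssue"},
    "NewspaperArticleFileSet": {"NewspaperArticle"},
}


def invalid_memberships(child_type, parent_types):
    """Return the parent types that the model does not describe for `child_type`."""
    allowed = ALLOWED_PARENTS.get(child_type, set())
    return sorted(set(parent_types) - allowed)
```

NewspaperTitle has no entry because it sits at the top of the hierarchy, so any parent reported for it is returned as unexpected.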