This repository has been archived by the owner on Oct 3, 2019. It is now read-only.

Inist-CNRS/node-ezs-istex


ISTEX statements for ezs

Warning: the package has been renamed to @ezs/istex.

It is no longer maintained here; development continues in the Inist-CNRS/ezs repository.

This package cannot be used alone. EZS has to be installed.

Usage

var ezs = require('ezs');
ezs.use(require('ezs-istex'));

Statements


ISTEXFacet

Take an object containing a query string field, a facet, and output aggregations from the ISTEX API.

Parameters

  • query string ISTEX query (optional, default "*")
  • facet string ISTEX facet (optional, default "corpusName")
  • sid string User-agent identifier (optional, default "ezs-istex")

Examples

from([{ query: 'ezs', facet: 'corpusName' }])
  .pipe(ezs('ISTEXFacet', { sid: 'test', }))

Returns Array<Object>
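
As a rough illustration of what such a facet request could look like, here is a minimal sketch that only builds the request URL. The function name `buildFacetUrl` is hypothetical, and the endpoint and the parameter names (`q`, `facet`, `size`, `sid`) are assumptions about the public ISTEX API, not something taken from this package's code.

```javascript
// Hypothetical helper: builds a facet (aggregation) URL for the ISTEX API.
// Endpoint and parameter names are assumptions, not this package's code.
function buildFacetUrl({ query = '*', facet = 'corpusName', sid = 'ezs-istex' } = {}) {
  const url = new URL('https://api.istex.fr/document/');
  url.searchParams.set('q', query);     // ISTEX query
  url.searchParams.set('facet', facet); // facet to aggregate on
  url.searchParams.set('size', '0');    // only aggregations are wanted, no hits
  url.searchParams.set('sid', sid);     // User-agent identifier
  return url.toString();
}

console.log(buildFacetUrl({ query: 'ezs', facet: 'corpusName', sid: 'test' }));
```

The defaults mirror the ones documented in the parameter list above.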

ISTEXFilesContent

  • See: ISTEXFiles

Take an Object with ISTEX source and check the document's file. Warning: to access fulltext, you have to give a token parameter. ISTEXFetch produces the stream you need to save the file.

Parameters

Returns Object

ISTEXFilesWrap

  • See: ISTEXFiles

Take an Object with an ISTEX stream and wrap it into a single zip.

Returns Buffer

ISTEXFiles

  • See: ISTEXScroll

Take an Object with ISTEX id and generate an object for each file

Parameters

  • fulltext string typology of the document to save (optional, default pdf)
  • metadata string format of the files to save (optional, default json)
  • enrichment string? enrichment of the document to save
  • sid string User-agent identifier (optional, default "ezs-istex")

Returns Array

ISTEXFetch

Take an Object with an id and return the document's metadata.

Parameters

  • source string Field to use to fetch documents (optional, default "id")
  • target string
  • id string ISTEX Identifier of a document (optional, default data.id)
  • sid string User-agent identifier (optional, default "ezs-istex")

Examples

Input:

[{
  id: '87699D0C20258C18259DED2A5E63B9A50F3B3363',
}, {
  id: 'ark:/67375/QHD-T00H6VNF-0',
}]

will produce two JSON records.

.pipe(ezs('ISTEXFetch', { source: 'id' }))

Returns Array<Object>

ISTEXParseDotCorpus

Parse a .corpus file content, and execute the action contained in the .corpus file.

Examples

1query.corpus

[ISTEX]
query = language.raw:rum
field = doi
field = author
field = title
field = language
field = publicationDate
field = keywords
field = host
field = fulltext

1notice.corpus

[ISTEX]
id 2FF3F5B1477986B9C617BB75CA3333DBEE99EB05

Returns Object
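
To make the two file shapes above concrete, here is a minimal parsing sketch. It is not the package's implementation: the function name `parseDotCorpus`, the shape of the returned object, and the handling of `#` comment lines are assumptions made for illustration.

```javascript
// Hypothetical parser for the [ISTEX] section of a .corpus file.
// Collects the query, the repeated "field" entries, and "id <identifier>" lines.
function parseDotCorpus(text) {
  const result = { query: null, fields: [], ids: [] };
  let inIstexSection = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Assumption: blank lines and lines starting with '#' carry no data.
    if (line === '' || line.startsWith('#')) continue;
    if (line === '[ISTEX]') {
      inIstexSection = true;
      continue;
    }
    if (!inIstexSection) continue;
    // "key = value" lines, as in the query example.
    const assignment = line.match(/^(\w+)\s*=\s*(.*)$/);
    if (assignment) {
      const [, key, value] = assignment;
      if (key === 'query') result.query = value;
      else if (key === 'field') result.fields.push(value);
      continue;
    }
    // "id <identifier>" lines, as in the notice example.
    const idLine = line.match(/^id\s+(\S+)$/);
    if (idLine) result.ids.push(idLine[1]);
  }
  return result;
}

const queryCorpus = '[ISTEX]\nquery = language.raw:rum\nfield = doi\nfield = author\n';
console.log(parseDotCorpus(queryCorpus));
```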

ISTEXResult

  • See: ISTEXScroll

Take an Object containing results from the ISTEX API and return its hits value (the documents).

This should be placed after ISTEXScroll.

Parameters

  • source string (optional, default data)
  • target string (optional, default feed)

Returns Array<Object>
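
The idea can be sketched in a few lines. This assumes an API response shaped like `{ total, hits: [...] }`; the function name `extractHits` is hypothetical and the sketch ignores the source and target parameters.

```javascript
// Hypothetical helper: returns the documents found in one API response.
// Assumes the response carries them in a "hits" array.
function extractHits(response) {
  return Array.isArray(response.hits) ? response.hits : [];
}

const response = { total: 2, hits: [{ id: 'A' }, { id: 'B' }] };
for (const hit of extractHits(response)) {
  console.log(hit.id);
}
```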

ISTEXSave

  • See: ISTEXFetch

Take an Object with an ISTEX id and save the document's file. Warning: to access fulltext, you have to give a token parameter. ISTEXFetch produces the stream you need to save the file.

Parameters

  • directory string path for the PDFs (optional, default current working directory)
  • typology string typology of the document to save (optional, default "fulltext")
  • format string format of the files to save (optional, default "pdf")
  • sid string User-agent identifier (optional, default "ezs-istex")
  • token string? authentication token (see documentation)

Returns Array

ISTEXTriplify

  • See: ISTEXResult
  • See: OBJFlatten (from ezs-basics)

Take an Object containing flattened hits from ISTEXResult and output the matching properties as triples.

If the environment variable DEBUG is set, some errors could appear on stderr.

Parameters

  • property Object path to uri for the properties to output (property and uri separated by ->) (optional, default [])
  • source string the root of the keys (ex: istex/) (optional, default "")

Examples

data:

{
  'author/0/name': 'Geoffrey Strickland',
  'author/0/affiliations/0': 'University of Reading',
  'host/issn/0': '0047-2441',
  'host/eissn/0': '1740-2379',
  'title': 'Maupassant, Zola, Jules Vallès and the Paris Commune of 1871',
  'publicationDate': '1983',
  'doi/0': '10.1177/004724418301305203',
  'id': 'F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483',
  'score': 1
}

javascript:

.pipe(ezs('ISTEXTriplify', {
  property: [
    'doi/0 -> http://purl.org/ontology/bibo/doi',
    'language -> http://purl.org/dc/terms/language',
    'author/\\d+/name -> http://purl.org/dc/terms/creator',
    'author/\\d+/affiliations -> https://data.istex.fr/ontology/istex#affiliation',
  ],
}));

output:

<https://data.istex.fr/document/F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483>
    a <http://purl.org/ontology/bibo/Document> ;
    <http://purl.org/ontology/bibo/doi> "10.1177/004724418301305203" ;
    <https://data.istex.fr/ontology/istex#idIstex> "F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483" ;
    <http://purl.org/dc/terms/creator> "Geoffrey Strickland" ;
    <https://data.istex.fr/ontology/istex#affiliation> "University of Reading" ;

Returns string
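
The `path -> uri` rules can be read as "a regular expression on the flattened key, mapped to a predicate URI". The sketch below illustrates that reading only; it is not the package's implementation. The names `parseRules` and `triplify` are hypothetical, prefix matching of keys is an assumption, and `JSON.stringify` is used as a rough stand-in for literal escaping. It emits one full triple per line rather than the grouped form shown in the output above.

```javascript
// Hypothetical: turn 'path -> uri' strings into { pattern, uri } rules.
function parseRules(property) {
  return property.map((rule) => {
    const [path, uri] = rule.split('->').map((part) => part.trim());
    // Assumption: the path is a regular expression matched at the start of the key.
    return { pattern: new RegExp(`^${path}`), uri };
  });
}

// Hypothetical: produce one triple line per (key, rule) match.
function triplify(record, property) {
  const rules = parseRules(property);
  const subject = `<https://data.istex.fr/document/${record.id}>`;
  const triples = [];
  for (const [key, value] of Object.entries(record)) {
    for (const { pattern, uri } of rules) {
      if (pattern.test(key)) {
        triples.push(`${subject} <${uri}> ${JSON.stringify(String(value))} .`);
      }
    }
  }
  return triples;
}

const record = {
  'author/0/name': 'Geoffrey Strickland',
  'author/0/affiliations/0': 'University of Reading',
  'doi/0': '10.1177/004724418301305203',
  id: 'F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483',
};
const property = [
  'doi/0 -> http://purl.org/ontology/bibo/doi',
  'author/\\d+/name -> http://purl.org/dc/terms/creator',
  'author/\\d+/affiliations -> https://data.istex.fr/ontology/istex#affiliation',
];
console.log(triplify(record, property).join('\n'));
```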

ISTEX

Take an array and return the matching documents for every value of the array.

Parameters

  • query (string | Array<string>) ISTEX query (or queries) (optional, default data.query||[])
  • id (string | Array<string>) ISTEX id (or ids) (optional, default data.id||[])
  • maxPage number maximum number of pages to get
  • size number size of each page of results
  • duration string maximum duration between two requests (ex: "30s")
  • field Array<Object> fields to output

Examples

.pipe(ezs('ISTEX', {
  query: 'this is a test',
  size: 3,
  maxPage: 1,
  sid: 'test'
}))

Returns Array<Object>

ISTEXScroll

Take an object containing a query string field and output records from the ISTEX API. Every output record is merged with the input object.

Parameters

  • query string ISTEX query (optional, default input)
  • sid string User-agent identifier (optional, default "ezs-istex")
  • maxPage number Maximum number of pages to get
  • size number size of each page of results (optional, default 2000)
  • duration string maximum duration between two requests (optional, default "5m")
  • field Array<string> fields to get (optional, default ["doi"])

Examples

from([{ query: 'this is a test' }])
  .pipe(ezs('ISTEXScroll', {
      maxPage: 2,
      size: 1,
      sid: 'test',
  }))

Returns Array<Object>
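
The paging behaviour (stop after maxPage pages, merge each output with the input object) can be sketched without any network access. Everything here is illustrative: `scrollPages` is a hypothetical name, `fetchPage` is a caller-supplied stand-in for the HTTP request, and the `next` field is an assumed cursor, not a documented API field.

```javascript
// Hypothetical sketch of scroll-style paging.
// fetchPage(query, size, cursor) must return { hits: [...], next: cursorOrNull }.
function* scrollPages(input, fetchPage, options = {}) {
  const { maxPage = Infinity, size = 2000 } = options;
  let cursor = null;
  for (let page = 0; page < maxPage; page += 1) {
    const response = fetchPage(input.query, size, cursor);
    // Every output record is merged with the input object.
    yield { ...input, ...response };
    if (!response.next) return; // no more pages
    cursor = response.next;
  }
}

// Fake three-page result set, used instead of a real request.
const fakePages = [
  { hits: [{ id: 'a' }], next: 1 },
  { hits: [{ id: 'b' }], next: 2 },
  { hits: [{ id: 'c' }], next: null },
];
const fakeFetch = (query, size, cursor) => fakePages[cursor || 0];

for (const page of scrollPages({ query: 'this is a test' }, fakeFetch, { maxPage: 2, size: 1 })) {
  console.log(page.query, page.hits);
}
```

Injecting `fetchPage` keeps the paging logic separate from the transport, which is why the sketch can run offline.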

ISTEXUniq

Remove duplicate triples within a single document's set of triples (same subject).

Assume that every triple of a document (except the first one) follows another triple of the same document.

Examples

Input:

<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <http://purl.org/dc/terms/creator> "S Corbett" .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .

Action in a .ezs script

[ISTEXUniq]

Output

<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <http://purl.org/dc/terms/creator> "S Corbett" .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .

ISTEXUnzip

Take the content of a zip file, extract JSON files, and yield JSON objects.

The zip file comes from dl.istex.fr, and the manifest.json is not extracted.

Returns any Array