AllanChain/zotero-arxiv-workflow

arXiv Workflow for Zotero

This Zotero plugin eases the pain of storing papers from arXiv and then updating the Zotero entries once the papers are published.

Warning

This plugin is in the alpha stage and only supports Zotero 7!

I strongly recommend checking the results manually after each operation.

✨ Features

  • 🪢 Merge a preprint item and a journal article item without pain
  • 🗃️ Easy to set which PDF to open by default
  • 📄 Search online to see whether an arXiv paper has been published or updated, and update the information and PDF accordingly
  • 🌐 Download the latest version of the published PDF

🤔 Why?

An easier workflow with arXiv papers!

  • Zotero's built-in merging feature does not support merging items of different types, so an arXiv paper and a journal article cannot be merged.
  • I need the journal information from the journal item while keeping the item ID of the arXiv one, because many plugins store data keyed by item ID and I do not want to lose that data.

🪐 How?

This plugin focuses on the following workflow:

  • One parent item for both arXiv and published version.
    • Keeping both items and relating them is also possible in Zotero, but not the focus of this plugin.
  • Item ID from the preprint item is used.
    • Some plugins store data based on item ID.
  • Metadata from the published version is used.
    • Usually, only the metadata of the published version is needed for citing.
    • However, something like the creation date follows the preprint item.
    • The URL for the preprint is kept in the snapshot or web link attachment.
  • PDFs from both versions are kept.
    • In case there are some annotations on the preprint PDF.
    • And it is configurable which PDF to open by default.

The main logic of the merging process is shown in the following diagram:

Before:                                     After:
====================                        ====================
ItemID A (preprint)                         ItemID A
--------------------                        --------------------
Metadata A:                                 Metadata B:
Date added (A)                              Date added (A)
URL (A)                                     URL (B)
...                                         ...
--------------------               \        --------------------
PDF attachment a*         ----------\       PDF attachment a*
...                       ----------/       PDF attachment b
====================               /        Web Link attachment
ItemID B (published)                        ...
--------------------                        ====================
Metadata B
Date added (B)                              * means preferred PDF
URL (B)
...
--------------------
PDF attachment b*
...
====================
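
The rules in the diagram can be sketched on plain objects. This is a minimal illustration rather than the plugin's actual code: `SketchItem`, `Attachment`, and `mergeSketch` are hypothetical names, and real Zotero items carry far more state than this.

```typescript
// Simplified stand-ins for Zotero items; these shapes are assumptions for illustration.
interface Attachment {
  kind: "pdf" | "link";
  title: string;
  url?: string;
}

interface SketchItem {
  id: number;
  itemType: string;
  dateAdded: string; // "YYYY-MM-DD HH:MM:SS"
  fields: Record<string, string>; // title, url, publicationTitle, ...
  attachments: Attachment[];
}

// Returns the item that survives the merge; the published item would be deleted afterwards.
function mergeSketch(preprint: SketchItem, published: SketchItem): SketchItem {
  // PDFs from both versions are kept.
  const attachments: Attachment[] = [...preprint.attachments, ...published.attachments];
  const preprintUrl = preprint.fields["url"];
  if (preprintUrl !== undefined) {
    // Keep the preprint URL reachable once the URL field is overwritten.
    attachments.push({ kind: "link", title: "Preprint URL", url: preprintUrl });
  }
  return {
    id: preprint.id,               // item ID follows the preprint
    itemType: published.itemType,  // type and metadata follow the published version
    dateAdded: preprint.dateAdded, // creation date follows the preprint
    fields: { ...published.fields },
    attachments,
  };
}
```

The point of the design is that anything keyed by item ID (for example, data stored by other plugins) keeps pointing at a valid item after the merge.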

📸 Screenshots

[Screenshot: Merge arXiv] [Screenshot: Prefer PDF]

🔧 Installation

Download zotero-arxiv-workflow.xpi from the release page. Firefox users need to right-click the link and use "Save link as" instead of downloading it directly. After downloading, click "Tools" > "Plugins" in the Zotero menu and drag the downloaded file into the dialog.

🎈 Explanation of each feature

Features can be disabled from the plugin settings, but this will not affect the JavaScript API.

🪢 Merge arXiv paper and the published one

The main logic of merging items is described above. A few points to emphasize:

  • Select and only select two items: an arXiv paper and its published version.
    • Do NOT select anything else, including any attachments of an item.
  • An item is considered published if it has type "Journal Article" or "Conference Paper".
  • The item for the published version is deleted, and the item for the arXiv version receives the updated info and attachments.
  • If the titles of the two items differ, a dialog will pop up to ask for confirmation.

JavaScript API
async Zotero.arXivWorkflow.api.merge(
  preprintItem: Zotero.Item,
  publishedItem: Zotero.Item,
  suppressWarn = false,
)

This function assumes that the first argument is the arXiv version and the second is the published one. Currently, no checks are performed to ensure this. The caller is responsible for making sure the item types are correct.

If suppressWarn is true, no confirmation dialog will pop up when the titles of the two items differ.
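
Because merge performs no checks itself, a caller may want a small guard in front of it. The sketch below is one possible guard, not part of the plugin: `orderForMerge` and `TypedItem` are hypothetical names, and treating "the item that is not published" as the preprint is an assumption, since the plugin's own arXiv detection is not described here. The strings `journalArticle` and `conferencePaper` are Zotero's internal names for the "Journal Article" and "Conference Paper" types.

```typescript
// Minimal item shape; only the type name matters for this check.
interface TypedItem {
  itemType: string;
}

// Internal type names for "Journal Article" and "Conference Paper".
const PUBLISHED_TYPES = new Set<string>(["journalArticle", "conferencePaper"]);

// Orders a two-item selection as [preprint, published].
// Returns null unless the selection holds exactly one published item and one other item.
function orderForMerge<T extends TypedItem>(selection: T[]): [T, T] | null {
  if (selection.length !== 2) return null;
  const first = selection[0];
  const second = selection[1];
  if (first === undefined || second === undefined) return null;
  const firstPublished = PUBLISHED_TYPES.has(first.itemType);
  const secondPublished = PUBLISHED_TYPES.has(second.itemType);
  if (firstPublished === secondPublished) return null; // both or neither published
  return firstPublished ? [second, first] : [first, second];
}
```

Inside Zotero, the result could then be passed on, for example by calling orderForMerge on ZoteroPane.getSelectedItems() and, if the result is not null, awaiting Zotero.arXivWorkflow.api.merge with the two returned items.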

🗃️ Prefer to open a specific PDF

Maybe you have merged some items manually before, or maybe you just want to change which PDF opens by default. In either case, you will find the "Prefer PDF" feature useful. To use it, select (and only select) the PDF you want to open by default, right-click, and select "Prefer this PDF".

Under the hood, this plugin does something "dirty".

That is because Zotero has no built-in way to set the default PDF to open. It picks the attachment to open by sorting on:

  • Whether the attachment is a PDF
  • Whether the URL field of the attachment matches the URL of the parent item
  • The dateAdded of the attachment (oldest first)

Or in SQL:

ORDER BY contentType='application/pdf' DESC, url=? DESC, dateAdded ASC

Therefore, to make Zotero prefer a specific PDF, this plugin

  1. sets the URL field of the PDF attachment to that of the parent item
  2. sets the dateAdded field to the oldest value among all PDFs of the parent item
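
The ordering and the two steps above can be reproduced on plain data to see why the trick works. This is a sketch under assumptions: `AttachmentRow`, `pickDefault`, and `preferSketch` are hypothetical names, dates are taken to be "YYYY-MM-DD HH:MM:SS" strings so that string order equals time order, and ties on dateAdded are not handled specially.

```typescript
interface AttachmentRow {
  key: string;
  contentType: string;
  url: string;
  dateAdded: string; // "YYYY-MM-DD HH:MM:SS"
}

// Mirrors: ORDER BY contentType='application/pdf' DESC, url=? DESC, dateAdded ASC
function pickDefault(rows: AttachmentRow[], parentUrl: string): AttachmentRow | undefined {
  const sorted = [...rows].sort((a, b) => {
    const pdfA = a.contentType === "application/pdf" ? 0 : 1;
    const pdfB = b.contentType === "application/pdf" ? 0 : 1;
    const urlA = a.url === parentUrl ? 0 : 1;
    const urlB = b.url === parentUrl ? 0 : 1;
    const byDate = a.dateAdded < b.dateAdded ? -1 : a.dateAdded > b.dateAdded ? 1 : 0;
    return pdfA - pdfB || urlA - urlB || byDate;
  });
  return sorted[0];
}

// Applies the two steps: copy the parent URL and take the oldest PDF date.
function preferSketch(rows: AttachmentRow[], key: string, parentUrl: string): AttachmentRow[] {
  const pdfDates = rows
    .filter((r) => r.contentType === "application/pdf")
    .map((r) => r.dateAdded)
    .sort();
  const oldest = pdfDates[0];
  if (oldest === undefined) return rows; // no PDFs, nothing to do
  return rows.map((r) => (r.key === key ? { ...r, url: parentUrl, dateAdded: oldest } : r));
}
```

After preferSketch, the chosen PDF matches the parent URL and is at least as old as every other PDF, so it sorts first unless another PDF ties on both criteria.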

JavaScript API
async Zotero.arXivWorkflow.api.preferPDF(
  selectedAttachment: Zotero.Item
)

This function assumes that the argument is a PDF attachment. Currently, no checks are performed to ensure this. The caller is responsible for performing the checks.

📄 Search for updated version of an arXiv paper

If you have a preprint item for an arXiv paper and want to find out whether it has been published in a journal or updated on arXiv, and then update the information, right-click the preprint item and select "Update arXiv paper". This will search:

  1. Published versions by trying:
    1. arXiv for the "Related DOI" field, which may be updated if the paper got published
    2. Semantic Scholar API
    3. DBLP API
  2. If no published version is found, the plugin will search arXiv for updated versions
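
The search order above is a first-match fallback chain, which can be sketched as follows. The names `Lookup`, `Found`, and `findPublished` are hypothetical. That each source resolves to a DOI, and that a failing source is skipped rather than aborting the search, are assumptions made for this illustration and not statements about the plugin's code.

```typescript
// A lookup takes an arXiv ID and resolves to a DOI, or to null when nothing is found.
type Lookup = (arxivId: string) => Promise<string | null>;

interface Found {
  source: string;
  doi: string;
}

// Tries each source in order and returns the first hit, or null if every source misses.
async function findPublished(
  arxivId: string,
  sources: Array<[string, Lookup]>,
): Promise<Found | null> {
  for (const [source, lookup] of sources) {
    try {
      const doi = await lookup(arxivId);
      if (doi !== null) return { source, doi };
    } catch {
      // Assumption: a failing source (network error, rate limit) is skipped.
    }
  }
  return null; // the caller would then fall back to checking arXiv for a newer version
}
```

Ordering the sources from most to least authoritative means a DOI recorded on arXiv itself wins over third-party indexes, and later sources are only queried when earlier ones miss.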

Note

It is not trivial to find the published version correctly. If the search fails, it is best to add the journal article item manually and use the merge feature this plugin provides.

If a published version is found, a new item is created automatically and the published PDF is downloaded. If you do not have access to the journal PDFs, you can disable PDF downloading in the settings and update only the metadata. After that, the preprint item and the newly created journal item are merged using the same logic as described earlier.

JavaScript API
async Zotero.arXivWorkflow.api.arXivUpdate(
  preprintItem: Zotero.Item
)

This function assumes that the argument is an arXiv item; no checks are performed to ensure this. The caller is responsible for performing the checks.

🌐 Download latest PDF

Tip

This feature requires a journal subscription.

Say you have an arXiv paper PDF and import it into Zotero. Zotero finds that the paper has been published and uses the information from the published version. A few days later, you might want to download the published version because it may differ from the arXiv one. With stock Zotero, you have to open the journal URL, download the PDF, and add it as an attachment. With this plugin, it is as easy as right-clicking and selecting "Download latest PDF".

JavaScript API
async Zotero.arXivWorkflow.api.updatePDF(
  journalItem: Zotero.Item
)

This function assumes that the argument is a journal item; no checks are performed to ensure this. The caller is responsible for performing the checks.

Under the hood, this just calls Zotero.Attachments.addAvailableFile and limits the download source to DOI only.

💻 Development

This repo was created from the Zotero plugin template; please follow its quick start guide.

The following resources are also helpful: