Fix vignette #61

Merged
merged 2 commits into from
Jan 22, 2024
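
For context, the failure this PR fixes can be reproduced offline. The sketch below is a minimal illustration, not code from the package: `cats` is a hypothetical stand-in for `arxiv_cats` that uses the new column names, with the statistics rows copied from the vignette output and one illustrative non-statistics row. Because the data frame no longer has an `abbreviation` column, `cats$abbreviation` is `NULL`, `grep()` finds no matches, and the subset comes back with zero rows without raising an error.

```r
# Hypothetical stand-in for arxiv_cats, using the new column names.
# The math.AG row is illustrative only.
cats <- data.frame(
  category = c("math.AG", "stat.AP", "stat.CO", "stat.ME"),
  field = c("Mathematics", "Statistics", "Statistics", "Statistics"),
  short_description = c("Algebraic Geometry", "Applications",
                        "Computation", "Methodology"),
  stringsAsFactors = FALSE
)

# Old vignette code: the column no longer exists, so `$` returns NULL,
# grep() on NULL returns integer(0), and the result has zero rows.
old_rows <- cats[grep("^stat", cats$abbreviation), ]
nrow(old_rows)  # 0

# Revised vignette code: filter on the `field` column instead.
new_rows <- cats[cats$field == "Statistics", c("category", "short_description")]
new_rows$category
```

The silent zero-row result is why the old vignette still built and rendered a `<0 rows>` table instead of failing.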
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: aRxiv
Title: Interface to the arXiv API
-Version: 0.8
+Version: 0.9.1
Date: 2024-01-22
Authors@R: c(person("Karthik", "Ram", role="aut",
email="[email protected]", comment=c(ORCID = "0000-0002-0233-1757")),
9 changes: 9 additions & 0 deletions NEWS.md
@@ -1,3 +1,12 @@
+aRxiv 0.9.1
+-----------
+
+### BUG FIXES
+
+* Small revision to aRxiv vignette to deal with the change in the
+structure of the `arxiv_cats` dataset.
+
+
aRxiv 0.8
---------

6 changes: 3 additions & 3 deletions README.md
@@ -32,12 +32,12 @@ install.packages("aRxiv")

__Development Version__

-Or use `devtools::install_github()` to get the (more recent) version
+Or use `remotes::install_github()` to get the (more recent) version
at [GitHub](https://github.com/rOpenSci/aRxiv):

```r
-install.packages("devtools")
-library(devtools)
+install.packages("remotes")
+library(remotes)
install_github("ropensci/aRxiv")
```

53 changes: 31 additions & 22 deletions inst/doc/aRxiv.html
@@ -350,10 +350,10 @@ <h2>Installation</h2>
<p>You can install the <a href="https://github.com/rOpenSci/aRxiv">aRxiv
package</a> via <a href="https://cran.r-project.org">CRAN</a>:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="st">&quot;aRxiv&quot;</span>)</span></code></pre></div>
-<p>Or use <code>devtools::install_github()</code> to get the (possibly
+<p>Or use <code>remotes::install_github()</code> to get the (possibly
more recent) version at <a href="https://github.com/rOpenSci/aRxiv">GitHub</a>:</p>
-<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="st">&quot;devtools&quot;</span>)</span>
-<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="fu">library</span>(devtools)</span>
+<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="st">&quot;remotes&quot;</span>)</span>
+<span id="cb2-2"><a href="#cb2-2" tabindex="-1"></a><span class="fu">library</span>(remotes)</span>
<span id="cb2-3"><a href="#cb2-3" tabindex="-1"></a><span class="fu">install_github</span>(<span class="st">&quot;ropensci/aRxiv&quot;</span>)</span></code></pre></div>
</div>
<div id="basic-use" class="section level2">
@@ -487,18 +487,27 @@ <h3>Search terms</h3>
<h3>Subject classifications</h3>
<p>arXiv has a set of 155 subject classifications, searchable with the
prefix <code>cat:</code>. The aRxiv package contains a dataset
-<code>arxiv_cats</code> containing the abbreviations and descriptions.
-Here are the statistics categories.</p>
-<div class="sourceCode" id="cb32"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1" tabindex="-1"></a>arxiv_cats[<span class="fu">grep</span>(<span class="st">&#39;^stat&#39;</span>, arxiv_cats<span class="sc">$</span>abbreviation),]</span></code></pre></div>
-<pre><code>## [1] category field subfield short_description long_description
-## &lt;0 rows&gt; (or 0-length row.names)</code></pre>
+<code>arxiv_cats</code> containing the categories, short and long
+descriptions, as well as field (and, for Physics, subfield). Here are
+the column names.</p>
+<div class="sourceCode" id="cb32"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb32-1"><a href="#cb32-1" tabindex="-1"></a><span class="fu">colnames</span>(arxiv_cats)</span></code></pre></div>
+<pre><code>## [1] &quot;category&quot; &quot;field&quot; &quot;subfield&quot; &quot;short_description&quot; &quot;long_description&quot;</code></pre>
+<p>Here are the statistics categories.</p>
+<div class="sourceCode" id="cb34"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1" tabindex="-1"></a>arxiv_cats[arxiv_cats<span class="sc">$</span>field<span class="sc">==</span><span class="st">&quot;Statistics&quot;</span>, <span class="fu">c</span>(<span class="st">&quot;category&quot;</span>, <span class="st">&quot;short_description&quot;</span>)]</span></code></pre></div>
+<pre><code>## category short_description
+## 150 stat.AP Applications
+## 151 stat.CO Computation
+## 152 stat.ME Methodology
+## 153 stat.ML Machine Learning
+## 154 stat.OT Other Statistics
+## 155 stat.TH Statistics Theory</code></pre>
<p>To search these categories, you need to include either the full term
or use the <code>*</code> wildcard.</p>
-<div class="sourceCode" id="cb34"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb34-1"><a href="#cb34-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb36"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 0</code></pre>
-<div class="sourceCode" id="cb36"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb36-1"><a href="#cb36-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat.AP&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb38"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat.AP&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 17577</code></pre>
-<div class="sourceCode" id="cb38"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb38-1"><a href="#cb38-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat*&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb40"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;cat:stat*&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 114647</code></pre>
</div>
<div id="dates-and-ranges-of-dates" class="section level3">
@@ -513,26 +522,26 @@ <h3>Dates and ranges of dates</h3>
<code>2007-10-18 12:25:34</code>. You can use <code>*</code> for a
wildcard for the times. For example, to get all manuscripts with initial
submission on 2007-10-18:</p>
-<div class="sourceCode" id="cb40"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb40-1"><a href="#cb40-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:20071018*&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb42"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:20071018*&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 196</code></pre>
<p>But you can’t use the wildcard within the <em>dates</em>.</p>
-<div class="sourceCode" id="cb42"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb42-1"><a href="#cb42-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:2007*&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb44"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:2007*&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 0</code></pre>
<p>To get a count of all manuscripts with original submission in 2007,
use a date range, like <code>[from_date TO to_date]</code>. (If you give
a partial date, it’s treated as the earliest date/time that matches, and
the range appears to be up to but not including the second
date/time.)</p>
-<div class="sourceCode" id="cb44"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb44-1"><a href="#cb44-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:[2007 TO 2008]&#39;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb46"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&#39;submittedDate:[2007 TO 2008]&#39;</span>)</span></code></pre></div>
<pre><code>## [1] 55749</code></pre>
</div>
</div>
<div id="search-results" class="section level2">
<h2>Search results</h2>
<p>The output of <code>arxiv_search()</code> is a data frame with the
following columns.</p>
-<div class="sourceCode" id="cb46"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb46-1"><a href="#cb46-1" tabindex="-1"></a>res <span class="ot">&lt;-</span> <span class="fu">arxiv_search</span>(<span class="st">&#39;au:&quot;Peter Hall&quot;&#39;</span>)</span>
-<span id="cb46-2"><a href="#cb46-2" tabindex="-1"></a><span class="fu">names</span>(res)</span></code></pre></div>
+<div class="sourceCode" id="cb48"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1" tabindex="-1"></a>res <span class="ot">&lt;-</span> <span class="fu">arxiv_search</span>(<span class="st">&#39;au:&quot;Peter Hall&quot;&#39;</span>)</span>
+<span id="cb48-2"><a href="#cb48-2" tabindex="-1"></a><span class="fu">names</span>(res)</span></code></pre></div>
<pre><code>## [1] &quot;id&quot; &quot;submitted&quot; &quot;updated&quot; &quot;title&quot; &quot;abstract&quot;
## [6] &quot;authors&quot; &quot;affiliations&quot; &quot;link_abstract&quot; &quot;link_pdf&quot; &quot;link_doi&quot;
## [11] &quot;comment&quot; &quot;journal_ref&quot; &quot;doi&quot; &quot;primary_category&quot; &quot;categories&quot;</code></pre>
@@ -551,9 +560,9 @@ <h2>Search results</h2>
Classification System</a> (e.g., F.2.2). These are not searchable with
<code>cat:</code> but are searchable with a general search.</li>
</ul>
-<div class="sourceCode" id="cb48"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb48-1"><a href="#cb48-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&quot;cat:14J60&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb50"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb50-1"><a href="#cb50-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&quot;cat:14J60&quot;</span>)</span></code></pre></div>
<pre><code>## [1] 0</code></pre>
-<div class="sourceCode" id="cb50"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb50-1"><a href="#cb50-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&quot;14J60&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb52"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb52-1"><a href="#cb52-1" tabindex="-1"></a><span class="fu">arxiv_count</span>(<span class="st">&quot;14J60&quot;</span>)</span></code></pre></div>
<pre><code>## [1] 870</code></pre>
</div>
<div id="sorting-results" class="section level2">
@@ -567,9 +576,9 @@ <h2>Sorting results</h2>
the order in <code>id_list</code>.</p>
<p>Here’s an example, to sort the results by the date the manuscripts
were last updated, in descending order.</p>
-<div class="sourceCode" id="cb52"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb52-1"><a href="#cb52-1" tabindex="-1"></a>res <span class="ot">&lt;-</span> <span class="fu">arxiv_search</span>(<span class="st">&#39;au:&quot;Peter Hall&quot; AND ti:deconvolution&#39;</span>,</span>
-<span id="cb52-2"><a href="#cb52-2" tabindex="-1"></a> <span class="at">sort_by=</span><span class="st">&quot;updated&quot;</span>, <span class="at">ascending=</span><span class="cn">FALSE</span>)</span>
-<span id="cb52-3"><a href="#cb52-3" tabindex="-1"></a>res<span class="sc">$</span>updated</span></code></pre></div>
+<div class="sourceCode" id="cb54"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb54-1"><a href="#cb54-1" tabindex="-1"></a>res <span class="ot">&lt;-</span> <span class="fu">arxiv_search</span>(<span class="st">&#39;au:&quot;Peter Hall&quot; AND ti:deconvolution&#39;</span>,</span>
+<span id="cb54-2"><a href="#cb54-2" tabindex="-1"></a> <span class="at">sort_by=</span><span class="st">&quot;updated&quot;</span>, <span class="at">ascending=</span><span class="cn">FALSE</span>)</span>
+<span id="cb54-3"><a href="#cb54-3" tabindex="-1"></a>res<span class="sc">$</span>updated</span></code></pre></div>
<pre><code>## [1] &quot;2010-03-01 11:33:37&quot; &quot;2008-10-27 14:27:52&quot; &quot;2008-04-04 12:19:05&quot; &quot;2007-10-18 12:25:34&quot;</code></pre>
</div>
<div id="technical-details" class="section level2">
@@ -605,7 +614,7 @@ <h3>Limit time between search requests</h3>
period for the delay configurable with the R option
<code>&quot;aRxiv_delay&quot;</code> (in seconds). The default is 3 seconds.</p>
<p>To reduce the delay to 1 second, use:</p>
-<div class="sourceCode" id="cb54"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb54-1"><a href="#cb54-1" tabindex="-1"></a><span class="fu">options</span>(<span class="at">aRxiv_delay=</span><span class="dv">1</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb56"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb56-1"><a href="#cb56-1" tabindex="-1"></a><span class="fu">options</span>(<span class="at">aRxiv_delay=</span><span class="dv">1</span>)</span></code></pre></div>
<p><strong>Don’t</strong> do searches in parallel (e.g., via the
parallel package). You may be locked out from the arXiv API.</p>
</div>
17 changes: 12 additions & 5 deletions vignettes/aRxiv.Rmd
@@ -38,12 +38,12 @@ via [CRAN](https://cran.r-project.org):
install.packages("aRxiv")
```

-Or use `devtools::install_github()` to get the (possibly more recent) version
+Or use `remotes::install_github()` to get the (possibly more recent) version
at [GitHub](https://github.com/rOpenSci/aRxiv):

```{r install_pkgs, eval=FALSE}
-install.packages("devtools")
-library(devtools)
+install.packages("remotes")
+library(remotes)
install_github("ropensci/aRxiv")
```

@@ -185,11 +185,18 @@ arxiv_count('au:"P Hall"')

arXiv has a set of `r nrow(arxiv_cats)` subject classifications,
searchable with the prefix `cat:`. The aRxiv package contains a
-dataset `arxiv_cats` containing the abbreviations and descriptions.
+dataset `arxiv_cats` containing the categories, short and long
+descriptions, as well as field (and, for Physics, subfield).
+Here are the column names.
+
+```{r arxiv_cats_colnames}
+colnames(arxiv_cats)
+```
+
Here are the statistics categories.

```{r arxiv_cats}
-arxiv_cats[grep('^stat', arxiv_cats$abbreviation),]
+arxiv_cats[arxiv_cats$field=="Statistics", c("category", "short_description")]
```

To search these categories, you need to include either the full term
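
One way to keep a change like this from slipping through again is to have a vignette chunk check its own result, so that a missing column or an empty subset stops the build instead of rendering an empty table. The following is only a sketch of that idea and is not part of this PR: `check_subset()` is a hypothetical helper, and `cats` is a small stand-in for `arxiv_cats`.

```r
# Hypothetical helper: fail loudly if required columns are missing
# or if the requested subset comes back empty.
check_subset <- function(df, required_cols, rows) {
  missing_cols <- setdiff(required_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  out <- df[rows, required_cols, drop = FALSE]
  if (nrow(out) == 0) {
    stop("subset is empty")
  }
  out
}

# Small stand-in for arxiv_cats (assumption: same column names as the
# revised dataset, only two rows).
cats <- data.frame(
  category = c("stat.AP", "stat.CO"),
  field = c("Statistics", "Statistics"),
  short_description = c("Applications", "Computation"),
  stringsAsFactors = FALSE
)

stat_rows <- check_subset(cats, c("category", "short_description"),
                          cats$field == "Statistics")

# Asking for a column that does not exist now raises an error
# rather than silently returning NULL.
res <- tryCatch(check_subset(cats, "abbreviation", TRUE),
                error = function(e) conditionMessage(e))
```

The explicit column check is the design choice here: with plain `$` access a renamed column yields `NULL`, which is what let the old chunk run without any error.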