Certain records seem to cause a crash. We have narrowed it down to this query, which should retrieve all records submitted in a one-minute period of 22:16 to 22:17 on January 24, 2018.
dfy <- arxiv_search(query = "submittedDate:[201801242216 TO 201801242217]", limit = 15000, batchsize = 2000)
which returns an error of:
> Error in attr(results, "search_info") <- search_attributes(query, id_list, :
> attempt to set an attribute on NULL
If we were to search using title, the same error appears:
dfy <- arxiv_search(query = "ti:Fourfolds", limit = 1200, batchsize = 300)
We therefore think that the record may be corrupt (e.g., it may contain a hidden, unintentional column delimiter).
A similar error occurs on this single-date range, though we have not isolated the individual record causing the error:
dfy <- arxiv_search(query = "submittedDate:[201612030000 TO 201612040000]", limit = 15000, batchsize = 2000)
Does the query need to be modified? Can the query auto-skip corrupt records? Should arxiv be notified?
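One way to narrow a failing date range down to a one-minute window, as was done by hand for the first query, is to bisect it. The sketch below is only an illustration: `fmt_stamp`, `date_query`, and `narrow_window` are hypothetical helpers, not part of aRxiv, and `fails` is a caller-supplied function that reports whether the search over a given window errors. Times are assumed to be UTC.

```r
# Hypothetical helper: format a POSIXct time as YYYYMMDDHHMM, the stamp
# format used in the submittedDate queries above (UTC is an assumption).
fmt_stamp <- function(t) format(t, "%Y%m%d%H%M", tz = "UTC")

# Hypothetical helper: build a submittedDate range query for a window.
date_query <- function(start, end) {
  sprintf("submittedDate:[%s TO %s]", fmt_stamp(start), fmt_stamp(end))
}

# Bisect [start, end] on whole minutes until the window is one minute long.
# `fails(start, end)` must return TRUE when searching that window errors.
narrow_window <- function(start, end, fails) {
  repeat {
    mins <- as.numeric(difftime(end, start, units = "mins"))
    if (mins <= 1) break
    mid <- start + 60 * floor(mins / 2)
    if (fails(start, mid)) {
      end <- mid
    } else if (fails(mid, end)) {
      start <- mid
    } else {
      break  # neither half fails on its own, so stop narrowing
    }
  }
  c(start = fmt_stamp(start), end = fmt_stamp(end))
}

# Possible `fails` built on the search call from this report (needs network):
# fails <- function(s, e) inherits(
#   try(arxiv_search(query = date_query(s, e), limit = 15000, batchsize = 2000),
#       silent = TRUE),
#   "try-error")
```

Each step issues at most two searches, so a full day is reduced to a single minute in roughly a dozen rounds. If the error only appears when both halves are searched together, the loop stops early and returns the last window that failed as a whole.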
Thanks for your very clear bug report! I'll look into the details. I see that arxiv_search(query="ti:Fourfolds", limit=100) works but arxiv_search(query="ti:Fourfolds", limit=101) gives the error.
I'll follow both of your suggestions: trap such errors better and also report the problem to arxiv, if there's a problem either with the record or with their API.
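Until such errors are trapped inside the package, a caller could trap them on their side. The following is a minimal sketch, not the package's own fix: `safe_search` is a hypothetical wrapper that takes the search function as an argument (in practice `arxiv_search`) and turns any error into a warning plus a `NULL` result.

```r
# Hypothetical wrapper: run a search function and convert an error into a
# warning, returning NULL so that a calling loop can carry on.
# `search_fun` is passed in so the sketch does not rely on aRxiv internals.
safe_search <- function(search_fun, ...) {
  tryCatch(
    search_fun(...),
    error = function(e) {
      warning("search failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Intended use (needs network and the aRxiv package):
# res <- safe_search(arxiv_search, query = "ti:Fourfolds",
#                    limit = 1200, batchsize = 300)
# if (is.null(res)) message("no usable result for this query")
```

This does not skip an individual bad record; it only keeps a script that loops over many queries from stopping at the first failure.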
Okay, I get it. For this search, you get proper results if limit <= 77, but if limit >= 78, it returns NULL. If batchsize < limit and you're in this latter case, you get the error about assigning attributes to NULL.
> dim(result <- arxiv_search(query="ti:Fourfolds", limit=77))
[1] 77 15
> dim(result <- arxiv_search(query="ti:Fourfolds", limit=78))
[1] 0 15
> dim(result <- arxiv_search(query="ti:Fourfolds", limit=78, batchsize=50))
retrieved batch 1
Error in attr(results, "search_info") <- search_attributes(query, id_list, :
attempt to set an attribute on NULL
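The error message itself comes from base R, which refuses to attach an attribute to `NULL`. The sketch below reproduces that in isolation and shows one possible guard. `tag_results` is a hypothetical helper for illustration and is not taken from the aRxiv source; falling back to an empty data frame is an assumption about what a sensible result would be.

```r
# Reproduce the message: setting an attribute on NULL is an error in base R.
results <- NULL
msg <- tryCatch({
  attr(results, "search_info") <- list(query = "ti:Fourfolds")
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: replace a NULL result with an empty data frame
# before attaching the search metadata.
tag_results <- function(results, info) {
  if (is.null(results)) results <- data.frame()
  attr(results, "search_info") <- info
  results
}

tagged <- tag_results(NULL, list(query = "ti:Fourfolds", limit = 78))
print(nrow(tagged))
```

With a guard of this kind, the batched call would return an empty result rather than stopping, which matches what the unbatched call with limit=78 already does.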
The record that triggers the error appears to be this one:
https://arxiv.org/abs/1610.04266