I'm working through an example with hundreds of thousands of names.
To easily recover the names, I use the data.frame input supported by the package.
Because there are so many names, some of them may be empty, but they still correspond to a proper uncorrected name and I would like to recover them.
However, empty strings are silently removed from the input and never appear in the output dataset, while NAs don't get any IDs associated with them.
# Problem with IDs when using empty name
taxa_frame = data.frame(
  ID = paste0("test-", 1:4),
  name = c(NA, "Helianthus", "", "")
)
matched = TNRS::TNRS(taxa_frame)
# The ID is not preserved when testing for an empty string, while it is for NA
# or spaces
matched[, 1:5]
#> ID Name_submitted Unmatched_terms Overall_score Name_matched_id
#> 1 <NA> FALSE FALSE NA
#> 2 test-2 Helianthus 1 668749
#> 3 test-4 NA
@Rekyt @bmaitner This issue is closely related to #15 and has to do with how the Perl controller (correction: in the core code, not the API) prepares the request before submitting parallel batches to the (non-parallel) batch-processing application, which in turn submits each name individually to the PHP+MySQL name resolver. Empty strings are stripped because they can cause the resolver to crash. I'm not sure what's going on with NAs. I will need to take a close look at how R NAs get transformed as they are passed from R to PHP to Perl to PHP to MySQL and back.
In any case, it seems like the best way to handle this would be to store the user's original request (names + optional IDs) unaltered as an array, then stitch it back together with the response after the entire request has been processed. That way, rows missing from the response due to empty strings or NAs would be present in their original form in the data returned to the user. I don't think my skills are up to messing with the controller (not my code), but perhaps I could handle it in PHP on the API end. I'll take a look and see what I can do...
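For anyone who needs this now, the same stitch-back idea can be approximated on the R side. The following is only a minimal sketch, not how the API or the package does it: `stitch_results()` is a hypothetical helper, the `matched` data frame is a hand-written mock standing in for the real `TNRS::TNRS()` response (so the example runs offline), and it assumes the IDs in the request are unique.

```r
# Original request, same shape as in the reprex.
taxa_frame <- data.frame(
  ID = paste0("test-", 1:4),
  name = c(NA, "Helianthus", "", ""),
  stringsAsFactors = FALSE
)

# Flag rows whose name is NA or blank; only the rest would be submitted.
is_blank <- is.na(taxa_frame$name) | trimws(taxa_frame$name) == ""
to_submit <- taxa_frame[!is_blank, ]

# In real use: matched <- TNRS::TNRS(to_submit)
# Mock response (illustrative values only) so the sketch runs offline.
matched <- data.frame(
  ID = "test-2",
  Name_submitted = "Helianthus",
  Overall_score = 1,
  stringsAsFactors = FALSE
)

# Hypothetical helper: left-join the response onto the untouched request,
# so every submitted ID comes back even if the service dropped the row.
# Assumes values in `id_col` are unique.
stitch_results <- function(original, response, id_col = "ID") {
  out <- merge(original, response, by = id_col, all.x = TRUE, sort = FALSE)
  # merge() does not guarantee the input order; restore it explicitly.
  out <- out[match(original[[id_col]], out[[id_col]]), , drop = FALSE]
  rownames(out) <- NULL
  out
}

stitched <- stitch_results(taxa_frame, matched)
```

Rows with an NA or empty name keep their ID and original `name` value, and simply carry `NA` in the result columns, which makes them easy to find afterwards with `is.na(stitched$Name_submitted)`.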
Created on 2023-02-14 with reprex v2.0.2