"records are values not objects" on valid JSON #50

Open
smithjd opened this issue Jan 6, 2016 · 1 comment

smithjd commented Jan 6, 2016

This snippet of valid JSON results in the error "records are values not objects":

library(dplyr)
library(tidyjson)

json <- '
    [{"country":"us","city":"Portland","topics":[{"urlkey":"videogame","name":"Video Games","id":4471},{"urlkey":"board-games","name":"Board Games","id":19585},{"urlkey":"computer-programming","name":"Computer programming","id":48471},{"urlkey":"opensource","name":"Open Source","id":563}],"joined":1416349237000,"link":"http://www.meetup.com/members/156440062","bio":"Analytics engineer.  Primarily work in the Hadoop space.","lon":-122.65,"other_services":{},"name":"Aaron Wirick","visited":1443078098000,"self":{"common":{}},"id":156440062,"state":"OR","lat":45.56,"status":"active"}]
    '
json %>% as.tbl_json %>% gather_keys

The issue is also posted on Stack Overflow: http://stackoverflow.com/questions/34624172/what-does-records-are-values-not-objects-mean-in-tidyjson

According to http://jsonlint.com/ the JSON is valid.

@colearendt

I posted an answer on Stack Overflow:

As mentioned in one of the comments, gather_keys expects each record to be a JSON object, but the top level of your JSON is an array. That mismatch is what the error is reporting. You probably want gather_array here.
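To make the array-versus-object distinction concrete, here is a minimal sketch using a small made-up payload (the `small` string and the variable names are mine, not from the question). It assumes `json_types` reports the JSON type of each record in a `type` column, which is how I understand tidyjson to behave:

```r
library(dplyr)
library(tidyjson)

# Hypothetical payload: an array wrapping two objects
small <- '[{"a": 1}, {"a": 2}]'

# Before gathering, the single record is the array itself
top_level <- small %>% as.tbl_json %>% json_types

# gather_array stacks the array so that each element becomes its own record
elements <- small %>% as.tbl_json %>% gather_array %>% json_types

as.character(top_level$type)  # type of the top-level record
as.character(elements$type)   # type of each stacked record
```

Once the records are objects rather than an array, object-oriented verbs such as gather_keys or spread_values have something to work with.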

Further, the other answer on Stack Overflow takes a more brute-force approach, parsing the JSON attribute that tidyjson creates directly. tidyjson provides verbs that handle this in a somewhat cleaner pipeline, if desired:

library(dplyr)
library(tidyjson)

json <- '
[{"country":"us","city":"Portland"
,"topics":[
 {"urlkey":"videogame","name":"Video Games","id":4471}
 ,{"urlkey":"board-games","name":"Board Games","id":19585}
 ,{"urlkey":"computer-programming","name":"Computer programming","id":48471}
 ,{"urlkey":"opensource","name":"Open Source","id":563}
]
,"joined":1416349237000
,"link":"http://www.meetup.com/members/156440062"
,"bio":"Analytics engineer.  Primarily work in the Hadoop space."
,"lon":-122.65,"other_services":{}
,"name":"Aaron Wirick","visited":1443078098000
,"self":{"common":{}}
,"id":156440062,"state":"OR","lat":45.56,"status":"active"
}]
'

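# gather_array stacks the top-level array into one row per member
mydf <- json %>% as.tbl_json %>% gather_array %>%
  spread_values(
    country = jstring('country'),
    city = jstring('city'),
    joined = jnumber('joined'),
    bio = jstring('bio')
  ) %>%
  # descend into each member's "topics" array and stack it: one row per topic
  enter_object('topics') %>%
  gather_array %>%
  spread_values(urlkey = jstring('urlkey'))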

This pipeline really shines if there are multiple such objects in the array. Hope that is helpful, even if very long after the fact!
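To illustrate the multi-object case, here is a minimal sketch with a made-up two-member payload, trimmed to the fields it uses. The `members` data and the "member" and "topic" index column names are assumptions chosen for illustration; they are passed as the column name argument of gather_array so the two index columns stay distinct.

```r
library(dplyr)
library(tidyjson)

# Hypothetical payload: two members, each with their own topics array
members <- '[
  {"city": "Portland", "topics": [{"urlkey": "videogame"}, {"urlkey": "opensource"}]},
  {"city": "Seattle",  "topics": [{"urlkey": "board-games"}]}
]'

topics <- members %>%
  as.tbl_json %>%
  gather_array("member") %>%                  # one row per member
  spread_values(city = jstring("city")) %>%   # keep the member's city
  enter_object("topics") %>%                  # descend into the topics array
  gather_array("topic") %>%                   # one row per topic per member
  spread_values(urlkey = jstring("urlkey"))

topics
```

The same pipeline handles any number of members without changes, and each topic row carries its member's city alongside it.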
