lookup_release_by_id: can't include recordings #9

Open
gregrs-uk opened this issue Sep 16, 2020 · 6 comments

Comments

@gregrs-uk

Thanks for writing this package.

It seems that using lookup_release_by_id(mbid, includes = c("recordings")) doesn't actually get any recordings. Please see the reprex below.

library(musicbrainz)

This gives some recordings:

sanborn_id <- search_artists("David+Sanborn",1)$mbid
#> Returning artists 1 to 1 of 11127
lookup_artist_by_id(sanborn_id, includes=c("recordings"))$recordings
#> [[1]]
#> # A tibble: 25 x 5
#>    recording_mbid                 disambiguation length title              video
#>    <chr>                          <chr>           <int> <chr>              <lgl>
#>  1 116f19cb-f7ac-4d3a-8849-7fe8e… ""             269333 'way 'Cross Georg… FALSE
#>  2 340ee14b-5fb5-4be4-b35c-82804… ""             358866 5:15               FALSE
#>  3 bc77006d-520b-4e15-8aca-e1d9f… ""             407760 5:15               FALSE
#>  4 f71f2e3d-bf1e-43bf-a812-4c2a6… ""             334200 7th Ave.           FALSE
#>  5 71183654-6452-48e3-87be-11438… ""             307960 A Change of Heart  FALSE
#>  6 2d9aa101-819f-45f7-b8ce-bcfac… ""             320186 A La Verticale     FALSE
#>  7 f45e420b-240a-4adb-b5d3-26a57… ""             334333 A Tear for Crystal FALSE
#>  8 6ab8b1f4-de0d-40ea-96dc-95842… ""             423226 A Tear for Crystal FALSE
#>  9 4915c526-8747-40ff-9d07-4d1d1… ""             334000 A Tear For Crysta… FALSE
#> 10 2dc8530e-fd8c-4147-acc2-aa783… ""             315267 Again an Again     FALSE
#> # … with 15 more rows

This doesn't give any recordings, but I would've expected to get the recordings that make up the release.

the_wall_mbid <- search_releases("The Wall AND artist:Pink Floyd",1)$mbid
#> Returning releases 1 to 1 of 109
lookup_release_by_id(the_wall_mbid, includes = c("recordings"))$recordings
#> [[1]]
#> # A tibble: 0 x 0

Created on 2020-09-16 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-09-16                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                              
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)                      
#>  backports     1.1.8      2020-06-17 [1] CRAN (R 3.6.2)                      
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 3.6.2)                      
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 3.6.0)                      
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)                      
#>  curl          4.3        2019-12-02 [1] CRAN (R 3.6.0)                      
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.6.0)                      
#>  devtools      2.3.1      2020-07-21 [1] CRAN (R 3.6.2)                      
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.0)                      
#>  dplyr         1.0.2      2020-08-18 [1] CRAN (R 3.6.2)                      
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 3.6.2)                      
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)                      
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.0)                      
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 3.6.2)                      
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 3.6.0)                      
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 3.6.2)                      
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)                      
#>  htmltools     0.5.0      2020-06-16 [1] CRAN (R 3.6.2)                      
#>  httr          1.4.2      2020-07-20 [1] CRAN (R 3.6.2)                      
#>  jsonlite      1.7.1      2020-09-07 [1] CRAN (R 3.6.2)                      
#>  knitr         1.29       2020-06-23 [1] CRAN (R 3.6.2)                      
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 3.6.0)                      
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)                      
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)                      
#>  musicbrainz * 0.0.0.9000 2020-09-16 [1] Github (dmi3kno/musicbrainz@537ed0d)
#>  pillar        1.4.6      2020-07-10 [1] CRAN (R 3.6.2)                      
#>  pkgbuild      1.1.0      2020-07-13 [1] CRAN (R 3.6.2)                      
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.0)                      
#>  pkgload       1.1.0      2020-05-29 [1] CRAN (R 3.6.2)                      
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.0)                      
#>  processx      3.4.3      2020-07-05 [1] CRAN (R 3.6.2)                      
#>  ps            1.3.4      2020-08-11 [1] CRAN (R 3.6.2)                      
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 3.6.2)                      
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.0)                      
#>  ratelimitr    0.4.1      2018-10-07 [1] CRAN (R 3.6.0)                      
#>  remotes       2.2.0      2020-07-21 [1] CRAN (R 3.6.2)                      
#>  rlang         0.4.7      2020-07-09 [1] CRAN (R 3.6.2)                      
#>  rmarkdown     2.3        2020-06-18 [1] CRAN (R 3.6.2)                      
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)                      
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)                      
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 3.6.0)                      
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.6.0)                      
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 3.6.0)                      
#>  tibble        3.0.3      2020-07-10 [1] CRAN (R 3.6.2)                      
#>  tidyr         1.1.2      2020-08-27 [1] CRAN (R 3.6.2)                      
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 3.6.2)                      
#>  usethis       1.6.1      2020-04-29 [1] CRAN (R 3.6.2)                      
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.6.0)                      
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 3.6.2)                      
#>  withr         2.2.0      2020-04-20 [1] CRAN (R 3.6.2)                      
#>  xfun          0.16       2020-07-24 [1] CRAN (R 3.6.2)                      
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.0)                      
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
@gregrs-uk
Author

This is probably related to #1

@gregrs-uk
Author

Some further information below

library(musicbrainz)

the_wall_mbid <- search_releases("The Wall AND artist:Pink Floyd",1)$mbid
#> Returning releases 1 to 1 of 109
# using internal function that would be called by lookup_release_by_id
the_wall <- musicbrainz:::lookup_by_id("release", the_wall_mbid, "recordings")

# nothing is returned under recordings
the_wall$recordings
#> NULL

# but the recordings are included under media
cd1_tracks <- purrr::pluck(the_wall, "media", 1, "tracks")
purrr::map_chr(cd1_tracks, ~ purrr::pluck(.x, "recording", "id"))
#>  [1] "1be4459a-4bda-4530-bf8e-421fdddbdbfc"
#>  [2] "eff36e5e-5abb-4f7e-a0aa-64c154c5b28d"
#>  [3] "82d8444b-3478-480c-8652-bf8aa6cb3dc5"
#>  [4] "84978bae-46e2-4d9d-b390-5bd8a1c31983"
#>  [5] "597eae0b-ccb6-4076-b997-0ce940586076"
#>  [6] "2423ca5e-99d3-4b49-be53-542e6ee38a82"
#>  [7] "bb0d3c16-b684-4a5a-a107-e12bbd4f3041"
#>  [8] "b56432a2-6231-4c6f-af00-2656804a95d1"
#>  [9] "919dd9f4-cd78-4359-8e89-195abd9b79cf"
#> [10] "7003ec02-7bc2-4054-b477-f739efd6df45"
#> [11] "c0a217dc-4c2c-45f9-afd4-9b5f366386ef"
#> [12] "ae5096a0-4b53-455a-ac9c-0348108208c3"
#> [13] "bc3448ee-53bb-412a-99d6-ea6cc862928a"

Created on 2020-10-18 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-10-18                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 3.6.2)
#>  callr         3.4.4      2020-09-07 [1] CRAN (R 3.6.2)
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 3.6.0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)
#>  curl          4.3        2019-12-02 [1] CRAN (R 3.6.0)
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.3.2      2020-09-18 [1] CRAN (R 3.6.2)
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.0)
#>  dplyr         1.0.2      2020-08-18 [1] CRAN (R 3.6.2)
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 3.6.2)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.0)
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 3.6.2)
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 3.6.0)
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 3.6.2)
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.5.0      2020-06-16 [1] CRAN (R 3.6.2)
#>  httr          1.4.2      2020-07-20 [1] CRAN (R 3.6.2)
#>  jsonlite      1.7.1      2020-09-07 [1] CRAN (R 3.6.2)
#>  knitr         1.30       2020-09-22 [1] CRAN (R 3.6.2)
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 3.6.0)
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)
#>  musicbrainz * 0.0.0.9000 2020-10-18 [1] local         
#>  pillar        1.4.6      2020-07-10 [1] CRAN (R 3.6.2)
#>  pkgbuild      1.1.0      2020-07-13 [1] CRAN (R 3.6.2)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.0)
#>  pkgload       1.1.0      2020-05-29 [1] CRAN (R 3.6.2)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.0)
#>  processx      3.4.4      2020-09-03 [1] CRAN (R 3.6.2)
#>  ps            1.3.4      2020-08-11 [1] CRAN (R 3.6.2)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 3.6.2)
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.0)
#>  ratelimitr    0.4.1      2018-10-07 [1] CRAN (R 3.6.0)
#>  remotes       2.2.0      2020-07-21 [1] CRAN (R 3.6.2)
#>  rlang         0.4.7      2020-07-09 [1] CRAN (R 3.6.2)
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 3.6.3)
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 3.6.2)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 3.6.0)
#>  tibble        3.0.3      2020-07-10 [1] CRAN (R 3.6.2)
#>  tidyr         1.1.2      2020-08-27 [1] CRAN (R 3.6.2)
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 3.6.2)
#>  usethis       1.6.3      2020-09-17 [1] CRAN (R 3.6.2)
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 3.6.2)
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 3.6.2)
#>  xfun          0.18       2020-09-29 [1] CRAN (R 3.6.2)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.0)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

@dmi3kno
Owner

dmi3kno commented Oct 18, 2020

Thank you for reporting this. The bug appeared when dplyr::bind_cols() became stricter about the lengths of the tibbles being combined. In older versions of dplyr, binding still worked when the right-hand table had length 0 (the binding was simply ignored). I have added a check so that includes are bound to the main table only when they are present.
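
A minimal sketch of the kind of guard described above, assuming the parsed includes arrive as a tibble that may be empty. The helper name `bind_includes()` and the example values are hypothetical and are not taken from the package source.

```r
library(dplyr)
library(tibble)

# Hypothetical helper (not the package's actual code): attach parsed
# includes to the main table only when there is something to attach.
bind_includes <- function(main, includes = NULL) {
  if (is.null(includes) || ncol(includes) == 0 || nrow(includes) == 0) {
    return(main)  # nothing to bind, so skip bind_cols() entirely
  }
  bind_cols(main, includes)
}

# Made-up example data
main <- tibble(mbid = "abc", title = "Example release")
empty_includes <- tibble()
parsed_includes <- tibble(recordings = list(tibble(title = "Track 1")))

res_empty <- bind_includes(main, empty_includes)   # main is returned untouched
res_full  <- bind_includes(main, parsed_includes)  # list-column is appended
```

Skipping the bind when the right-hand side is empty avoids depending on how a given dplyr version recycles zero-length inputs.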

With regard to The Wall, you should be looking for release groups and then for the releases within them.

library(musicbrainz)
library(tidyverse)

pf_id <- search_artists("Pink Floyd",1)$mbid

tw_rg_id <- browse_release_groups_by("artist", pf_id) %>% 
  filter(str_detect(title, "Wall")) %>% 
  pull(mbid)

tw_r <- browse_releases_by("release-group", tw_rg_id) 

tw_r %>% View

It is true that a release will have media associated with it, but I am not currently pulling it. Feel free to add a parser for it and PR a lookup_release_media_by_id().

@gregrs-uk
Author

Thanks @dmi3kno.

Regarding The Wall, I was using that example because it's present in the lookup_entities_by_id documentation, so it may be worth replacing that example if it's not ideal.

The main issue is that lookup_release_by_id allows recordings as a valid include, but doesn't actually return any recordings (as per the last example in my first post). Perhaps it would be clearer to remove recordings from the list of valid includes.

During the call to lookup_release_by_id, information about the recordings is fetched by the internal lookup_by_id function but doesn't get parsed because it's under media rather than recordings.
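
To illustrate what parsing the media branch could look like, here is a rough sketch that walks a mock list shaped like the parsed JSON shown earlier (media, then tracks, then recording). The helper `extract_release_recordings()` is hypothetical, the ids and titles are placeholders, and the `position` and `title` fields are assumptions about the response rather than something confirmed in this thread.

```r
library(purrr)
library(tibble)

# Mock of the parsed JSON shape (media -> tracks -> recording).
# Ids and titles are placeholders, not real MusicBrainz data.
mock_release <- list(
  title = "Example release",
  media = list(
    list(position = 1L, tracks = list(
      list(position = 1L, recording = list(id = "rec-1", title = "First")),
      list(position = 2L, recording = list(id = "rec-2", title = "Second"))
    )),
    list(position = 2L, tracks = list(
      list(position = 1L, recording = list(id = "rec-3", title = "Third"))
    ))
  )
)

# Hypothetical helper: one row per track, across all media of a release.
extract_release_recordings <- function(release) {
  map_dfr(pluck(release, "media", .default = list()), function(medium) {
    map_dfr(pluck(medium, "tracks", .default = list()), function(track) {
      tibble(
        medium_position = pluck(medium, "position", .default = NA_integer_),
        track_position  = pluck(track, "position", .default = NA_integer_),
        recording_mbid  = pluck(track, "recording", "id", .default = NA_character_),
        recording_title = pluck(track, "recording", "title", .default = NA_character_)
      )
    })
  })
}

recs <- extract_release_recordings(mock_release)
```

With a real response, the same function could in principle be applied to the list returned by the internal lookup, provided the fields are named as assumed here.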

@dmi3kno
Owner

dmi3kno commented Oct 19, 2020

It is true that lookup_release_by_id allows recordings as a valid include. That is what the documentation says you can query. A medium is NOT the same as a recording according to MusicBrainz.

The help page on Recording contains examples of different Recordings of the same track (live vs studio, original vs remix) for Moby and Abba. Here's my example (from the amazing Norwegian artist Emilie Nicolas) pulled with musicbrainz.

library(musicbrainz)
library(tidyverse)

gu_recs <- search_recordings("Grown+Up AND artist:Emilie+Nicolas") %>% 
  filter(score==100)

gu_recs %>% unnest(releases, names_sep = "_") 

or full information about each release (as supplied in the includes)

map_dfr(gu_recs$mbid, lookup_recording_by_id, includes="releases") %>% 
  unnest(releases, names_sep="_") %>% View

The question is how to get from releases to recordings, and that is a good question:

gu_releases <- gu_recs %>% unnest(releases, names_sep = "_") %>% 
  pull(releases_release_mbid) %>% 
  map_dfr(lookup_release_by_id, includes="recordings") 

But the includes do not supply recordings (nor does the raw JSON, directly):

gu_releases %>% 
  unnest(recordings, names_sep="_")

The proper way of doing it would be to pull a release (if not a release group), then extract its media, parse through the tracks and pick out the recording information. In any case, there will be only one record per track, because a track represents only one version of the song. So, as I said, we probably lack functionality for extracting media per release (or release group). Fair enough, but so far I have not considered this to be in scope, for a number of reasons:

  1. I don't trust MusicBrainz for media information. For that we have Discogs, which is much stricter about registering unique media printings, and even then there are many duplicates.
  2. Media is not a formal "include" in the API, so I thought it might be subject to change more often. It might be trivial to fetch it. Feel free to give it a try and PR. I hope the logic behind the package is reasonably clear: I have parser lists, and I just take the right parser and apply it to the entity of interest. So it should be as easy as adding a parser entry and adding a function or two for actually querying it.
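
The parser-list idea described above might be sketched roughly as follows. This is only an illustration of the dispatch pattern: the `parsers` registry, the `parse_entity()` helper and the field names are all hypothetical, and the package's real parser lists may be organised differently.

```r
library(purrr)

# Hypothetical parser registry: one named list of field extractors per entity.
parsers <- list(
  release = list(
    mbid  = function(x) pluck(x, "id", .default = NA_character_),
    title = function(x) pluck(x, "title", .default = NA_character_)
  ),
  # A new entry of the kind a media parser could occupy (field names assumed)
  media = list(
    position    = function(x) pluck(x, "position", .default = NA_integer_),
    format      = function(x) pluck(x, "format", .default = NA_character_),
    track_count = function(x) length(pluck(x, "tracks", .default = list()))
  )
)

# Pick the parser set by entity name and apply each field parser to one item.
parse_entity <- function(item, entity) {
  map(parsers[[entity]], function(f) f(item))
}

# Placeholder medium with two (empty) tracks
mock_medium <- list(position = 1L, format = "CD", tracks = list(list(), list()))
parsed <- parse_entity(mock_medium, "media")
```

Under this pattern, supporting a new entity amounts to adding one entry to the registry plus a thin query function that calls `parse_entity()` on each returned item.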

@gregrs-uk
Author

Thanks @dmi3kno. It just seemed confusing to me as a user that you could set recordings as an include to the lookup_release_by_id function but that nothing was returned. I now understand why this is but it could be a potential point of confusion for new users.

In relation to getting recordings from a release, I notice that if just recordings is supplied as the inc argument to an API call, the information regarding recordings is supplied within media e.g. this query. Without supplying an inc argument, media is not returned. So if you want to get the recordings for a release, you have to parse media, as you've said above.
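
For reference, a direct call of the kind described above could be sketched like this. The helper names are hypothetical, the user-agent string is a placeholder you would replace with your own, and the MBID in the usage line is a dummy value rather than a real release.

```r
# Hypothetical helper: build the MusicBrainz lookup URL for a release,
# asking for recordings via the inc parameter and for JSON output.
release_lookup_url <- function(mbid, inc = "recordings") {
  paste0("https://musicbrainz.org/ws/2/release/", mbid,
         "?inc=", inc, "&fmt=json")
}

# Hypothetical helper: fetch and parse the response (needs httr and jsonlite
# plus network access, so it is defined here but not called).
fetch_release <- function(mbid, inc = "recordings") {
  resp <- httr::GET(
    release_lookup_url(mbid, inc),
    httr::user_agent("example-script/0.1 (you@example.com)")  # placeholder
  )
  httr::stop_for_status(resp)
  jsonlite::fromJSON(
    httr::content(resp, as = "text", encoding = "UTF-8"),
    simplifyVector = FALSE
  )
}

# Dummy MBID, for illustration only
url <- release_lookup_url("00000000-0000-0000-0000-000000000000")
```

Calling `names(fetch_release(mbid))` with and without `inc = "recordings"` would be one way to compare which top-level elements come back in each case.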

> The media is not a formal "include" in the API and therefore I thought it might be subject to change more often.

Both recordings and media do seem to be listed formally in the API docs (the latter under the heading "inc= arguments which affect subqueries"), but it's not made entirely clear that the recording information is returned under media, although the Note does give a hint.

To me, getting the recordings for a release seems like a useful thing to be able to do, partly because of the wealth of information that's associated with the recording, particularly in relationships for associated performers and works. It's also very handy for classical music where the main performers are listed as the recording artist rather than the track artist, which is used for the composer.

If I get time and can get my head around the necessary parsing then I will try to add a PR.

Thanks for your time and for writing this package.
