From f3214d5d4553ef54d7b52cccd4b1980f5314e56a Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Wed, 15 May 2024 14:47:33 +0100 Subject: [PATCH 01/19] add episode annotations --- episodes/08-annotations.Rmd | 70 +++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 episodes/08-annotations.Rmd diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd new file mode 100644 index 0000000..2e4e3a9 --- /dev/null +++ b/episodes/08-annotations.Rmd @@ -0,0 +1,70 @@ +--- +source: Rmd +title: Working with genomics ranges +teaching: XX +exercises: XX +--- + +--- + +```{r, echo=FALSE, purl=FALSE, message=FALSE} +source("download_data.R") +``` + +::::::::::::::::::::::::::::::::::::::: objectives + +- Explain how annotations are organised in the Bioconductor project. +- Identify Bioconductor packages and methods available to fetch and use annotations. + +:::::::::::::::::::::::::::::::::::::::::::::::::: + +:::::::::::::::::::::::::::::::::::::::: questions + +- What Bioconductor packages provides methods to efficiently fetch and use annotations? +- How can I use annotation packages to convert between different gene identifiers? + +:::::::::::::::::::::::::::::::::::::::::::::::::: + +```{r, include=FALSE} +``` + +```{r, include=FALSE} +options(htmltools.dir.version = FALSE) +library(RefManageR) +library(bibtex) +bib <- ReadBib("files/bibliography.bib") +``` + +## Install packages + +Before we can proceed into the following sections, we install some Bioconductor packages that we will need. +First, we check that the `r BiocStyle::Biocpkg("BiocManager")` package is installed before trying to use it; otherwise we install it. +Then we use the `BiocManager::install()` function to install the necessary packages. 
+ +```{r, message=FALSE, warning=FALSE, eval=FALSE} +if (!requireNamespace("BiocManager", quietly = TRUE)) + install.packages("BiocManager") + +BiocManager::install(c("biomaRt", "org.Hs.eg.db", "")) +``` + +## Overview + +Packages dedicated to query annotations exist in the 'Software' and 'Annotation' +categories of the Bioconductor [biocViews][biocviews], according to their +nature. + +In the 'Software' section, we find packages that do not _contain_ annotations, +but rather dynamically _query_ them from online resources +(e.g.,[Ensembl BioMart][biomart-ensembl]). +One such Bioconductor package is `r BiocStyle::Biocpkg("biomaRt")`. + +Instead, in the 'Annotation' section, we find packages that _contain_ +annotations. +Examples include `r BiocStyle::Biocpkg("org.Hs.eg.db")`, +`r BiocStyle::Biocpkg("EnsDb.Hsapiens.v86")`, +and `r BiocStyle::Biocpkg("TxDb.Hsapiens.UCSC.hg38.knownGene")`. + + +[biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html +[biomart-ensembl]: https://www.ensembl.org/biomart/martview From be984c914dc49887951cdedbc28e7d19036a2f9c Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Wed, 15 May 2024 14:50:33 +0100 Subject: [PATCH 02/19] add annotations episode to yaml --- config.yaml | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/config.yaml b/config.yaml index c55fe3b..b60d0ec 100644 --- a/config.yaml +++ b/config.yaml @@ -58,7 +58,7 @@ contact: 'team@carpentries.org' # - another-learner.md # Order of episodes in your lesson -episodes: +episodes: - 01-setup.Rmd - 02-introduction-to-bioconductor.Rmd - 03-installing-bioconductor.Rmd @@ -66,15 +66,16 @@ episodes: - 05-s4.Rmd - 06-biological-sequences.Rmd - 07-genomic-ranges.Rmd +- 08-annotations.Rmd # Information for Learners -learners: +learners: # Information for Instructors -instructors: +instructors: # Learner Profiles -profiles: +profiles: # Customisation --------------------------------------------- # From 
c56bb83088385e3769bb024591349c26f25ec5e3 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Wed, 15 May 2024 14:52:37 +0100 Subject: [PATCH 03/19] fix episode title --- episodes/08-annotations.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 2e4e3a9..e6b1e46 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -1,6 +1,6 @@ --- source: Rmd -title: Working with genomics ranges +title: Working with annotations teaching: XX exercises: XX --- From 8505985abda3efdca1c6a02fe043f4707efcedbb Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Wed, 15 May 2024 14:54:59 +0100 Subject: [PATCH 04/19] gene annotations --- episodes/08-annotations.Rmd | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index e6b1e46..0cfeed5 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -1,6 +1,6 @@ --- source: Rmd -title: Working with annotations +title: Working with gene annotations teaching: XX exercises: XX --- @@ -13,15 +13,15 @@ source("download_data.R") ::::::::::::::::::::::::::::::::::::::: objectives -- Explain how annotations are organised in the Bioconductor project. -- Identify Bioconductor packages and methods available to fetch and use annotations. +- Explain how gene annotations are managed in the Bioconductor project. +- Identify Bioconductor packages and methods available to fetch and use gene annotations. :::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::: questions -- What Bioconductor packages provides methods to efficiently fetch and use annotations? -- How can I use annotation packages to convert between different gene identifiers? +- What Bioconductor packages provides methods to efficiently fetch and use gene annotations? 
+- How can I use gene annotation packages to convert between different gene identifiers? :::::::::::::::::::::::::::::::::::::::::::::::::: @@ -37,9 +37,12 @@ bib <- ReadBib("files/bibliography.bib") ## Install packages -Before we can proceed into the following sections, we install some Bioconductor packages that we will need. -First, we check that the `r BiocStyle::Biocpkg("BiocManager")` package is installed before trying to use it; otherwise we install it. -Then we use the `BiocManager::install()` function to install the necessary packages. +Before we can proceed into the following sections, we install some Bioconductor +packages that we will need. +First, we check that the `r BiocStyle::Biocpkg("BiocManager")` package is +installed before trying to use it; otherwise we install it. +Then we use the `BiocManager::install()` function to install the necessary +packages. ```{r, message=FALSE, warning=FALSE, eval=FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) @@ -50,11 +53,12 @@ BiocManager::install(c("biomaRt", "org.Hs.eg.db", "")) ## Overview -Packages dedicated to query annotations exist in the 'Software' and 'Annotation' -categories of the Bioconductor [biocViews][biocviews], according to their -nature. +Packages dedicated to query gene annotations exist in the 'Software' and +'Annotation' categories of the Bioconductor [biocViews][biocviews], according to +their nature. -In the 'Software' section, we find packages that do not _contain_ annotations, +In the 'Software' section, we find packages that do not _contain_ gene +annotations, but rather dynamically _query_ them from online resources (e.g.,[Ensembl BioMart][biomart-ensembl]). One such Bioconductor package is `r BiocStyle::Biocpkg("biomaRt")`. 
From db46e8f8319a76e967e88b06ada47843f4b9980b Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Wed, 15 May 2024 15:00:38 +0100 Subject: [PATCH 05/19] remove empty package name --- episodes/08-annotations.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 0cfeed5..a35397d 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -48,7 +48,7 @@ packages. if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") -BiocManager::install(c("biomaRt", "org.Hs.eg.db", "")) +BiocManager::install(c("biomaRt", "org.Hs.eg.db")) ``` ## Overview From c6a57a25b2b3a8f1d40f9b8196ce6914c2877808 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 13:24:03 +0100 Subject: [PATCH 06/19] update renv --- renv/activate.R | 41 +++++++++++------------------------------ 1 file changed, 11 insertions(+), 30 deletions(-) diff --git a/renv/activate.R b/renv/activate.R index d13f993..9daa13a 100644 --- a/renv/activate.R +++ b/renv/activate.R @@ -6,9 +6,7 @@ local({ attr(version, "sha") <- NULL # the project directory - project <- Sys.getenv("RENV_PROJECT") - if (!nzchar(project)) - project <- getwd() + project <- getwd() # use start-up diagnostics if enabled diagnostics <- Sys.getenv("RENV_STARTUP_DIAGNOSTICS", unset = "FALSE") @@ -131,21 +129,6 @@ local({ } - heredoc <- function(text, leave = 0) { - - # remove leading, trailing whitespace - trimmed <- gsub("^\\s*\\n|\\n\\s*$", "", text) - - # split into lines - lines <- strsplit(trimmed, "\n", fixed = TRUE)[[1L]] - - # compute common indent - indent <- regexpr("[^[:space:]]", lines) - common <- min(setdiff(indent, -1L)) - leave - paste(substring(lines, common), collapse = "\n") - - } - startswith <- function(string, prefix) { substring(string, 1, nchar(prefix)) == prefix } @@ -648,9 +631,6 @@ local({ # if the user has requested an automatic prefix, generate it auto <- 
Sys.getenv("RENV_PATHS_PREFIX_AUTO", unset = NA) - if (is.na(auto) && getRversion() >= "4.4.0") - auto <- "TRUE" - if (auto %in% c("TRUE", "True", "true", "1")) return(renv_bootstrap_platform_prefix_auto()) @@ -842,23 +822,24 @@ local({ # the loaded version of renv doesn't match the requested version; # give the user instructions on how to proceed - dev <- identical(description[["RemoteType"]], "github") - remote <- if (dev) + remote <- if (!is.null(description[["RemoteSha"]])) { paste("rstudio/renv", description[["RemoteSha"]], sep = "@") - else + } else { paste("renv", description[["Version"]], sep = "@") + } # display both loaded version + sha if available friendly <- renv_bootstrap_version_friendly( version = description[["Version"]], - sha = if (dev) description[["RemoteSha"]] + sha = description[["RemoteSha"]] ) - fmt <- heredoc(" - renv %1$s was loaded from project library, but this project is configured to use renv %2$s. - - Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile. - - Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library. 
- ") + fmt <- paste( + "renv %1$s was loaded from project library, but this project is configured to use renv %2$s.", + "- Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile.", + "- Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library.", + sep = "\n" + ) catf(fmt, friendly, renv_bootstrap_version_friendly(version), remote) FALSE From 3b7af2823790b90d84567299f8855a1916acaee1 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 13:30:09 +0100 Subject: [PATCH 07/19] restore renv/activate.R script to main branch --- renv/activate.R | 615 +++++++++++++++++++++++++----------------------- 1 file changed, 317 insertions(+), 298 deletions(-) diff --git a/renv/activate.R b/renv/activate.R index 9daa13a..9039d08 100644 --- a/renv/activate.R +++ b/renv/activate.R @@ -6,7 +6,9 @@ local({ attr(version, "sha") <- NULL # the project directory - project <- getwd() + project <- Sys.getenv("RENV_PROJECT") + if (!nzchar(project)) + project <- getwd() # use start-up diagnostics if enabled diagnostics <- Sys.getenv("RENV_STARTUP_DIAGNOSTICS", unset = "FALSE") @@ -95,50 +97,65 @@ local({ if ("renv" %in% loadedNamespaces()) unloadNamespace("renv") - # load bootstrap tools + # load bootstrap tools `%||%` <- function(x, y) { if (is.null(x)) y else x } - + catf <- function(fmt, ..., appendLF = TRUE) { - + quiet <- getOption("renv.bootstrap.quiet", default = FALSE) if (quiet) return(invisible()) - + msg <- sprintf(fmt, ...) cat(msg, file = stdout(), sep = if (appendLF) "\n" else "") - + invisible(msg) - + } - + header <- function(label, - ..., - prefix = "#", - suffix = "-", - n = min(getOption("width"), 78)) + ..., + prefix = "#", + suffix = "-", + n = min(getOption("width"), 78)) { label <- sprintf(label, ...) 
n <- max(n - nchar(label) - nchar(prefix) - 2L, 8L) if (n <= 0) return(paste(prefix, label)) - + tail <- paste(rep.int(suffix, n), collapse = "") paste0(prefix, " ", label, " ", tail) - + } - + + heredoc <- function(text, leave = 0) { + + # remove leading, trailing whitespace + trimmed <- gsub("^\\s*\\n|\\n\\s*$", "", text) + + # split into lines + lines <- strsplit(trimmed, "\n", fixed = TRUE)[[1L]] + + # compute common indent + indent <- regexpr("[^[:space:]]", lines) + common <- min(setdiff(indent, -1L)) - leave + paste(substring(lines, common), collapse = "\n") + + } + startswith <- function(string, prefix) { substring(string, 1, nchar(prefix)) == prefix } - + bootstrap <- function(version, library) { - + friendly <- renv_bootstrap_version_friendly(version) section <- header(sprintf("Bootstrapping renv %s", friendly)) catf(section) - + # attempt to download renv catf("- Downloading renv ... ", appendLF = FALSE) withCallingHandlers( @@ -150,7 +167,7 @@ local({ ) catf("OK") on.exit(unlink(tarball), add = TRUE) - + # now attempt to install catf("- Installing renv ... 
", appendLF = FALSE) withCallingHandlers( @@ -161,174 +178,174 @@ local({ } ) catf("OK") - + # add empty line to break up bootstrapping from normal output catf("") - + return(invisible()) } - + renv_bootstrap_tests_running <- function() { getOption("renv.tests.running", default = FALSE) } - + renv_bootstrap_repos <- function() { - + # get CRAN repository cran <- getOption("renv.repos.cran", "https://cloud.r-project.org") - + # check for repos override repos <- Sys.getenv("RENV_CONFIG_REPOS_OVERRIDE", unset = NA) if (!is.na(repos)) { - + # check for RSPM; if set, use a fallback repository for renv rspm <- Sys.getenv("RSPM", unset = NA) if (identical(rspm, repos)) repos <- c(RSPM = rspm, CRAN = cran) - + return(repos) - + } - + # check for lockfile repositories repos <- tryCatch(renv_bootstrap_repos_lockfile(), error = identity) if (!inherits(repos, "error") && length(repos)) return(repos) - + # retrieve current repos repos <- getOption("repos") - + # ensure @CRAN@ entries are resolved repos[repos == "@CRAN@"] <- cran - + # add in renv.bootstrap.repos if set default <- c(FALLBACK = "https://cloud.r-project.org") extra <- getOption("renv.bootstrap.repos", default = default) repos <- c(repos, extra) - + # remove duplicates that might've snuck in dupes <- duplicated(repos) | duplicated(names(repos)) repos[!dupes] - + } - + renv_bootstrap_repos_lockfile <- function() { - + lockpath <- Sys.getenv("RENV_PATHS_LOCKFILE", unset = "renv.lock") if (!file.exists(lockpath)) return(NULL) - + lockfile <- tryCatch(renv_json_read(lockpath), error = identity) if (inherits(lockfile, "error")) { warning(lockfile) return(NULL) } - + repos <- lockfile$R$Repositories if (length(repos) == 0) return(NULL) - + keys <- vapply(repos, `[[`, "Name", FUN.VALUE = character(1)) vals <- vapply(repos, `[[`, "URL", FUN.VALUE = character(1)) names(vals) <- keys - + return(vals) - + } - + renv_bootstrap_download <- function(version) { - + sha <- attr(version, "sha", exact = TRUE) - + methods <- if 
(!is.null(sha)) { - + # attempting to bootstrap a development version of renv c( function() renv_bootstrap_download_tarball(sha), function() renv_bootstrap_download_github(sha) ) - + } else { - + # attempting to bootstrap a release version of renv c( function() renv_bootstrap_download_tarball(version), function() renv_bootstrap_download_cran_latest(version), function() renv_bootstrap_download_cran_archive(version) ) - + } - + for (method in methods) { path <- tryCatch(method(), error = identity) if (is.character(path) && file.exists(path)) return(path) } - + stop("All download methods failed") - + } - + renv_bootstrap_download_impl <- function(url, destfile) { - + mode <- "wb" - + # https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17715 fixup <- Sys.info()[["sysname"]] == "Windows" && substring(url, 1L, 5L) == "file:" - + if (fixup) mode <- "w+b" - + args <- list( url = url, destfile = destfile, mode = mode, quiet = TRUE ) - + if ("headers" %in% names(formals(utils::download.file))) args$headers <- renv_bootstrap_download_custom_headers(url) - + do.call(utils::download.file, args) - + } - + renv_bootstrap_download_custom_headers <- function(url) { - + headers <- getOption("renv.download.headers") if (is.null(headers)) return(character()) - + if (!is.function(headers)) stopf("'renv.download.headers' is not a function") - + headers <- headers(url) if (length(headers) == 0L) return(character()) - + if (is.list(headers)) headers <- unlist(headers, recursive = FALSE, use.names = TRUE) - + ok <- is.character(headers) && is.character(names(headers)) && all(nzchar(names(headers))) - + if (!ok) stop("invocation of 'renv.download.headers' did not return a named character vector") - + headers - + } - + renv_bootstrap_download_cran_latest <- function(version) { - + spec <- renv_bootstrap_download_cran_latest_find(version) type <- spec$type repos <- spec$repos - + baseurl <- utils::contrib.url(repos = repos, type = type) ext <- if (identical(type, "source")) ".tar.gz" @@ 
-338,36 +355,36 @@ local({ ".tgz" name <- sprintf("renv_%s%s", version, ext) url <- paste(baseurl, name, sep = "/") - + destfile <- file.path(tempdir(), name) status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (inherits(status, "condition")) return(FALSE) - + # report success and return destfile - + } - + renv_bootstrap_download_cran_latest_find <- function(version) { - + # check whether binaries are supported on this system binary <- getOption("renv.bootstrap.binary", default = TRUE) && !identical(.Platform$pkgType, "source") && !identical(getOption("pkgType"), "source") && Sys.info()[["sysname"]] %in% c("Darwin", "Windows") - + types <- c(if (binary) "binary", "source") - + # iterate over types + repositories for (type in types) { for (repos in renv_bootstrap_repos()) { - + # retrieve package database db <- tryCatch( as.data.frame( @@ -376,89 +393,89 @@ local({ ), error = identity ) - + if (inherits(db, "error")) next - + # check for compatible entry entry <- db[db$Package %in% "renv" & db$Version %in% version, ] if (nrow(entry) == 0) next - + # found it; return spec to caller spec <- list(entry = entry, type = type, repos = repos) return(spec) - + } } - + # if we got here, we failed to find renv fmt <- "renv %s is not available from your declared package repositories" stop(sprintf(fmt, version)) - + } - + renv_bootstrap_download_cran_archive <- function(version) { - + name <- sprintf("renv_%s.tar.gz", version) repos <- renv_bootstrap_repos() urls <- file.path(repos, "src/contrib/Archive/renv", name) destfile <- file.path(tempdir(), name) - + for (url in urls) { - + status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (identical(status, 0L)) return(destfile) - + } - + return(FALSE) - + } - + renv_bootstrap_download_tarball <- function(version) { - + # if the user has provided the path to a tarball via # an environment variable, then use it tarball <- 
Sys.getenv("RENV_BOOTSTRAP_TARBALL", unset = NA) if (is.na(tarball)) return() - + # allow directories if (dir.exists(tarball)) { name <- sprintf("renv_%s.tar.gz", version) tarball <- file.path(tarball, name) } - + # bail if it doesn't exist if (!file.exists(tarball)) { - + # let the user know we weren't able to honour their request fmt <- "- RENV_BOOTSTRAP_TARBALL is set (%s) but does not exist." msg <- sprintf(fmt, tarball) warning(msg) - + # bail return() - + } - + catf("- Using local tarball '%s'.", tarball) tarball - + } - + renv_bootstrap_download_github <- function(version) { - + enabled <- Sys.getenv("RENV_BOOTSTRAP_FROM_GITHUB", unset = "TRUE") if (!identical(enabled, "TRUE")) return(FALSE) - + # prepare download options pat <- Sys.getenv("GITHUB_PAT") if (nzchar(Sys.which("curl")) && nzchar(pat)) { @@ -474,25 +491,25 @@ local({ options(download.file.method = "wget", download.file.extra = extra) on.exit(do.call(base::options, saved), add = TRUE) } - + url <- file.path("https://api.github.com/repos/rstudio/renv/tarball", version) name <- sprintf("renv_%s.tar.gz", version) destfile <- file.path(tempdir(), name) - + status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (!identical(status, 0L)) return(FALSE) - + renv_bootstrap_download_augment(destfile) - + return(destfile) - + } - + # Add Sha to DESCRIPTION. This is stop gap until #890, after which we # can use renv::install() to fully capture metadata. 
renv_bootstrap_download_augment <- function(destfile) { @@ -500,13 +517,13 @@ local({ if (is.null(sha)) { return() } - + # Untar tempdir <- tempfile("renv-github-") on.exit(unlink(tempdir, recursive = TRUE), add = TRUE) untar(destfile, exdir = tempdir) pkgdir <- dir(tempdir, full.names = TRUE)[[1]] - + # Modify description desc_path <- file.path(pkgdir, "DESCRIPTION") desc_lines <- readLines(desc_path) @@ -520,170 +537,173 @@ local({ paste("RemoteSha: ", sha) ) writeLines(c(desc_lines[desc_lines != ""], remotes_fields), con = desc_path) - + # Re-tar local({ old <- setwd(tempdir) on.exit(setwd(old), add = TRUE) - + tar(destfile, compression = "gzip") }) invisible() } - + # Extract the commit hash from a git archive. Git archives include the SHA1 # hash as the comment field of the tarball pax extended header # (see https://www.kernel.org/pub/software/scm/git/docs/git-archive.html) # For GitHub archives this should be the first header after the default one # (512 byte) header. renv_bootstrap_git_extract_sha1_tar <- function(bundle) { - + # open the bundle for reading # We use gzcon for everything because (from ?gzcon) # > Reading from a connection which does not supply a 'gzip' magic # > header is equivalent to reading from the original connection conn <- gzcon(file(bundle, open = "rb", raw = TRUE)) on.exit(close(conn)) - + # The default pax header is 512 bytes long and the first pax extended header # with the comment should be 51 bytes long # `52 comment=` (11 chars) + 40 byte SHA1 hash len <- 0x200 + 0x33 res <- rawToChar(readBin(conn, "raw", n = len)[0x201:len]) - + if (grepl("^52 comment=", res)) { sub("52 comment=", "", res) } else { NULL } } - + renv_bootstrap_install <- function(version, tarball, library) { - + # attempt to install it into project library dir.create(library, showWarnings = FALSE, recursive = TRUE) output <- renv_bootstrap_install_impl(library, tarball) - + # check for successful install status <- attr(output, "status") if (is.null(status) || 
identical(status, 0L)) return(status) - + # an error occurred; report it header <- "installation of renv failed" lines <- paste(rep.int("=", nchar(header)), collapse = "") text <- paste(c(header, lines, output), collapse = "\n") stop(text) - + } - + renv_bootstrap_install_impl <- function(library, tarball) { - + # invoke using system2 so we can capture and report output bin <- R.home("bin") exe <- if (Sys.info()[["sysname"]] == "Windows") "R.exe" else "R" R <- file.path(bin, exe) - + args <- c( "--vanilla", "CMD", "INSTALL", "--no-multiarch", "-l", shQuote(path.expand(library)), shQuote(path.expand(tarball)) ) - + system2(R, args, stdout = TRUE, stderr = TRUE) - + } - + renv_bootstrap_platform_prefix <- function() { - + # construct version prefix version <- paste(R.version$major, R.version$minor, sep = ".") prefix <- paste("R", numeric_version(version)[1, 1:2], sep = "-") - + # include SVN revision for development versions of R # (to avoid sharing platform-specific artefacts with released versions of R) devel <- identical(R.version[["status"]], "Under development (unstable)") || identical(R.version[["nickname"]], "Unsuffered Consequences") - + if (devel) prefix <- paste(prefix, R.version[["svn rev"]], sep = "-r") - + # build list of path components components <- c(prefix, R.version$platform) - + # include prefix if provided by user prefix <- renv_bootstrap_platform_prefix_impl() if (!is.na(prefix) && nzchar(prefix)) components <- c(prefix, components) - + # build prefix paste(components, collapse = "/") - + } - + renv_bootstrap_platform_prefix_impl <- function() { - + # if an explicit prefix has been supplied, use it prefix <- Sys.getenv("RENV_PATHS_PREFIX", unset = NA) if (!is.na(prefix)) return(prefix) - + # if the user has requested an automatic prefix, generate it auto <- Sys.getenv("RENV_PATHS_PREFIX_AUTO", unset = NA) + if (is.na(auto) && getRversion() >= "4.4.0") + auto <- "TRUE" + if (auto %in% c("TRUE", "True", "true", "1")) 
return(renv_bootstrap_platform_prefix_auto()) - + # empty string on failure "" - + } - + renv_bootstrap_platform_prefix_auto <- function() { - + prefix <- tryCatch(renv_bootstrap_platform_os(), error = identity) if (inherits(prefix, "error") || prefix %in% "unknown") { - + msg <- paste( "failed to infer current operating system", "please file a bug report at https://github.com/rstudio/renv/issues", sep = "; " ) - + warning(msg) - + } - + prefix - + } - + renv_bootstrap_platform_os <- function() { - + sysinfo <- Sys.info() sysname <- sysinfo[["sysname"]] - + # handle Windows + macOS up front if (sysname == "Windows") return("windows") else if (sysname == "Darwin") return("macos") - + # check for os-release files for (file in c("/etc/os-release", "/usr/lib/os-release")) if (file.exists(file)) return(renv_bootstrap_platform_os_via_os_release(file, sysinfo)) - + # check for redhat-release files if (file.exists("/etc/redhat-release")) return(renv_bootstrap_platform_os_via_redhat_release()) - + "unknown" - + } - + renv_bootstrap_platform_os_via_os_release <- function(file, sysinfo) { - + # read /etc/os-release release <- utils::read.table( file = file, @@ -693,13 +713,13 @@ local({ comment.char = "#", stringsAsFactors = FALSE ) - + vars <- as.list(release$Value) names(vars) <- release$Key - + # get os name os <- tolower(sysinfo[["sysname"]]) - + # read id id <- "unknown" for (field in c("ID", "ID_LIKE")) { @@ -708,7 +728,7 @@ local({ break } } - + # read version version <- "unknown" for (field in c("UBUNTU_CODENAME", "VERSION_CODENAME", "VERSION_ID", "BUILD_ID")) { @@ -717,17 +737,17 @@ local({ break } } - + # join together paste(c(os, id, version), collapse = "-") - + } - + renv_bootstrap_platform_os_via_redhat_release <- function() { - + # read /etc/redhat-release contents <- readLines("/etc/redhat-release", warn = FALSE) - + # infer id id <- if (grepl("centos", contents, ignore.case = TRUE)) "centos" @@ -735,73 +755,73 @@ local({ "redhat" else "unknown" - + # try to 
find a version component (very hacky) version <- "unknown" - + parts <- strsplit(contents, "[[:space:]]")[[1L]] for (part in parts) { - + nv <- tryCatch(numeric_version(part), error = identity) if (inherits(nv, "error")) next - + version <- nv[1, 1] break - + } - + paste(c("linux", id, version), collapse = "-") - + } - + renv_bootstrap_library_root_name <- function(project) { - + # use project name as-is if requested asis <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT_ASIS", unset = "FALSE") if (asis) return(basename(project)) - + # otherwise, disambiguate based on project's path id <- substring(renv_bootstrap_hash_text(project), 1L, 8L) paste(basename(project), id, sep = "-") - + } - + renv_bootstrap_library_root <- function(project) { - + prefix <- renv_bootstrap_profile_prefix() - + path <- Sys.getenv("RENV_PATHS_LIBRARY", unset = NA) if (!is.na(path)) return(paste(c(path, prefix), collapse = "/")) - + path <- renv_bootstrap_library_root_impl(project) if (!is.null(path)) { name <- renv_bootstrap_library_root_name(project) return(paste(c(path, prefix, name), collapse = "/")) } - + renv_bootstrap_paths_renv("library", project = project) - + } - + renv_bootstrap_library_root_impl <- function(project) { - + root <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT", unset = NA) if (!is.na(root)) return(root) - + type <- renv_bootstrap_project_type(project) if (identical(type, "package")) { userdir <- renv_bootstrap_user_dir() return(file.path(userdir, "library")) } - + } - + renv_bootstrap_validate_version <- function(version, description = NULL) { - + # resolve description file # # avoid passing lib.loc to `packageDescription()` below, since R will @@ -809,122 +829,121 @@ local({ # this function should only be called after 'renv' is loaded # https://github.com/rstudio/renv/issues/1625 description <- description %||% packageDescription("renv") - + # check whether requested version 'version' matches loaded version of renv sha <- attr(version, "sha", exact = TRUE) valid <- if 
(!is.null(sha)) renv_bootstrap_validate_version_dev(sha, description) else renv_bootstrap_validate_version_release(version, description) - + if (valid) return(TRUE) - + # the loaded version of renv doesn't match the requested version; # give the user instructions on how to proceed - remote <- if (!is.null(description[["RemoteSha"]])) { + dev <- identical(description[["RemoteType"]], "github") + remote <- if (dev) paste("rstudio/renv", description[["RemoteSha"]], sep = "@") - } else { + else paste("renv", description[["Version"]], sep = "@") - } - + # display both loaded version + sha if available friendly <- renv_bootstrap_version_friendly( version = description[["Version"]], - sha = description[["RemoteSha"]] - ) - - fmt <- paste( - "renv %1$s was loaded from project library, but this project is configured to use renv %2$s.", - "- Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile.", - "- Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library.", - sep = "\n" + sha = if (dev) description[["RemoteSha"]] ) + + fmt <- heredoc(" + renv %1$s was loaded from project library, but this project is configured to use renv %2$s. + - Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile. + - Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library. 
+ ") catf(fmt, friendly, renv_bootstrap_version_friendly(version), remote) - + FALSE - + } - + renv_bootstrap_validate_version_dev <- function(version, description) { expected <- description[["RemoteSha"]] is.character(expected) && startswith(expected, version) } - + renv_bootstrap_validate_version_release <- function(version, description) { expected <- description[["Version"]] is.character(expected) && identical(expected, version) } - + renv_bootstrap_hash_text <- function(text) { - + hashfile <- tempfile("renv-hash-") on.exit(unlink(hashfile), add = TRUE) - + writeLines(text, con = hashfile) tools::md5sum(hashfile) - + } - + renv_bootstrap_load <- function(project, libpath, version) { - + # try to load renv from the project library if (!requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) return(FALSE) - + # warn if the version of renv loaded does not match renv_bootstrap_validate_version(version) - + # execute renv load hooks, if any hooks <- getHook("renv::autoload") for (hook in hooks) if (is.function(hook)) tryCatch(hook(), error = warnify) - + # load the project renv::load(project) - + TRUE - + } - + renv_bootstrap_profile_load <- function(project) { - + # if RENV_PROFILE is already set, just use that profile <- Sys.getenv("RENV_PROFILE", unset = NA) if (!is.na(profile) && nzchar(profile)) return(profile) - + # check for a profile file (nothing to do if it doesn't exist) path <- renv_bootstrap_paths_renv("profile", profile = FALSE, project = project) if (!file.exists(path)) return(NULL) - + # read the profile, and set it if it exists contents <- readLines(path, warn = FALSE) if (length(contents) == 0L) return(NULL) - + # set RENV_PROFILE profile <- contents[[1L]] if (!profile %in% c("", "default")) Sys.setenv(RENV_PROFILE = profile) - + profile - + } - + renv_bootstrap_profile_prefix <- function() { profile <- renv_bootstrap_profile_get() if (!is.null(profile)) return(file.path("profiles", profile, "renv")) } - + renv_bootstrap_profile_get <- 
function() { profile <- Sys.getenv("RENV_PROFILE", unset = "") renv_bootstrap_profile_normalize(profile) } - + renv_bootstrap_profile_set <- function(profile) { profile <- renv_bootstrap_profile_normalize(profile) if (is.null(profile)) @@ -932,25 +951,25 @@ local({ else Sys.setenv(RENV_PROFILE = profile) } - + renv_bootstrap_profile_normalize <- function(profile) { - + if (is.null(profile) || profile %in% c("", "default")) return(NULL) - + profile - + } - + renv_bootstrap_path_absolute <- function(path) { - + substr(path, 1L, 1L) %in% c("~", "/", "\\") || ( substr(path, 1L, 1L) %in% c(letters, LETTERS) && - substr(path, 2L, 3L) %in% c(":/", ":\\") + substr(path, 2L, 3L) %in% c(":/", ":\\") ) - + } - + renv_bootstrap_paths_renv <- function(..., profile = TRUE, project = NULL) { renv <- Sys.getenv("RENV_PATHS_RENV", unset = "renv") root <- if (renv_bootstrap_path_absolute(renv)) NULL else project @@ -958,50 +977,50 @@ local({ components <- c(root, renv, prefix, ...) paste(components, collapse = "/") } - + renv_bootstrap_project_type <- function(path) { - + descpath <- file.path(path, "DESCRIPTION") if (!file.exists(descpath)) return("unknown") - + desc <- tryCatch( read.dcf(descpath, all = TRUE), error = identity ) - + if (inherits(desc, "error")) return("unknown") - + type <- desc$Type if (!is.null(type)) return(tolower(type)) - + package <- desc$Package if (!is.null(package)) return("package") - + "unknown" - + } - + renv_bootstrap_user_dir <- function() { dir <- renv_bootstrap_user_dir_impl() path.expand(chartr("\\", "/", dir)) } - + renv_bootstrap_user_dir_impl <- function() { - + # use local override if set override <- getOption("renv.userdir.override") if (!is.null(override)) return(override) - + # use R_user_dir if available tools <- asNamespace("tools") if (is.function(tools$R_user_dir)) return(tools$R_user_dir("renv", "cache")) - + # try using our own backfill for older versions of R envvars <- c("R_USER_CACHE_DIR", "XDG_CACHE_HOME") for (envvar in envvars) 
{ @@ -1009,7 +1028,7 @@ local({ if (!is.na(root)) return(file.path(root, "R/renv")) } - + # use platform-specific default fallbacks if (Sys.info()[["sysname"]] == "Windows") file.path(Sys.getenv("LOCALAPPDATA"), "R/cache/R/renv") @@ -1017,109 +1036,109 @@ local({ "~/Library/Caches/org.R-project.R/R/renv" else "~/.cache/R/renv" - + } - + renv_bootstrap_version_friendly <- function(version, shafmt = NULL, sha = NULL) { sha <- sha %||% attr(version, "sha", exact = TRUE) parts <- c(version, sprintf(shafmt %||% " [sha: %s]", substring(sha, 1L, 7L))) paste(parts, collapse = "") } - + renv_bootstrap_exec <- function(project, libpath, version) { if (!renv_bootstrap_load(project, libpath, version)) renv_bootstrap_run(version, libpath) } - + renv_bootstrap_run <- function(version, libpath) { - + # perform bootstrap bootstrap(version, libpath) - + # exit early if we're just testing bootstrap if (!is.na(Sys.getenv("RENV_BOOTSTRAP_INSTALL_ONLY", unset = NA))) return(TRUE) - + # try again to load if (requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) { return(renv::load(project = getwd())) } - + # failed to download or load renv; warn the user msg <- c( "Failed to find an renv installation: the project will not be loaded.", "Use `renv::activate()` to re-initialize the project." ) - + warning(paste(msg, collapse = "\n"), call. 
= FALSE) - + } - + renv_json_read <- function(file = NULL, text = NULL) { - + jlerr <- NULL - + # if jsonlite is loaded, use that instead if ("jsonlite" %in% loadedNamespaces()) { - + json <- tryCatch(renv_json_read_jsonlite(file, text), error = identity) if (!inherits(json, "error")) return(json) - + jlerr <- json - + } - + # otherwise, fall back to the default JSON reader json <- tryCatch(renv_json_read_default(file, text), error = identity) if (!inherits(json, "error")) return(json) - + # report an error if (!is.null(jlerr)) stop(jlerr) else stop(json) - + } - + renv_json_read_jsonlite <- function(file = NULL, text = NULL) { text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") jsonlite::fromJSON(txt = text, simplifyVector = FALSE) } - + renv_json_read_default <- function(file = NULL, text = NULL) { - + # find strings in the JSON text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") pattern <- '["](?:(?:\\\\.)|(?:[^"\\\\]))*?["]' locs <- gregexpr(pattern, text, perl = TRUE)[[1]] - + # if any are found, replace them with placeholders replaced <- text strings <- character() replacements <- character() - + if (!identical(c(locs), -1L)) { - + # get the string values starts <- locs ends <- locs + attr(locs, "match.length") - 1L strings <- substring(text, starts, ends) - + # only keep those requiring escaping strings <- grep("[[\\]{}:]", strings, perl = TRUE, value = TRUE) - + # compute replacements replacements <- sprintf('"\032%i\032"', seq_along(strings)) - + # replace the strings mapply(function(string, replacement) { replaced <<- sub(string, replacement, replaced, fixed = TRUE) }, strings, replacements) - + } - + # transform the JSON into something the R parser understands transformed <- replaced transformed <- gsub("{}", "`names<-`(list(), character())", transformed, fixed = TRUE) @@ -1127,38 +1146,38 @@ local({ transformed <- gsub("[]}]", ")", transformed, perl = TRUE) transformed <- gsub(":", "=", transformed, fixed = TRUE) 
text <- paste(transformed, collapse = "\n") - + # parse it json <- parse(text = text, keep.source = FALSE, srcfile = NULL)[[1L]] - + # construct map between source strings, replaced strings map <- as.character(parse(text = strings)) names(map) <- as.character(parse(text = replacements)) - + # convert to list map <- as.list(map) - + # remap strings in object remapped <- renv_json_read_remap(json, map) - + # evaluate eval(remapped, envir = baseenv()) - + } - + renv_json_read_remap <- function(json, map) { - + # fix names if (!is.null(names(json))) { lhs <- match(names(json), names(map), nomatch = 0L) rhs <- match(names(map), names(json), nomatch = 0L) names(json)[rhs] <- map[lhs] } - + # fix values if (is.character(json)) return(map[[json]] %||% json) - + # handle true, false, null if (is.name(json)) { text <- as.character(json) @@ -1169,16 +1188,16 @@ local({ else if (text == "null") return(NULL) } - + # recurse if (is.recursive(json)) { for (i in seq_along(json)) { json[i] <- list(renv_json_read_remap(json[[i]], map)) } } - + json - + } # load the renv profile, if any From f11cd227a5fc477111693dcf2ea2f1ee919dc8a7 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 13:36:30 +0100 Subject: [PATCH 08/19] update renv package versions --- renv/profiles/lesson-requirements/renv.lock | 274 ++++++++++++-------- 1 file changed, 160 insertions(+), 114 deletions(-) diff --git a/renv/profiles/lesson-requirements/renv.lock b/renv/profiles/lesson-requirements/renv.lock index d4fd3c5..20f3b10 100644 --- a/renv/profiles/lesson-requirements/renv.lock +++ b/renv/profiles/lesson-requirements/renv.lock @@ -1,26 +1,26 @@ { "R": { - "Version": "4.3.2", + "Version": "4.4.0", "Repositories": [ { "Name": "BioCsoft", - "URL": "https://bioconductor.org/packages/3.17/bioc" + "URL": "https://bioconductor.org/packages/3.19/bioc" }, { "Name": "BioCann", - "URL": "https://bioconductor.org/packages/3.17/data/annotation" + "URL": 
"https://bioconductor.org/packages/3.19/data/annotation" }, { "Name": "BioCexp", - "URL": "https://bioconductor.org/packages/3.17/data/experiment" + "URL": "https://bioconductor.org/packages/3.19/data/experiment" }, { "Name": "BioCworkflows", - "URL": "https://bioconductor.org/packages/3.17/workflows" + "URL": "https://bioconductor.org/packages/3.19/workflows" }, { "Name": "BioCbooks", - "URL": "https://bioconductor.org/packages/3.17/books" + "URL": "https://bioconductor.org/packages/3.19/books" }, { "Name": "carpentries", @@ -37,7 +37,7 @@ ] }, "Bioconductor": { - "Version": "3.17" + "Version": "3.19" }, "Packages": { "BH": { @@ -49,10 +49,12 @@ }, "BSgenome": { "Package": "BSgenome", - "Version": "1.68.0", + "Version": "1.72.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", + "BiocIO", "Biostrings", "GenomeInfoDb", "GenomicRanges", @@ -67,7 +69,7 @@ "stats", "utils" ], - "Hash": "9d7c3c9b904c28bc971ed29d3f3b445e" + "Hash": "9e00bf24b78d10f32cb8e1dceb5f87ff" }, "BSgenome.Hsapiens.UCSC.hg38": { "Package": "BSgenome.Hsapiens.UCSC.hg38", @@ -94,20 +96,22 @@ }, "Biobase": { "Package": "Biobase", - "Version": "2.60.0", + "Version": "2.64.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "R", "methods", "utils" ], - "Hash": "ed269b250f5844d54dfdc7e749f901aa" + "Hash": "9bc4cabd3bfda461409172213d932813" }, "BiocGenerics": { "Package": "BiocGenerics", - "Version": "0.46.0", + "Version": "0.49.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "R", "graphics", @@ -115,12 +119,13 @@ "stats", "utils" ], - "Hash": "c179ae59955c36f5d0068ed29ce832f7" + "Hash": "bb0e8378090c72c1fe8721fc34a4f7cb" }, "BiocIO": { "Package": "BiocIO", - "Version": "1.10.0", + "Version": "1.14.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "R", @@ -128,7 +133,7 @@ "methods", "tools" ], - "Hash": 
"a236af72143f9023b2f9f5f8baa81712" + "Hash": "f97a7ef01d364cf20d1946d43a3d526f" }, "BiocManager": { "Package": "BiocManager", @@ -142,8 +147,9 @@ }, "BiocParallel": { "Package": "BiocParallel", - "Version": "1.34.2", + "Version": "1.38.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BH", "R", @@ -156,12 +162,13 @@ "stats", "utils" ], - "Hash": "84347b6a8118ba2182b148298b118f0e" + "Hash": "7b6e79f86e3d1c23f62c5e2052e848d4" }, "BiocStyle": { "Package": "BiocStyle", - "Version": "2.28.1", + "Version": "2.32.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocManager", "bookdown", @@ -171,21 +178,23 @@ "utils", "yaml" ], - "Hash": "c1bc4c0cdef7dd5756ab9e8d5016b52e" + "Hash": "beadb5ac6d6b64dc6153cb300dd063ef" }, "BiocVersion": { "Package": "BiocVersion", - "Version": "3.17.1", + "Version": "3.19.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "R" ], - "Hash": "f7c0d5521799b7b0d0a211143ed0bfcb" + "Hash": "b892e27fc9659a4c8f8787d34c37b8b2" }, "Biostrings": { "Package": "Biostrings", - "Version": "2.68.1", + "Version": "2.71.5", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "GenomeInfoDb", @@ -200,12 +209,13 @@ "stats", "utils" ], - "Hash": "838eef43ab267a7409d68ba0fa8da5fa" + "Hash": "da1575dfeace212da5adae444704d212" }, "DelayedArray": { "Package": "DelayedArray", - "Version": "0.26.7", + "Version": "0.30.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "IRanges", @@ -214,43 +224,45 @@ "R", "S4Arrays", "S4Vectors", + "SparseArray", "methods", "stats", "stats4" ], - "Hash": "ef6ff3e15ce624118e6cf8151e58e38c" + "Hash": "395472c65cd9d606a1a345687102f299" }, "GenomeInfoDb": { "Package": "GenomeInfoDb", - "Version": "1.36.4", + "Version": "1.39.10", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", 
"GenomeInfoDbData", "IRanges", "R", - "RCurl", "S4Vectors", "methods", "stats", "stats4", "utils" ], - "Hash": "1c6756527d78e8135d34662d2e1d54ec" + "Hash": "86cc7f0a5b83be019673ef3a508dda2c" }, "GenomeInfoDbData": { "Package": "GenomeInfoDbData", - "Version": "1.2.10", + "Version": "1.2.12", "Source": "Bioconductor", "Requirements": [ "R" ], - "Hash": "56294b21068b8cb5db1c47d0a42f307b" + "Hash": "c3c792a7b7f2677be56e8632c5b7543d" }, "GenomicAlignments": { "Package": "GenomicAlignments", - "Version": "1.36.0", + "Version": "1.40.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "BiocParallel", @@ -266,12 +278,13 @@ "stats", "utils" ], - "Hash": "21b603ae11c96c397db2231103e4d430" + "Hash": "e539709764587c581b31e446dc84d7b8" }, "GenomicRanges": { "Package": "GenomicRanges", - "Version": "1.52.1", + "Version": "1.55.4", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "GenomeInfoDb", @@ -284,12 +297,13 @@ "stats4", "utils" ], - "Hash": "c9471497a07953ded9b8889879c079e9" + "Hash": "f0957f9dcf1bdf2d301d82e5bea6e7ca" }, "IRanges": { "Package": "IRanges", - "Version": "2.34.1", + "Version": "2.37.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "R", @@ -299,11 +313,11 @@ "stats4", "utils" ], - "Hash": "18939552437a335b59fb381e508275d6" + "Hash": "4adff00e89fd9b182216f800f61a8943" }, "Matrix": { "Package": "Matrix", - "Version": "1.6-5", + "Version": "1.7-0", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -316,17 +330,18 @@ "stats", "utils" ], - "Hash": "8c7115cd3a0e048bda2a7cd110549f7a" + "Hash": "1920b2f11133b12350024297d8a4ff4a" }, "MatrixGenerics": { "Package": "MatrixGenerics", - "Version": "1.12.3", + "Version": "1.16.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "matrixStats", "methods" ], - "Hash": "10a6bd0dcabaeede87616e4465b6ac6f" + "Hash": 
"152dbbcde6a9a7c7f3beef79b68cd76a" }, "R6": { "Package": "R6", @@ -340,7 +355,7 @@ }, "RCurl": { "Package": "RCurl", - "Version": "1.98-1.14", + "Version": "1.98-1.16", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -348,18 +363,18 @@ "bitops", "methods" ], - "Hash": "47f648d288079d0c696804ad4e55197e" + "Hash": "ddbdf53d15b47be4407ede6914f56fbb" }, "Rcpp": { "Package": "Rcpp", - "Version": "1.0.12", + "Version": "1.0.13", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "methods", "utils" ], - "Hash": "5ea2700d21e038ace58269ecdbeb9ec0" + "Hash": "f27411eb6d9c3dada5edd444b8416675" }, "RefManageR": { "Package": "RefManageR", @@ -383,17 +398,20 @@ }, "Rhtslib": { "Package": "Rhtslib", - "Version": "2.2.0", + "Version": "3.0.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ + "tools", "zlibbioc" ], - "Hash": "225ae2e63b9c94991ee76bd33601b275" + "Hash": "5d6514cd44a0106581e3310f3972a82e" }, "Rsamtools": { "Package": "Rsamtools", - "Version": "2.16.0", + "Version": "2.20.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "BiocParallel", @@ -411,12 +429,13 @@ "utils", "zlibbioc" ], - "Hash": "e84f23bbbd554c5354471458a687046f" + "Hash": "9762f24dcbdbd1626173c516bb64792c" }, "S4Arrays": { "Package": "S4Arrays", - "Version": "1.0.6", + "Version": "1.4.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "IRanges", @@ -428,12 +447,13 @@ "methods", "stats" ], - "Hash": "2b40d107b4a6fbd3f0cc81214d0b2891" + "Hash": "deeed4802c5132e88f24a432a1caf5e0" }, "S4Vectors": { "Package": "S4Vectors", - "Version": "0.38.2", + "Version": "0.41.5", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "R", @@ -442,12 +462,34 @@ "stats4", "utils" ], - "Hash": "338207894998073e7823706f886a1386" + "Hash": "9912dc4d5ed3e8d92d15573b57e9a6c8" + }, + "SparseArray": { + "Package": 
"SparseArray", + "Version": "1.4.8", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "BiocGenerics", + "IRanges", + "Matrix", + "MatrixGenerics", + "R", + "S4Arrays", + "S4Vectors", + "XVector", + "matrixStats", + "methods", + "stats", + "utils" + ], + "Hash": "97f70ff11c14edd379ee2429228cbb60" }, "SummarizedExperiment": { "Package": "SummarizedExperiment", - "Version": "1.30.2", + "Version": "1.34.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "Biobase", "BiocGenerics", @@ -465,11 +507,11 @@ "tools", "utils" ], - "Hash": "6af56fbcb57deb9b094a352beee9a202" + "Hash": "2f6c8cc972ed6aee07c96e3dff729d15" }, "XML": { "Package": "XML", - "Version": "3.99-0.16.1", + "Version": "3.99-0.17", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -477,12 +519,13 @@ "methods", "utils" ], - "Hash": "da3098169c887914551b607c66fe2a28" + "Hash": "bc2a8a1139d8d4bd9c46086708945124" }, "XVector": { "Package": "XVector", - "Version": "0.40.0", + "Version": "0.43.1", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "IRanges", @@ -493,7 +536,7 @@ "utils", "zlibbioc" ], - "Hash": "cc3048ef590a16ff55a5e3149d5e060b" + "Hash": "24224fef455e6f52ab17348e95fbea72" }, "abind": { "Package": "abind", @@ -511,7 +554,7 @@ "Package": "askpass", "Version": "1.2.0", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "sys" ], @@ -519,13 +562,13 @@ }, "backports": { "Package": "backports", - "Version": "1.4.1", + "Version": "1.5.0", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "R" ], - "Hash": "c39fbec8a30d23e721980b8afb31984c" + "Hash": "e1e1b9d75c37401117b636b7ae50827a" }, "base64enc": { "Package": "base64enc", @@ -551,14 +594,14 @@ }, "bitops": { "Package": "bitops", - "Version": "1.0-7", + "Version": "1.0-8", "Source": "Repository", - "Repository": "RSPM", - "Hash": "b7d8d8ee39869c18d8846a184dd8a1af" + 
"Repository": "CRAN", + "Hash": "da69e6b6f8feebec0827205aad3fdbd8" }, "bookdown": { "Package": "bookdown", - "Version": "0.39", + "Version": "0.40", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -571,11 +614,11 @@ "xfun", "yaml" ], - "Hash": "cb4f7066855b6f936e8d25edc9a9cff9" + "Hash": "896a79478a50c78fb035a37148638f4e" }, "bslib": { "Package": "bslib", - "Version": "0.7.0", + "Version": "0.8.0", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -593,29 +636,29 @@ "rlang", "sass" ], - "Hash": "8644cc53f43828f19133548195d7e59e" + "Hash": "b299c6741ca9746fb227debcb0f9fb6c" }, "cachem": { "Package": "cachem", - "Version": "1.0.8", + "Version": "1.1.0", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "fastmap", "rlang" ], - "Hash": "c35768291560ce302c0a6589f92e837d" + "Hash": "cd9a672193789068eb5a2aad65a0dedf" }, "cli": { "Package": "cli", - "Version": "3.6.2", + "Version": "3.6.3", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "R", "utils" ], - "Hash": "1216ac65ac55ec0058a6f75d7ca0fd52" + "Hash": "b21916dd77a27642b447374a5d30ecf3" }, "codetools": { "Package": "codetools", @@ -653,7 +696,7 @@ "Package": "curl", "Version": "5.2.1", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "R" ], @@ -661,38 +704,38 @@ }, "digest": { "Package": "digest", - "Version": "0.6.35", + "Version": "0.6.36", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "R", "utils" ], - "Hash": "698ece7ba5a4fa4559e3d537e7ec3d31" + "Hash": "fd6824ad91ede64151e93af67df6376b" }, "evaluate": { "Package": "evaluate", - "Version": "0.23", + "Version": "0.24.0", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "R", "methods" ], - "Hash": "daf4a1246be12c1fa8c7705a0935c1a0" + "Hash": "a1066cbc05caee9a4bf6d90f194ff4da" }, "fastmap": { "Package": "fastmap", - "Version": "1.1.1", + "Version": "1.2.0", "Source": "Repository", "Repository": "RSPM", - 
"Hash": "f7736a18de97dea803bde0a2daaafb27" + "Hash": "aa5e1cd11c2d15497494c5292d7ffcc8" }, "fontawesome": { "Package": "fontawesome", "Version": "0.5.2", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "htmltools", @@ -714,7 +757,7 @@ "Package": "fs", "Version": "1.6.4", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "methods" @@ -759,7 +802,7 @@ "Package": "glue", "Version": "1.7.0", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "methods" @@ -768,20 +811,20 @@ }, "highr": { "Package": "highr", - "Version": "0.10", + "Version": "0.11", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "R", "xfun" ], - "Hash": "06230136b2d2b9ba5805e1963fa6e890" + "Hash": "d65ba49117ca223614f71b60d85b8ab7" }, "htmltools": { "Package": "htmltools", "Version": "0.5.8.1", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "base64enc", @@ -812,7 +855,7 @@ "Package": "jquerylib", "Version": "0.1.4", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "htmltools" ], @@ -822,7 +865,7 @@ "Package": "jsonlite", "Version": "1.8.8", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "methods" ], @@ -830,9 +873,9 @@ }, "knitr": { "Package": "knitr", - "Version": "1.46", + "Version": "1.48", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "evaluate", @@ -842,7 +885,7 @@ "xfun", "yaml" ], - "Hash": "6e008ab1d696a5283c79765fa7b56b47" + "Hash": "acf380f300c721da9fde7df115a5f86f" }, "lambda.r": { "Package": "lambda.r", @@ -859,7 +902,7 @@ "Package": "lattice", "Version": "0.22-6", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "R", "grDevices", @@ -874,7 +917,7 @@ "Package": "lifecycle", "Version": "1.0.4", 
"Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", "cli", @@ -939,13 +982,13 @@ }, "openssl": { "Package": "openssl", - "Version": "2.1.2", + "Version": "2.2.0", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "askpass" ], - "Hash": "ea2475b073243d9d338aa8f086ce973e" + "Hash": "2bcca3848e4734eb3b16103bc9aa4b8e" }, "plyr": { "Package": "plyr", @@ -1006,18 +1049,18 @@ }, "rlang": { "Package": "rlang", - "Version": "1.1.3", + "Version": "1.1.4", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "R", "utils" ], - "Hash": "42548638fae05fd9a9b5f3f437fbbbe2" + "Hash": "3eec01f8b1dee337674b2e34ab1f9bc1" }, "rmarkdown": { "Package": "rmarkdown", - "Version": "2.26", + "Version": "2.27", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -1036,12 +1079,13 @@ "xfun", "yaml" ], - "Hash": "9b148e7f95d33aac01f31282d49e4f44" + "Hash": "27f9502e1cdbfa195f94e03b0f517484" }, "rtracklayer": { "Package": "rtracklayer", - "Version": "1.60.1", + "Version": "1.64.0", "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", "Requirements": [ "BiocGenerics", "BiocIO", @@ -1051,23 +1095,24 @@ "GenomicRanges", "IRanges", "R", - "RCurl", "Rsamtools", "S4Vectors", "XML", "XVector", + "curl", + "httr", "methods", "restfulr", "tools", "zlibbioc" ], - "Hash": "6732db89601d93a1697d8c280fa96444" + "Hash": "3d6f004fce582bd7d68e2e18d44abbc1" }, "sass": { "Package": "sass", "Version": "0.4.9", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R6", "fs", @@ -1138,19 +1183,19 @@ }, "tinytex": { "Package": "tinytex", - "Version": "0.51", + "Version": "0.52", "Source": "Repository", "Repository": "CRAN", "Requirements": [ "xfun" ], - "Hash": "d44e2fcd2e4e076f0aac540208559d1d" + "Hash": "cfbad971a71f0e27cec22e544a08bc3b" }, "vctrs": { "Package": "vctrs", "Version": "0.6.5", "Source": "Repository", - "Repository": "CRAN", + "Repository": "RSPM", "Requirements": [ "R", 
"cli", @@ -1162,7 +1207,7 @@ }, "xfun": { "Package": "xfun", - "Version": "0.44", + "Version": "0.46", "Source": "Repository", "Repository": "CRAN", "Requirements": [ @@ -1170,13 +1215,13 @@ "stats", "tools" ], - "Hash": "317a0538d32f4a009658bcedb7923f4b" + "Hash": "00ce32f398db0415dde61abfef11300c" }, "xml2": { "Package": "xml2", "Version": "1.3.6", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "R", "cli", @@ -1187,16 +1232,17 @@ }, "yaml": { "Package": "yaml", - "Version": "2.3.8", + "Version": "2.3.10", "Source": "Repository", "Repository": "CRAN", - "Hash": "29240487a071f535f5e5d5a323b7afbd" + "Hash": "51dab85c6c98e50a18d7551e9d49f76c" }, "zlibbioc": { "Package": "zlibbioc", - "Version": "1.46.0", + "Version": "1.49.3", "Source": "Bioconductor", - "Hash": "20158ef5adb641f0b4e8d63136f0e870" + "Repository": "Bioconductor 3.19", + "Hash": "6dd05467a4736905623634dc1f145da6" } } } From e38b072d8e06620e068a8410267c7b57f1e639b3 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 14:42:23 +0100 Subject: [PATCH 09/19] extend episode and update renv --- episodes/08-annotations.Rmd | 55 +- renv/activate.R | 586 ++++++++++---------- renv/profiles/lesson-requirements/renv.lock | 410 ++++++++++++++ 3 files changed, 754 insertions(+), 297 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index a35397d..7ba1842 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -57,18 +57,65 @@ Packages dedicated to query gene annotations exist in the 'Software' and 'Annotation' categories of the Bioconductor [biocViews][biocviews], according to their nature. 
-In the 'Software' section, we find packages that do not _contain_ gene -annotations, -but rather dynamically _query_ them from online resources +In the 'Software' section, we find packages that do not actually contain gene +annotations, but rather dynamically _query_ them from online resources (e.g.,[Ensembl BioMart][biomart-ensembl]). One such Bioconductor package is `r BiocStyle::Biocpkg("biomaRt")`. -Instead, in the 'Annotation' section, we find packages that _contain_ +In the 'Annotation' section, by contrast, we find packages that do contain annotations. Examples include `r BiocStyle::Biocpkg("org.Hs.eg.db")`, `r BiocStyle::Biocpkg("EnsDb.Hsapiens.v86")`, and `r BiocStyle::Biocpkg("TxDb.Hsapiens.UCSC.hg38.knownGene")`. +In this episode, we will demonstrate the two approaches: + +* Querying annotations from the Ensembl BioMart API using the `r BiocStyle::Biocpkg("biomaRt")` package. +* Querying annotations from the `r BiocStyle::Biocpkg("org.Hs.eg.db")` annotation package. + +## Querying annotations from online resources + +### Pros and cons + +Pros: + +* Provides automatic access to the latest information. + +Cons: + +* Requires a live and stable internet connection. +* Results may not be reproducible if the resource is updated and archived + versions are not accessible. +* Data may be organised differently in each resource. +* Custom code may be needed to access and retrieve data from each resource. + +### The Ensembl BioMart + +[Ensembl BioMart][biomart-ensembl] is a robust data mining tool designed to +facilitate access to the vast array of biological data available through the +Ensembl project. + +The [BioMart web interface][biomart-ensembl] enables researchers to efficiently +query and retrieve data on genes, proteins, and other genomic features +across multiple species. +It allows users to filter, sort, and export data based on various attributes +such as gene IDs, chromosomal locations, and functional annotations. 
+ +### The Bioconductor `biomaRt` package + +`r BiocStyle::Biocpkg("biomaRt")` is a Bioconductor software package that +enables retrieval of large amounts of data from Ensembl BioMart tables +directly from an R session where those annotations can be used. + +Let us first load the package: + +```{r} +library(biomaRt) +``` + + + + [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html [biomart-ensembl]: https://www.ensembl.org/biomart/martview diff --git a/renv/activate.R b/renv/activate.R index 9039d08..d13f993 100644 --- a/renv/activate.R +++ b/renv/activate.R @@ -97,65 +97,65 @@ local({ if ("renv" %in% loadedNamespaces()) unloadNamespace("renv") - # load bootstrap tools + # load bootstrap tools `%||%` <- function(x, y) { if (is.null(x)) y else x } - + catf <- function(fmt, ..., appendLF = TRUE) { - + quiet <- getOption("renv.bootstrap.quiet", default = FALSE) if (quiet) return(invisible()) - + msg <- sprintf(fmt, ...) cat(msg, file = stdout(), sep = if (appendLF) "\n" else "") - + invisible(msg) - + } - + header <- function(label, - ..., - prefix = "#", - suffix = "-", - n = min(getOption("width"), 78)) + ..., + prefix = "#", + suffix = "-", + n = min(getOption("width"), 78)) { label <- sprintf(label, ...) 
n <- max(n - nchar(label) - nchar(prefix) - 2L, 8L) if (n <= 0) return(paste(prefix, label)) - + tail <- paste(rep.int(suffix, n), collapse = "") paste0(prefix, " ", label, " ", tail) - + } - + heredoc <- function(text, leave = 0) { - + # remove leading, trailing whitespace trimmed <- gsub("^\\s*\\n|\\n\\s*$", "", text) - + # split into lines lines <- strsplit(trimmed, "\n", fixed = TRUE)[[1L]] - + # compute common indent indent <- regexpr("[^[:space:]]", lines) common <- min(setdiff(indent, -1L)) - leave paste(substring(lines, common), collapse = "\n") - + } - + startswith <- function(string, prefix) { substring(string, 1, nchar(prefix)) == prefix } - + bootstrap <- function(version, library) { - + friendly <- renv_bootstrap_version_friendly(version) section <- header(sprintf("Bootstrapping renv %s", friendly)) catf(section) - + # attempt to download renv catf("- Downloading renv ... ", appendLF = FALSE) withCallingHandlers( @@ -167,7 +167,7 @@ local({ ) catf("OK") on.exit(unlink(tarball), add = TRUE) - + # now attempt to install catf("- Installing renv ... 
", appendLF = FALSE) withCallingHandlers( @@ -178,174 +178,174 @@ local({ } ) catf("OK") - + # add empty line to break up bootstrapping from normal output catf("") - + return(invisible()) } - + renv_bootstrap_tests_running <- function() { getOption("renv.tests.running", default = FALSE) } - + renv_bootstrap_repos <- function() { - + # get CRAN repository cran <- getOption("renv.repos.cran", "https://cloud.r-project.org") - + # check for repos override repos <- Sys.getenv("RENV_CONFIG_REPOS_OVERRIDE", unset = NA) if (!is.na(repos)) { - + # check for RSPM; if set, use a fallback repository for renv rspm <- Sys.getenv("RSPM", unset = NA) if (identical(rspm, repos)) repos <- c(RSPM = rspm, CRAN = cran) - + return(repos) - + } - + # check for lockfile repositories repos <- tryCatch(renv_bootstrap_repos_lockfile(), error = identity) if (!inherits(repos, "error") && length(repos)) return(repos) - + # retrieve current repos repos <- getOption("repos") - + # ensure @CRAN@ entries are resolved repos[repos == "@CRAN@"] <- cran - + # add in renv.bootstrap.repos if set default <- c(FALLBACK = "https://cloud.r-project.org") extra <- getOption("renv.bootstrap.repos", default = default) repos <- c(repos, extra) - + # remove duplicates that might've snuck in dupes <- duplicated(repos) | duplicated(names(repos)) repos[!dupes] - + } - + renv_bootstrap_repos_lockfile <- function() { - + lockpath <- Sys.getenv("RENV_PATHS_LOCKFILE", unset = "renv.lock") if (!file.exists(lockpath)) return(NULL) - + lockfile <- tryCatch(renv_json_read(lockpath), error = identity) if (inherits(lockfile, "error")) { warning(lockfile) return(NULL) } - + repos <- lockfile$R$Repositories if (length(repos) == 0) return(NULL) - + keys <- vapply(repos, `[[`, "Name", FUN.VALUE = character(1)) vals <- vapply(repos, `[[`, "URL", FUN.VALUE = character(1)) names(vals) <- keys - + return(vals) - + } - + renv_bootstrap_download <- function(version) { - + sha <- attr(version, "sha", exact = TRUE) - + methods <- if 
(!is.null(sha)) { - + # attempting to bootstrap a development version of renv c( function() renv_bootstrap_download_tarball(sha), function() renv_bootstrap_download_github(sha) ) - + } else { - + # attempting to bootstrap a release version of renv c( function() renv_bootstrap_download_tarball(version), function() renv_bootstrap_download_cran_latest(version), function() renv_bootstrap_download_cran_archive(version) ) - + } - + for (method in methods) { path <- tryCatch(method(), error = identity) if (is.character(path) && file.exists(path)) return(path) } - + stop("All download methods failed") - + } - + renv_bootstrap_download_impl <- function(url, destfile) { - + mode <- "wb" - + # https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17715 fixup <- Sys.info()[["sysname"]] == "Windows" && substring(url, 1L, 5L) == "file:" - + if (fixup) mode <- "w+b" - + args <- list( url = url, destfile = destfile, mode = mode, quiet = TRUE ) - + if ("headers" %in% names(formals(utils::download.file))) args$headers <- renv_bootstrap_download_custom_headers(url) - + do.call(utils::download.file, args) - + } - + renv_bootstrap_download_custom_headers <- function(url) { - + headers <- getOption("renv.download.headers") if (is.null(headers)) return(character()) - + if (!is.function(headers)) stopf("'renv.download.headers' is not a function") - + headers <- headers(url) if (length(headers) == 0L) return(character()) - + if (is.list(headers)) headers <- unlist(headers, recursive = FALSE, use.names = TRUE) - + ok <- is.character(headers) && is.character(names(headers)) && all(nzchar(names(headers))) - + if (!ok) stop("invocation of 'renv.download.headers' did not return a named character vector") - + headers - + } - + renv_bootstrap_download_cran_latest <- function(version) { - + spec <- renv_bootstrap_download_cran_latest_find(version) type <- spec$type repos <- spec$repos - + baseurl <- utils::contrib.url(repos = repos, type = type) ext <- if (identical(type, "source")) ".tar.gz" @@ 
-355,36 +355,36 @@ local({ ".tgz" name <- sprintf("renv_%s%s", version, ext) url <- paste(baseurl, name, sep = "/") - + destfile <- file.path(tempdir(), name) status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (inherits(status, "condition")) return(FALSE) - + # report success and return destfile - + } - + renv_bootstrap_download_cran_latest_find <- function(version) { - + # check whether binaries are supported on this system binary <- getOption("renv.bootstrap.binary", default = TRUE) && !identical(.Platform$pkgType, "source") && !identical(getOption("pkgType"), "source") && Sys.info()[["sysname"]] %in% c("Darwin", "Windows") - + types <- c(if (binary) "binary", "source") - + # iterate over types + repositories for (type in types) { for (repos in renv_bootstrap_repos()) { - + # retrieve package database db <- tryCatch( as.data.frame( @@ -393,89 +393,89 @@ local({ ), error = identity ) - + if (inherits(db, "error")) next - + # check for compatible entry entry <- db[db$Package %in% "renv" & db$Version %in% version, ] if (nrow(entry) == 0) next - + # found it; return spec to caller spec <- list(entry = entry, type = type, repos = repos) return(spec) - + } } - + # if we got here, we failed to find renv fmt <- "renv %s is not available from your declared package repositories" stop(sprintf(fmt, version)) - + } - + renv_bootstrap_download_cran_archive <- function(version) { - + name <- sprintf("renv_%s.tar.gz", version) repos <- renv_bootstrap_repos() urls <- file.path(repos, "src/contrib/Archive/renv", name) destfile <- file.path(tempdir(), name) - + for (url in urls) { - + status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (identical(status, 0L)) return(destfile) - + } - + return(FALSE) - + } - + renv_bootstrap_download_tarball <- function(version) { - + # if the user has provided the path to a tarball via # an environment variable, then use it tarball <- 
Sys.getenv("RENV_BOOTSTRAP_TARBALL", unset = NA) if (is.na(tarball)) return() - + # allow directories if (dir.exists(tarball)) { name <- sprintf("renv_%s.tar.gz", version) tarball <- file.path(tarball, name) } - + # bail if it doesn't exist if (!file.exists(tarball)) { - + # let the user know we weren't able to honour their request fmt <- "- RENV_BOOTSTRAP_TARBALL is set (%s) but does not exist." msg <- sprintf(fmt, tarball) warning(msg) - + # bail return() - + } - + catf("- Using local tarball '%s'.", tarball) tarball - + } - + renv_bootstrap_download_github <- function(version) { - + enabled <- Sys.getenv("RENV_BOOTSTRAP_FROM_GITHUB", unset = "TRUE") if (!identical(enabled, "TRUE")) return(FALSE) - + # prepare download options pat <- Sys.getenv("GITHUB_PAT") if (nzchar(Sys.which("curl")) && nzchar(pat)) { @@ -491,25 +491,25 @@ local({ options(download.file.method = "wget", download.file.extra = extra) on.exit(do.call(base::options, saved), add = TRUE) } - + url <- file.path("https://api.github.com/repos/rstudio/renv/tarball", version) name <- sprintf("renv_%s.tar.gz", version) destfile <- file.path(tempdir(), name) - + status <- tryCatch( renv_bootstrap_download_impl(url, destfile), condition = identity ) - + if (!identical(status, 0L)) return(FALSE) - + renv_bootstrap_download_augment(destfile) - + return(destfile) - + } - + # Add Sha to DESCRIPTION. This is stop gap until #890, after which we # can use renv::install() to fully capture metadata. 
renv_bootstrap_download_augment <- function(destfile) { @@ -517,13 +517,13 @@ local({ if (is.null(sha)) { return() } - + # Untar tempdir <- tempfile("renv-github-") on.exit(unlink(tempdir, recursive = TRUE), add = TRUE) untar(destfile, exdir = tempdir) pkgdir <- dir(tempdir, full.names = TRUE)[[1]] - + # Modify description desc_path <- file.path(pkgdir, "DESCRIPTION") desc_lines <- readLines(desc_path) @@ -537,173 +537,173 @@ local({ paste("RemoteSha: ", sha) ) writeLines(c(desc_lines[desc_lines != ""], remotes_fields), con = desc_path) - + # Re-tar local({ old <- setwd(tempdir) on.exit(setwd(old), add = TRUE) - + tar(destfile, compression = "gzip") }) invisible() } - + # Extract the commit hash from a git archive. Git archives include the SHA1 # hash as the comment field of the tarball pax extended header # (see https://www.kernel.org/pub/software/scm/git/docs/git-archive.html) # For GitHub archives this should be the first header after the default one # (512 byte) header. renv_bootstrap_git_extract_sha1_tar <- function(bundle) { - + # open the bundle for reading # We use gzcon for everything because (from ?gzcon) # > Reading from a connection which does not supply a 'gzip' magic # > header is equivalent to reading from the original connection conn <- gzcon(file(bundle, open = "rb", raw = TRUE)) on.exit(close(conn)) - + # The default pax header is 512 bytes long and the first pax extended header # with the comment should be 51 bytes long # `52 comment=` (11 chars) + 40 byte SHA1 hash len <- 0x200 + 0x33 res <- rawToChar(readBin(conn, "raw", n = len)[0x201:len]) - + if (grepl("^52 comment=", res)) { sub("52 comment=", "", res) } else { NULL } } - + renv_bootstrap_install <- function(version, tarball, library) { - + # attempt to install it into project library dir.create(library, showWarnings = FALSE, recursive = TRUE) output <- renv_bootstrap_install_impl(library, tarball) - + # check for successful install status <- attr(output, "status") if (is.null(status) || 
identical(status, 0L)) return(status) - + # an error occurred; report it header <- "installation of renv failed" lines <- paste(rep.int("=", nchar(header)), collapse = "") text <- paste(c(header, lines, output), collapse = "\n") stop(text) - + } - + renv_bootstrap_install_impl <- function(library, tarball) { - + # invoke using system2 so we can capture and report output bin <- R.home("bin") exe <- if (Sys.info()[["sysname"]] == "Windows") "R.exe" else "R" R <- file.path(bin, exe) - + args <- c( "--vanilla", "CMD", "INSTALL", "--no-multiarch", "-l", shQuote(path.expand(library)), shQuote(path.expand(tarball)) ) - + system2(R, args, stdout = TRUE, stderr = TRUE) - + } - + renv_bootstrap_platform_prefix <- function() { - + # construct version prefix version <- paste(R.version$major, R.version$minor, sep = ".") prefix <- paste("R", numeric_version(version)[1, 1:2], sep = "-") - + # include SVN revision for development versions of R # (to avoid sharing platform-specific artefacts with released versions of R) devel <- identical(R.version[["status"]], "Under development (unstable)") || identical(R.version[["nickname"]], "Unsuffered Consequences") - + if (devel) prefix <- paste(prefix, R.version[["svn rev"]], sep = "-r") - + # build list of path components components <- c(prefix, R.version$platform) - + # include prefix if provided by user prefix <- renv_bootstrap_platform_prefix_impl() if (!is.na(prefix) && nzchar(prefix)) components <- c(prefix, components) - + # build prefix paste(components, collapse = "/") - + } - + renv_bootstrap_platform_prefix_impl <- function() { - + # if an explicit prefix has been supplied, use it prefix <- Sys.getenv("RENV_PATHS_PREFIX", unset = NA) if (!is.na(prefix)) return(prefix) - + # if the user has requested an automatic prefix, generate it auto <- Sys.getenv("RENV_PATHS_PREFIX_AUTO", unset = NA) if (is.na(auto) && getRversion() >= "4.4.0") auto <- "TRUE" - + if (auto %in% c("TRUE", "True", "true", "1")) 
return(renv_bootstrap_platform_prefix_auto()) - + # empty string on failure "" - + } - + renv_bootstrap_platform_prefix_auto <- function() { - + prefix <- tryCatch(renv_bootstrap_platform_os(), error = identity) if (inherits(prefix, "error") || prefix %in% "unknown") { - + msg <- paste( "failed to infer current operating system", "please file a bug report at https://github.com/rstudio/renv/issues", sep = "; " ) - + warning(msg) - + } - + prefix - + } - + renv_bootstrap_platform_os <- function() { - + sysinfo <- Sys.info() sysname <- sysinfo[["sysname"]] - + # handle Windows + macOS up front if (sysname == "Windows") return("windows") else if (sysname == "Darwin") return("macos") - + # check for os-release files for (file in c("/etc/os-release", "/usr/lib/os-release")) if (file.exists(file)) return(renv_bootstrap_platform_os_via_os_release(file, sysinfo)) - + # check for redhat-release files if (file.exists("/etc/redhat-release")) return(renv_bootstrap_platform_os_via_redhat_release()) - + "unknown" - + } - + renv_bootstrap_platform_os_via_os_release <- function(file, sysinfo) { - + # read /etc/os-release release <- utils::read.table( file = file, @@ -713,13 +713,13 @@ local({ comment.char = "#", stringsAsFactors = FALSE ) - + vars <- as.list(release$Value) names(vars) <- release$Key - + # get os name os <- tolower(sysinfo[["sysname"]]) - + # read id id <- "unknown" for (field in c("ID", "ID_LIKE")) { @@ -728,7 +728,7 @@ local({ break } } - + # read version version <- "unknown" for (field in c("UBUNTU_CODENAME", "VERSION_CODENAME", "VERSION_ID", "BUILD_ID")) { @@ -737,17 +737,17 @@ local({ break } } - + # join together paste(c(os, id, version), collapse = "-") - + } - + renv_bootstrap_platform_os_via_redhat_release <- function() { - + # read /etc/redhat-release contents <- readLines("/etc/redhat-release", warn = FALSE) - + # infer id id <- if (grepl("centos", contents, ignore.case = TRUE)) "centos" @@ -755,73 +755,73 @@ local({ "redhat" else "unknown" - + # try to 
find a version component (very hacky) version <- "unknown" - + parts <- strsplit(contents, "[[:space:]]")[[1L]] for (part in parts) { - + nv <- tryCatch(numeric_version(part), error = identity) if (inherits(nv, "error")) next - + version <- nv[1, 1] break - + } - + paste(c("linux", id, version), collapse = "-") - + } - + renv_bootstrap_library_root_name <- function(project) { - + # use project name as-is if requested asis <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT_ASIS", unset = "FALSE") if (asis) return(basename(project)) - + # otherwise, disambiguate based on project's path id <- substring(renv_bootstrap_hash_text(project), 1L, 8L) paste(basename(project), id, sep = "-") - + } - + renv_bootstrap_library_root <- function(project) { - + prefix <- renv_bootstrap_profile_prefix() - + path <- Sys.getenv("RENV_PATHS_LIBRARY", unset = NA) if (!is.na(path)) return(paste(c(path, prefix), collapse = "/")) - + path <- renv_bootstrap_library_root_impl(project) if (!is.null(path)) { name <- renv_bootstrap_library_root_name(project) return(paste(c(path, prefix, name), collapse = "/")) } - + renv_bootstrap_paths_renv("library", project = project) - + } - + renv_bootstrap_library_root_impl <- function(project) { - + root <- Sys.getenv("RENV_PATHS_LIBRARY_ROOT", unset = NA) if (!is.na(root)) return(root) - + type <- renv_bootstrap_project_type(project) if (identical(type, "package")) { userdir <- renv_bootstrap_user_dir() return(file.path(userdir, "library")) } - + } - + renv_bootstrap_validate_version <- function(version, description = NULL) { - + # resolve description file # # avoid passing lib.loc to `packageDescription()` below, since R will @@ -829,17 +829,17 @@ local({ # this function should only be called after 'renv' is loaded # https://github.com/rstudio/renv/issues/1625 description <- description %||% packageDescription("renv") - + # check whether requested version 'version' matches loaded version of renv sha <- attr(version, "sha", exact = TRUE) valid <- if (!is.null(sha)) 
renv_bootstrap_validate_version_dev(sha, description) else renv_bootstrap_validate_version_release(version, description) - + if (valid) return(TRUE) - + # the loaded version of renv doesn't match the requested version; # give the user instructions on how to proceed dev <- identical(description[["RemoteType"]], "github") @@ -847,103 +847,103 @@ local({ paste("rstudio/renv", description[["RemoteSha"]], sep = "@") else paste("renv", description[["Version"]], sep = "@") - + # display both loaded version + sha if available friendly <- renv_bootstrap_version_friendly( version = description[["Version"]], sha = if (dev) description[["RemoteSha"]] ) - + fmt <- heredoc(" renv %1$s was loaded from project library, but this project is configured to use renv %2$s. - Use `renv::record(\"%3$s\")` to record renv %1$s in the lockfile. - Use `renv::restore(packages = \"renv\")` to install renv %2$s into the project library. ") catf(fmt, friendly, renv_bootstrap_version_friendly(version), remote) - + FALSE - + } - + renv_bootstrap_validate_version_dev <- function(version, description) { expected <- description[["RemoteSha"]] is.character(expected) && startswith(expected, version) } - + renv_bootstrap_validate_version_release <- function(version, description) { expected <- description[["Version"]] is.character(expected) && identical(expected, version) } - + renv_bootstrap_hash_text <- function(text) { - + hashfile <- tempfile("renv-hash-") on.exit(unlink(hashfile), add = TRUE) - + writeLines(text, con = hashfile) tools::md5sum(hashfile) - + } - + renv_bootstrap_load <- function(project, libpath, version) { - + # try to load renv from the project library if (!requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) return(FALSE) - + # warn if the version of renv loaded does not match renv_bootstrap_validate_version(version) - + # execute renv load hooks, if any hooks <- getHook("renv::autoload") for (hook in hooks) if (is.function(hook)) tryCatch(hook(), error = warnify) - + # load 
the project renv::load(project) - + TRUE - + } - + renv_bootstrap_profile_load <- function(project) { - + # if RENV_PROFILE is already set, just use that profile <- Sys.getenv("RENV_PROFILE", unset = NA) if (!is.na(profile) && nzchar(profile)) return(profile) - + # check for a profile file (nothing to do if it doesn't exist) path <- renv_bootstrap_paths_renv("profile", profile = FALSE, project = project) if (!file.exists(path)) return(NULL) - + # read the profile, and set it if it exists contents <- readLines(path, warn = FALSE) if (length(contents) == 0L) return(NULL) - + # set RENV_PROFILE profile <- contents[[1L]] if (!profile %in% c("", "default")) Sys.setenv(RENV_PROFILE = profile) - + profile - + } - + renv_bootstrap_profile_prefix <- function() { profile <- renv_bootstrap_profile_get() if (!is.null(profile)) return(file.path("profiles", profile, "renv")) } - + renv_bootstrap_profile_get <- function() { profile <- Sys.getenv("RENV_PROFILE", unset = "") renv_bootstrap_profile_normalize(profile) } - + renv_bootstrap_profile_set <- function(profile) { profile <- renv_bootstrap_profile_normalize(profile) if (is.null(profile)) @@ -951,25 +951,25 @@ local({ else Sys.setenv(RENV_PROFILE = profile) } - + renv_bootstrap_profile_normalize <- function(profile) { - + if (is.null(profile) || profile %in% c("", "default")) return(NULL) - + profile - + } - + renv_bootstrap_path_absolute <- function(path) { - + substr(path, 1L, 1L) %in% c("~", "/", "\\") || ( substr(path, 1L, 1L) %in% c(letters, LETTERS) && - substr(path, 2L, 3L) %in% c(":/", ":\\") + substr(path, 2L, 3L) %in% c(":/", ":\\") ) - + } - + renv_bootstrap_paths_renv <- function(..., profile = TRUE, project = NULL) { renv <- Sys.getenv("RENV_PATHS_RENV", unset = "renv") root <- if (renv_bootstrap_path_absolute(renv)) NULL else project @@ -977,50 +977,50 @@ local({ components <- c(root, renv, prefix, ...) 
paste(components, collapse = "/") } - + renv_bootstrap_project_type <- function(path) { - + descpath <- file.path(path, "DESCRIPTION") if (!file.exists(descpath)) return("unknown") - + desc <- tryCatch( read.dcf(descpath, all = TRUE), error = identity ) - + if (inherits(desc, "error")) return("unknown") - + type <- desc$Type if (!is.null(type)) return(tolower(type)) - + package <- desc$Package if (!is.null(package)) return("package") - + "unknown" - + } - + renv_bootstrap_user_dir <- function() { dir <- renv_bootstrap_user_dir_impl() path.expand(chartr("\\", "/", dir)) } - + renv_bootstrap_user_dir_impl <- function() { - + # use local override if set override <- getOption("renv.userdir.override") if (!is.null(override)) return(override) - + # use R_user_dir if available tools <- asNamespace("tools") if (is.function(tools$R_user_dir)) return(tools$R_user_dir("renv", "cache")) - + # try using our own backfill for older versions of R envvars <- c("R_USER_CACHE_DIR", "XDG_CACHE_HOME") for (envvar in envvars) { @@ -1028,7 +1028,7 @@ local({ if (!is.na(root)) return(file.path(root, "R/renv")) } - + # use platform-specific default fallbacks if (Sys.info()[["sysname"]] == "Windows") file.path(Sys.getenv("LOCALAPPDATA"), "R/cache/R/renv") @@ -1036,109 +1036,109 @@ local({ "~/Library/Caches/org.R-project.R/R/renv" else "~/.cache/R/renv" - + } - + renv_bootstrap_version_friendly <- function(version, shafmt = NULL, sha = NULL) { sha <- sha %||% attr(version, "sha", exact = TRUE) parts <- c(version, sprintf(shafmt %||% " [sha: %s]", substring(sha, 1L, 7L))) paste(parts, collapse = "") } - + renv_bootstrap_exec <- function(project, libpath, version) { if (!renv_bootstrap_load(project, libpath, version)) renv_bootstrap_run(version, libpath) } - + renv_bootstrap_run <- function(version, libpath) { - + # perform bootstrap bootstrap(version, libpath) - + # exit early if we're just testing bootstrap if (!is.na(Sys.getenv("RENV_BOOTSTRAP_INSTALL_ONLY", unset = NA))) return(TRUE) - + # 
try again to load if (requireNamespace("renv", lib.loc = libpath, quietly = TRUE)) { return(renv::load(project = getwd())) } - + # failed to download or load renv; warn the user msg <- c( "Failed to find an renv installation: the project will not be loaded.", "Use `renv::activate()` to re-initialize the project." ) - + warning(paste(msg, collapse = "\n"), call. = FALSE) - + } - + renv_json_read <- function(file = NULL, text = NULL) { - + jlerr <- NULL - + # if jsonlite is loaded, use that instead if ("jsonlite" %in% loadedNamespaces()) { - + json <- tryCatch(renv_json_read_jsonlite(file, text), error = identity) if (!inherits(json, "error")) return(json) - + jlerr <- json - + } - + # otherwise, fall back to the default JSON reader json <- tryCatch(renv_json_read_default(file, text), error = identity) if (!inherits(json, "error")) return(json) - + # report an error if (!is.null(jlerr)) stop(jlerr) else stop(json) - + } - + renv_json_read_jsonlite <- function(file = NULL, text = NULL) { text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") jsonlite::fromJSON(txt = text, simplifyVector = FALSE) } - + renv_json_read_default <- function(file = NULL, text = NULL) { - + # find strings in the JSON text <- paste(text %||% readLines(file, warn = FALSE), collapse = "\n") pattern <- '["](?:(?:\\\\.)|(?:[^"\\\\]))*?["]' locs <- gregexpr(pattern, text, perl = TRUE)[[1]] - + # if any are found, replace them with placeholders replaced <- text strings <- character() replacements <- character() - + if (!identical(c(locs), -1L)) { - + # get the string values starts <- locs ends <- locs + attr(locs, "match.length") - 1L strings <- substring(text, starts, ends) - + # only keep those requiring escaping strings <- grep("[[\\]{}:]", strings, perl = TRUE, value = TRUE) - + # compute replacements replacements <- sprintf('"\032%i\032"', seq_along(strings)) - + # replace the strings mapply(function(string, replacement) { replaced <<- sub(string, replacement, replaced, fixed 
= TRUE) }, strings, replacements) - + } - + # transform the JSON into something the R parser understands transformed <- replaced transformed <- gsub("{}", "`names<-`(list(), character())", transformed, fixed = TRUE) @@ -1146,38 +1146,38 @@ local({ transformed <- gsub("[]}]", ")", transformed, perl = TRUE) transformed <- gsub(":", "=", transformed, fixed = TRUE) text <- paste(transformed, collapse = "\n") - + # parse it json <- parse(text = text, keep.source = FALSE, srcfile = NULL)[[1L]] - + # construct map between source strings, replaced strings map <- as.character(parse(text = strings)) names(map) <- as.character(parse(text = replacements)) - + # convert to list map <- as.list(map) - + # remap strings in object remapped <- renv_json_read_remap(json, map) - + # evaluate eval(remapped, envir = baseenv()) - + } - + renv_json_read_remap <- function(json, map) { - + # fix names if (!is.null(names(json))) { lhs <- match(names(json), names(map), nomatch = 0L) rhs <- match(names(map), names(json), nomatch = 0L) names(json)[rhs] <- map[lhs] } - + # fix values if (is.character(json)) return(map[[json]] %||% json) - + # handle true, false, null if (is.name(json)) { text <- as.character(json) @@ -1188,16 +1188,16 @@ local({ else if (text == "null") return(NULL) } - + # recurse if (is.recursive(json)) { for (i in seq_along(json)) { json[i] <- list(renv_json_read_remap(json[[i]], map)) } } - + json - + } # load the renv profile, if any diff --git a/renv/profiles/lesson-requirements/renv.lock b/renv/profiles/lesson-requirements/renv.lock index 20f3b10..46a80df 100644 --- a/renv/profiles/lesson-requirements/renv.lock +++ b/renv/profiles/lesson-requirements/renv.lock @@ -40,6 +40,26 @@ "Version": "3.19" }, "Packages": { + "AnnotationDbi": { + "Package": "AnnotationDbi", + "Version": "1.66.0", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "Biobase", + "BiocGenerics", + "DBI", + "IRanges", + "KEGGREST", + "R", + "RSQLite", + "S4Vectors", + 
"methods", + "stats", + "stats4" + ], + "Hash": "b7df9c597fb5533fc8248d73b8c703ac" + }, "BH": { "Package": "BH", "Version": "1.84.0-0", @@ -107,6 +127,26 @@ ], "Hash": "9bc4cabd3bfda461409172213d932813" }, + "BiocFileCache": { + "Package": "BiocFileCache", + "Version": "2.12.0", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "DBI", + "R", + "RSQLite", + "curl", + "dbplyr", + "dplyr", + "filelock", + "httr", + "methods", + "stats", + "utils" + ], + "Hash": "9c3414bcfae204d56080dd0f0a220136" + }, "BiocGenerics": { "Package": "BiocGenerics", "Version": "0.49.1", @@ -211,6 +251,17 @@ ], "Hash": "da1575dfeace212da5adae444704d212" }, + "DBI": { + "Package": "DBI", + "Version": "1.2.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "methods" + ], + "Hash": "065ae649b05f1ff66bb0c793107508f5" + }, "DelayedArray": { "Package": "DelayedArray", "Version": "0.30.1", @@ -315,6 +366,20 @@ ], "Hash": "4adff00e89fd9b182216f800f61a8943" }, + "KEGGREST": { + "Package": "KEGGREST", + "Version": "1.44.1", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "Biostrings", + "R", + "httr", + "methods", + "png" + ], + "Hash": "017f19c09477c0473073518db9076ac1" + }, "Matrix": { "Package": "Matrix", "Version": "1.7-0", @@ -365,6 +430,25 @@ ], "Hash": "ddbdf53d15b47be4407ede6914f56fbb" }, + "RSQLite": { + "Package": "RSQLite", + "Version": "2.3.7", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "DBI", + "R", + "bit64", + "blob", + "cpp11", + "memoise", + "methods", + "pkgconfig", + "plogr", + "rlang" + ], + "Hash": "46b45a4dd7bb0e0f4e3fc22245817240" + }, "Rcpp": { "Package": "Rcpp", "Version": "1.0.13", @@ -592,6 +676,49 @@ ], "Hash": "a704d52e87822191b42c715c568f96dd" }, + "biomaRt": { + "Package": "biomaRt", + "Version": "2.60.1", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "AnnotationDbi", + "BiocFileCache", + 
"digest", + "httr2", + "methods", + "progress", + "rappdirs", + "stringr", + "utils", + "xml2" + ], + "Hash": "e53d495b9e6ecd5394acad1d53c3fa22" + }, + "bit": { + "Package": "bit", + "Version": "4.0.5", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "d242abec29412ce988848d0294b208fd" + }, + "bit64": { + "Package": "bit64", + "Version": "4.0.5", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "bit", + "methods", + "stats", + "utils" + ], + "Hash": "9fe98599ca456d6552421db0d6772d8f" + }, "bitops": { "Package": "bitops", "Version": "1.0-8", @@ -599,6 +726,18 @@ "Repository": "CRAN", "Hash": "da69e6b6f8feebec0827205aad3fdbd8" }, + "blob": { + "Package": "blob", + "Version": "1.2.4", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "methods", + "rlang", + "vctrs" + ], + "Hash": "40415719b5a479b87949f3aa0aee737c" + }, "bookdown": { "Package": "bookdown", "Version": "0.40", @@ -702,6 +841,34 @@ ], "Hash": "411ca2c03b1ce5f548345d2fc2685f7a" }, + "dbplyr": { + "Package": "dbplyr", + "Version": "2.5.0", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "DBI", + "R", + "R6", + "blob", + "cli", + "dplyr", + "glue", + "lifecycle", + "magrittr", + "methods", + "pillar", + "purrr", + "rlang", + "tibble", + "tidyr", + "tidyselect", + "utils", + "vctrs", + "withr" + ], + "Hash": "39b2e002522bfd258039ee4e889e0fd1" + }, "digest": { "Package": "digest", "Version": "0.6.36", @@ -713,6 +880,29 @@ ], "Hash": "fd6824ad91ede64151e93af67df6376b" }, + "dplyr": { + "Package": "dplyr", + "Version": "1.1.4", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "R6", + "cli", + "generics", + "glue", + "lifecycle", + "magrittr", + "methods", + "pillar", + "rlang", + "tibble", + "tidyselect", + "utils", + "vctrs" + ], + "Hash": "fedd9d00c2944ff00a0e2696ccf048ec" + }, "evaluate": { "Package": "evaluate", "Version": "0.24.0", @@ -724,6 +914,18 @@ ], 
"Hash": "a1066cbc05caee9a4bf6d90f194ff4da" }, + "fansi": { + "Package": "fansi", + "Version": "1.0.6", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "grDevices", + "utils" + ], + "Hash": "962174cf2aeb5b9eea581522286a911f" + }, "fastmap": { "Package": "fastmap", "Version": "1.2.0", @@ -731,6 +933,16 @@ "Repository": "RSPM", "Hash": "aa5e1cd11c2d15497494c5292d7ffcc8" }, + "filelock": { + "Package": "filelock", + "Version": "1.0.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "192053c276525c8495ccfd523aa8f2d1" + }, "fontawesome": { "Package": "fontawesome", "Version": "0.5.2", @@ -820,6 +1032,20 @@ ], "Hash": "d65ba49117ca223614f71b60d85b8ab7" }, + "hms": { + "Package": "hms", + "Version": "1.1.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "lifecycle", + "methods", + "pkgconfig", + "rlang", + "vctrs" + ], + "Hash": "b59377caa7ed00fa41808342002138f9" + }, "htmltools": { "Package": "htmltools", "Version": "0.5.8.1", @@ -851,6 +1077,27 @@ ], "Hash": "ac107251d9d9fd72f0ca8049988f1d7f" }, + "httr2": { + "Package": "httr2", + "Version": "1.0.2", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "R6", + "cli", + "curl", + "glue", + "lifecycle", + "magrittr", + "openssl", + "rappdirs", + "rlang", + "vctrs", + "withr" + ], + "Hash": "320c8fe23fcb25a6690ef7bdb6a3a705" + }, "jquerylib": { "Package": "jquerylib", "Version": "0.1.4", @@ -990,6 +1237,40 @@ ], "Hash": "2bcca3848e4734eb3b16103bc9aa4b8e" }, + "pillar": { + "Package": "pillar", + "Version": "1.9.0", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "cli", + "fansi", + "glue", + "lifecycle", + "rlang", + "utf8", + "utils", + "vctrs" + ], + "Hash": "15da5a8412f317beeee6175fbc76f4bb" + }, + "pkgconfig": { + "Package": "pkgconfig", + "Version": "2.0.3", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "utils" + ], + "Hash": 
"01f28d4278f15c76cddbea05899c5d6f" + }, + "plogr": { + "Package": "plogr", + "Version": "0.2.0", + "Source": "Repository", + "Repository": "CRAN", + "Hash": "09eb987710984fc2905c7129c7d85e65" + }, "plyr": { "Package": "plyr", "Version": "1.8.9", @@ -1001,6 +1282,55 @@ ], "Hash": "6b8177fd19982f0020743fadbfdbd933" }, + "png": { + "Package": "png", + "Version": "0.1-8", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "bd54ba8a0a5faded999a7aab6e46b374" + }, + "prettyunits": { + "Package": "prettyunits", + "Version": "1.2.0", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R" + ], + "Hash": "6b01fc98b1e86c4f705ce9dcfd2f57c7" + }, + "progress": { + "Package": "progress", + "Version": "1.2.3", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "R6", + "crayon", + "hms", + "prettyunits" + ], + "Hash": "f4625e061cb2865f111b47ff163a5ca6" + }, + "purrr": { + "Package": "purrr", + "Version": "1.0.2", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "cli", + "lifecycle", + "magrittr", + "rlang", + "vctrs" + ], + "Hash": "1cba04a4e9414bdefc9dcaa99649a8dc" + }, "rappdirs": { "Package": "rappdirs", "Version": "0.3.3", @@ -1170,6 +1500,64 @@ "Repository": "CRAN", "Hash": "3a1be13d68d47a8cd0bfd74739ca1555" }, + "tibble": { + "Package": "tibble", + "Version": "3.2.1", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "fansi", + "lifecycle", + "magrittr", + "methods", + "pillar", + "pkgconfig", + "rlang", + "utils", + "vctrs" + ], + "Hash": "a84e2cc86d07289b3b6f5069df7a004c" + }, + "tidyr": { + "Package": "tidyr", + "Version": "1.3.1", + "Source": "Repository", + "Repository": "CRAN", + "Requirements": [ + "R", + "cli", + "cpp11", + "dplyr", + "glue", + "lifecycle", + "magrittr", + "purrr", + "rlang", + "stringr", + "tibble", + "tidyselect", + "utils", + "vctrs" + ], + "Hash": "915fb7ce036c22a6a33b5a8adb712eb1" + }, + "tidyselect": 
{ + "Package": "tidyselect", + "Version": "1.2.1", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "cli", + "glue", + "lifecycle", + "rlang", + "vctrs", + "withr" + ], + "Hash": "829f27b9c4919c16b593794a6344d6c0" + }, "timechange": { "Package": "timechange", "Version": "0.3.0", @@ -1191,6 +1579,16 @@ ], "Hash": "cfbad971a71f0e27cec22e544a08bc3b" }, + "utf8": { + "Package": "utf8", + "Version": "1.2.4", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R" + ], + "Hash": "62b65c52671e6665f803ff02954446e9" + }, "vctrs": { "Package": "vctrs", "Version": "0.6.5", @@ -1205,6 +1603,18 @@ ], "Hash": "c03fa420630029418f7e6da3667aac4a" }, + "withr": { + "Package": "withr", + "Version": "3.0.0", + "Source": "Repository", + "Repository": "RSPM", + "Requirements": [ + "R", + "grDevices", + "graphics" + ], + "Hash": "d31b6c62c10dcf11ec530ca6b0dd5d35" + }, "xfun": { "Package": "xfun", "Version": "0.46", From 12f1dbfcb2f17a8e6978a9bb617dadaa65535ae8 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 15:02:44 +0100 Subject: [PATCH 10/19] try rendering site from scratch --- episodes/01-helpers.R | 7 +++ episodes/01-setup.Rmd | 3 +- episodes/08-annotations.Rmd | 4 -- renv/profiles/lesson-requirements/renv.lock | 54 +++++++++++++-------- site/README.md | 2 +- 5 files changed, 44 insertions(+), 26 deletions(-) create mode 100644 episodes/01-helpers.R diff --git a/episodes/01-helpers.R b/episodes/01-helpers.R new file mode 100644 index 0000000..ec06299 --- /dev/null +++ b/episodes/01-helpers.R @@ -0,0 +1,7 @@ +r_version_string <- function() { + paste0(R.version$major, ".", R.version$minor) +} + +r_version_string.patch_x <- function() { + gsub(".$", "x", r_version_string()) +} diff --git a/episodes/01-setup.Rmd b/episodes/01-setup.Rmd index 1b1aebb..70b7632 100644 --- a/episodes/01-setup.Rmd +++ b/episodes/01-setup.Rmd @@ -6,6 +6,7 @@ exercises: XX --- ```{r, include=FALSE} +source("01-helpers.R") ``` 
 ::::::::::::::::::::::::::::::::::::::: objectives
@@ -27,7 +28,7 @@ exercises: XX

 This lesson was developed and tested with `r R.version.string`.

-Take a moment to launch RStudio and verify that you are using R version `4.1.x`, with `x` being any patch version, e.g. `4.1.2`.
+Take a moment to launch RStudio and verify that you are using R version `r r_version_string.patch_x()`, with `x` being any patch version, e.g. `r R.version.string`.

 ```{r}
 R.version.string
diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd
index 7ba1842..70de1b0 100644
--- a/episodes/08-annotations.Rmd
+++ b/episodes/08-annotations.Rmd
@@ -113,9 +113,5 @@ Let us first load the package:
 library(biomaRt)
 ```

-
-
-
-
 [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html
 [biomart-ensembl]: https://www.ensembl.org/biomart/martview
diff --git a/renv/profiles/lesson-requirements/renv.lock b/renv/profiles/lesson-requirements/renv.lock
index 46a80df..401f455 100644
--- a/renv/profiles/lesson-requirements/renv.lock
+++ b/renv/profiles/lesson-requirements/renv.lock
@@ -149,7 +149,7 @@
     },
     "BiocGenerics": {
       "Package": "BiocGenerics",
-      "Version": "0.49.1",
+      "Version": "0.50.0",
       "Source": "Bioconductor",
       "Repository": "Bioconductor 3.19",
       "Requirements": [
@@ -159,7 +159,7 @@
         "stats",
         "utils"
       ],
-      "Hash": "bb0e8378090c72c1fe8721fc34a4f7cb"
+      "Hash": "ef32d07aafdd12f24c5827374ae3590d"
     },
     "BiocIO": {
       "Package": "BiocIO",
@@ -232,7 +232,7 @@
     },
     "Biostrings": {
       "Package": "Biostrings",
-      "Version": "2.71.5",
+      "Version": "2.72.1",
       "Source": "Bioconductor",
       "Repository": "Bioconductor 3.19",
       "Requirements": [
@@ -244,12 +244,11 @@
         "XVector",
         "crayon",
         "grDevices",
-        "graphics",
         "methods",
         "stats",
         "utils"
       ],
-      "Hash": "da1575dfeace212da5adae444704d212"
+      "Hash": "886ff0ed958d6f839ed2e0d01f6853b3"
     },
     "DBI": {
       "Package": "DBI",
@@ -284,7 +283,7 @@
     },
     "GenomeInfoDb": {
       "Package": "GenomeInfoDb",
-      "Version": "1.39.10",
+      "Version": "1.40.1",
       "Source": "Bioconductor",
"Repository": "Bioconductor 3.19", "Requirements": [ @@ -293,12 +292,13 @@ "IRanges", "R", "S4Vectors", + "UCSC.utils", "methods", "stats", "stats4", "utils" ], - "Hash": "86cc7f0a5b83be019673ef3a508dda2c" + "Hash": "171e9becd9bb948b9e64eb3759208c94" }, "GenomeInfoDbData": { "Package": "GenomeInfoDbData", @@ -333,7 +333,7 @@ }, "GenomicRanges": { "Package": "GenomicRanges", - "Version": "1.55.4", + "Version": "1.56.1", "Source": "Bioconductor", "Repository": "Bioconductor 3.19", "Requirements": [ @@ -348,11 +348,11 @@ "stats4", "utils" ], - "Hash": "f0957f9dcf1bdf2d301d82e5bea6e7ca" + "Hash": "a3c822ef3c124828e25e7a9611beeb50" }, "IRanges": { "Package": "IRanges", - "Version": "2.37.1", + "Version": "2.38.1", "Source": "Bioconductor", "Repository": "Bioconductor 3.19", "Requirements": [ @@ -364,7 +364,7 @@ "stats4", "utils" ], - "Hash": "4adff00e89fd9b182216f800f61a8943" + "Hash": "066f3c5d6b022ed62c91ce49e4d8f619" }, "KEGGREST": { "Package": "KEGGREST", @@ -535,7 +535,7 @@ }, "S4Vectors": { "Package": "S4Vectors", - "Version": "0.41.5", + "Version": "0.42.1", "Source": "Bioconductor", "Repository": "Bioconductor 3.19", "Requirements": [ @@ -546,7 +546,7 @@ "stats4", "utils" ], - "Hash": "9912dc4d5ed3e8d92d15573b57e9a6c8" + "Hash": "86398fc7c5f6be4ba29fe23ed08c2da6" }, "SparseArray": { "Package": "SparseArray", @@ -593,6 +593,20 @@ ], "Hash": "2f6c8cc972ed6aee07c96e3dff729d15" }, + "UCSC.utils": { + "Package": "UCSC.utils", + "Version": "1.0.0", + "Source": "Bioconductor", + "Repository": "Bioconductor 3.19", + "Requirements": [ + "S4Vectors", + "httr", + "jsonlite", + "methods", + "stats" + ], + "Hash": "83d45b690bffd09d1980c224ef329f5b" + }, "XML": { "Package": "XML", "Version": "3.99-0.17", @@ -607,7 +621,7 @@ }, "XVector": { "Package": "XVector", - "Version": "0.43.1", + "Version": "0.44.0", "Source": "Bioconductor", "Repository": "Bioconductor 3.19", "Requirements": [ @@ -620,7 +634,7 @@ "utils", "zlibbioc" ], - "Hash": "24224fef455e6f52ab17348e95fbea72" + 
"Hash": "4245b9938ac74c0dbddbebbec6036ab4" }, "abind": { "Package": "abind", @@ -821,15 +835,15 @@ }, "crayon": { "Package": "crayon", - "Version": "1.5.2", + "Version": "1.5.3", "Source": "Repository", - "Repository": "RSPM", + "Repository": "CRAN", "Requirements": [ "grDevices", "methods", "utils" ], - "Hash": "e8a1e41acf02548751f45c718d55aa6a" + "Hash": "859d96e65ef198fd43e82b9628d593ef" }, "curl": { "Package": "curl", @@ -1649,10 +1663,10 @@ }, "zlibbioc": { "Package": "zlibbioc", - "Version": "1.49.3", + "Version": "1.50.0", "Source": "Bioconductor", "Repository": "Bioconductor 3.19", - "Hash": "6dd05467a4736905623634dc1f145da6" + "Hash": "3db02e3c460e1c852365df117a2b441b" } } } diff --git a/site/README.md b/site/README.md index 42997e3..0a00291 100644 --- a/site/README.md +++ b/site/README.md @@ -1,2 +1,2 @@ This directory contains rendered lesson materials. Please do not edit files -here. +here. From 19f2e45440d722331ebc2b482c987fdc8b932d53 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 15:17:08 +0100 Subject: [PATCH 11/19] adapt learner setup based on bioc-rnaseq --- learners/setup.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/learners/setup.md b/learners/setup.md index 8f06d41..f1b6cfe 100644 --- a/learners/setup.md +++ b/learners/setup.md @@ -2,9 +2,18 @@ title: Setup --- -- Install R version `4.1.x` (`x` being any patch version, for instance `4.1.2`). -- Install RStudio. - +Ensure that you have the most recent versions of R and RStudio installed on your computer. +For detailed instructions on how to do this, you can refer to the section "If you already have R and RStudio installed" +in the [Introduction to R](https://carpentries-incubator.github.io/bioc-intro/#r-and-rstudio) +episode of the [Introduction to data analysis with R and Bioconductor](https://carpentries-incubator.github.io/bioc-intro) lesson. 
+Additionally, you will also need to install the following packages that will be used throughout the lesson. +```r +install.packages(c("BiocManager", "remotes")) +BiocManager::install(c( + "S4Vectors", "Biostrings", "BSgenome", "BSgenome.Hsapiens.UCSC.hg38.masked", + "GenomicRanges", "rtracklayer", "biomaRt")) +``` +*If you are attending a workshop, please complete all of the above before the workshop. Should you need help, an instructor will be available 30 minutes before the workshop commences to assist.* From fe46b2d5fe3c596db918acf32705544c348545bc Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 15:18:12 +0100 Subject: [PATCH 12/19] fix function call --- episodes/01-setup.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/01-setup.Rmd b/episodes/01-setup.Rmd index 70b7632..d4b5810 100644 --- a/episodes/01-setup.Rmd +++ b/episodes/01-setup.Rmd @@ -28,7 +28,7 @@ source("01-helpers.R") This lesson was developed and tested with `r R.version.string`. -Take a moment to launch RStudio and verify that you are using R version `r r_version_string.patch_x()`, with `x` being any patch version, e.g. `r R.version.string`. +Take a moment to launch RStudio and verify that you are using R version `r r_version_string.patch_x()`, with `x` being any patch version, e.g. `r r_version_string()`. 
```{r} R.version.string From eb7f3acab504e07d0d7e3a8e89573f3e714d7914 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 15:24:25 +0100 Subject: [PATCH 13/19] fit to narrow layout --- learners/setup.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/learners/setup.md b/learners/setup.md index f1b6cfe..49b9192 100644 --- a/learners/setup.md +++ b/learners/setup.md @@ -12,7 +12,8 @@ Additionally, you will also need to install the following packages that will be ```r install.packages(c("BiocManager", "remotes")) BiocManager::install(c( - "S4Vectors", "Biostrings", "BSgenome", "BSgenome.Hsapiens.UCSC.hg38.masked", + "S4Vectors", "Biostrings", "BSgenome", + "BSgenome.Hsapiens.UCSC.hg38.masked", "GenomicRanges", "rtracklayer", "biomaRt")) ``` From 426fdeed81075e5e063b58bdca124e465bd4c320 Mon Sep 17 00:00:00 2001 From: Kevin Rue-Albrecht Date: Tue, 30 Jul 2024 15:51:18 +0100 Subject: [PATCH 14/19] expand the episode --- episodes/08-annotations.Rmd | 61 +++++++++++++++++++++++++++++++++++-- 1 file changed, 58 insertions(+), 3 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 70de1b0..faeb22f 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -73,17 +73,18 @@ In this episode, we will demonstrate the two approaches: * Querying annotations from the Ensembl Biomart API using the `r BiocStyle::Biocpkg("biomaRt")` package. * Querying annotations from the `r BiocStyle::Biocpkg("org.Hs.eg.db")` annotation package. -## Querying annotations from online resources +## Querying annotations from Ensembl BioMart ### Pros and cons Pros: * Automatically access the latest information +* Minimal storage footprint on the user's computer Cons: -* Requires a live and stable internet connection. +* Requires a live and stable internet connection throughout the analysis. * Reproducibility may not be possible if the resource is updated without access to archives. 
* Data may be organised differently in each resource. @@ -109,9 +110,63 @@ directly from an R session where those annotations can be used. Let us first load the package: -```{r} +```{r, message=FALSE, warning=FALSE} library(biomaRt) ``` +### Available marts + +Ensembl BioMart organises its diverse biological information into four databases +also known as *marts* or *biomarts*. +Each mart focuses on a different type of data. + +Users must select the mart corresponds to the type of data they are interested +in before they can query any information from it. + +The function `listMarts()` can be used to display the names of those marts. +This is convenient as users do not need to memorise the name of the marts, +and the function will also return an updated list of names if any mart is +renamed, added, or removed. + +```{r} +listMarts() +``` + +In this demonstration, we will use the biomart called `ENSEMBL_MART_ENSEMBL`, +which contains the Ensembl gene set. + +Notably, the `version` column also indicates the version of the biomart. +The Ensembl BioMart is updated regularly (multiple times per year). +By default, `r BiocStyle::Biocpkg("biomaRt")` functions access the latest +version of each biomart. +This is not ideal for reproducibility. + +Thankfully, Ensembl BioMart archives past versions of its marts in a way that +is accessible both programmatically, and on its website. + +The function `listEnsemblArchives()` can be used to display all the versions of +Ensembl Biomart accessible. + +```{r} +listEnsemblArchives() +``` + +In the output above, the key piece of information is the `url` column, which +provides the URL that `r BiocStyle::Biocpkg("biomaRt")` functions will need to +access data from the corresponding snapshot of the Ensembl BioMart. + +### Connecting to a biomart + +The two pieces of information collected above -- the name of a biomart +and the URL of a snapshot -- is all that is needed to connect to a BioMart +database reproducibly. 
+ +The function `useMart()` can then be used to create a connection. +The connection is traditionally stored in an object called `mart`. + +```{r} +mart <- useMart(biomart = , host = "https://may2024.archive.ensembl.org") +``` + [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html [biomart-ensembl]: https://www.ensembl.org/biomart/martview From a304d996b24715a3cca3ae7ce20b70d220f22321 Mon Sep 17 00:00:00 2001 From: Kevin Rue Date: Wed, 31 Jul 2024 08:18:48 +0100 Subject: [PATCH 15/19] woops --- episodes/08-annotations.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index faeb22f..edbd22a 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -165,7 +165,7 @@ The function `useMart()` can then be used to create a connection. The connection is traditionally stored in an object called `mart`. ```{r} -mart <- useMart(biomart = , host = "https://may2024.archive.ensembl.org") +mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", host = "https://may2024.archive.ensembl.org") ``` [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html From 1dd1e85c632ea1db12ff7cef2badefbd5279fd0f Mon Sep 17 00:00:00 2001 From: Kevin Rue Date: Wed, 31 Jul 2024 08:50:46 +0100 Subject: [PATCH 16/19] connect to data set --- episodes/08-annotations.Rmd | 53 +++++++++++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 2 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index edbd22a..0b72344 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -8,7 +8,6 @@ exercises: XX --- ```{r, echo=FALSE, purl=FALSE, message=FALSE} -source("download_data.R") ``` ::::::::::::::::::::::::::::::::::::::: objectives @@ -155,6 +154,10 @@ In the output above, the key piece of information is the `url` column, which provides the URL that `r BiocStyle::Biocpkg("biomaRt")` functions will need to access data from 
the corresponding snapshot of the Ensembl BioMart. +At the time of writing, the current release is Ensembl 112, so let us use +the corresponding url `https://may2024.archive.ensembl.org` to ensure +reproducible results no matter when this lesson is delivered. + ### Connecting to a biomart The two pieces of information collected above -- the name of a biomart @@ -162,11 +165,57 @@ and the URL of a snapshot -- is all that is needed to connect to a BioMart database reproducibly. The function `useMart()` can then be used to create a connection. -The connection is traditionally stored in an object called `mart`. +The connection is traditionally stored in an object called `mart`, +to be reused in subsequent steps for querying information from the online mart. ```{r} mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", host = "https://may2024.archive.ensembl.org") ``` +### Listing available data sets + +Each biomart contains a number of data sets. + +The function `listDatasets()` can be used to display the information about those +data sets. +This is convenient as users do not need to memorise the name of the data sets, +and the information returned by the function includes a short description of +each data set, as well as its version. + +```{r} +listDatasets(mart) +``` + +In the output above, the key piece of information is the `dataset` column, which +provides the identifier that `r BiocStyle::Biocpkg("biomaRt")` functions will +need to access data from the corresponding biomart table. + +In this demonstration, we will use the Ensembl gene set for Homo sapiens. + +Given the number of data sets available (`r nrow(listDatasets(mart))`), +let us programmatically filter the table of information using pattern matching: + +```{r} +subset(listDatasets(mart), grepl("sapiens", dataset)) +``` + +We identify the desired data set identifier as `hsapiens_gene_ensembl`. 
+ +### Connecting to a data set + +Having chosen the data set that we want to use, we need to call the function +`useMart()` again, this time specifying the selected data set. + +Typically, one would copy paste the previous call to `useMart()` and edit as +needed. +It is also common practice to replace the `mart` object with the new connection. + +```{r} +mart <- useMart( + biomart = "ENSEMBL_MART_ENSEMBL", + dataset = "hsapiens_gene_ensembl", + host = "https://may2024.archive.ensembl.org") +``` + [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html [biomart-ensembl]: https://www.ensembl.org/biomart/martview From f59210c33a21c9cea1700ebcbad47256bae37e92 Mon Sep 17 00:00:00 2001 From: Kevin Rue Date: Wed, 31 Jul 2024 09:21:23 +0100 Subject: [PATCH 17/19] list attributes --- episodes/08-annotations.Rmd | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 0b72344..446efde 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -116,7 +116,7 @@ library(biomaRt) ### Available marts Ensembl BioMart organises its diverse biological information into four databases -also known as *marts* or *biomarts*. +also known as 'marts' or 'biomarts'. Each mart focuses on a different type of data. Users must select the mart corresponds to the type of data they are interested @@ -182,8 +182,11 @@ This is convenient as users do not need to memorise the name of the data sets, and the information returned by the function includes a short description of each data set, as well as its version. +In the example below, we restrict the output table to the first few rows, +as the full table comprises `r nrow(listDatasets(mart))` rows. 
+ ```{r} -listDatasets(mart) +head(listDatasets(mart)) ``` In the output above, the key piece of information is the `dataset` column, which @@ -192,14 +195,15 @@ need to access data from the corresponding biomart table. In this demonstration, we will use the Ensembl gene set for Homo sapiens. -Given the number of data sets available (`r nrow(listDatasets(mart))`), +Given the number of data sets available, let us programmatically filter the table of information using pattern matching: ```{r} subset(listDatasets(mart), grepl("sapiens", dataset)) ``` -We identify the desired data set identifier as `hsapiens_gene_ensembl`. +From the output above, we identify the desired data set identifier as +`hsapiens_gene_ensembl`. ### Connecting to a data set @@ -217,5 +221,25 @@ mart <- useMart( host = "https://may2024.archive.ensembl.org") ``` +### Listing information available in a data set + +BioMart tables contain many pieces of information also known as 'attributes'. +So many, in fact, that they have been grouped into categories also known as +'pages'. + +The function `listAttributes()` can be used to display the information about +those attributes. +This is convenient as users do not need to memorise the name of the attributes, +and the information returned by the function includes a short description of +each attribute, as well as its page categorisation. + +In the example below, we restrict the output table to the first few rows, +as the full table comprises `r nrow(listAttributes(mart))` rows. 
+ +```{r} +listAttributes(mart) +``` + + [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html [biomart-ensembl]: https://www.ensembl.org/biomart/martview From 1ecdcb824a51a3470d92070c1b248265fd5b2fe2 Mon Sep 17 00:00:00 2001 From: Kevin Rue Date: Wed, 31 Jul 2024 09:47:56 +0100 Subject: [PATCH 18/19] demo getBM() --- episodes/08-annotations.Rmd | 76 ++++++++++++++++++++++++++++++++++++- 1 file changed, 74 insertions(+), 2 deletions(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 446efde..55fa989 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -196,7 +196,8 @@ need to access data from the corresponding biomart table. In this demonstration, we will use the Ensembl gene set for Homo sapiens. Given the number of data sets available, -let us programmatically filter the table of information using pattern matching: +let us programmatically filter the table of information using pattern matching +rather than searching the table manually: ```{r} subset(listDatasets(mart), grepl("sapiens", dataset)) @@ -237,9 +238,80 @@ In the example below, we restrict the output table to the first few rows, as the full table comprises `r nrow(listAttributes(mart))` rows. ```{r} -listAttributes(mart) +head(listAttributes(mart)) ``` +In the output above, the key piece of information is the `name` column, which +provides the identifier that `r BiocStyle::Biocpkg("biomaRt")` functions will +need to query that information from the corresponding biomart data set. + +The choice of attributes to query now depends on what it is we wish to achieve. 
+For instance, let us imagine that we have a set of gene identifiers, +for which we wish to query: + +* The gene symbol +* The name of the chromosome where the gene is located +* The start and end position of the gene on that chromosome +* The strand on which the gene is encoded + +Users would often manually explore the full table of attributes to identify +the ones they wish to include in their query. +It is also possible to programmatically filter the table of attributes, +based on experience and intuition, to narrow down the search: + +```{r} +subset(listAttributes(mart), grepl("position", name) & grepl("feature", page)) +``` + +### Querying information from a BioMart table + +We now have all the information that we need to perform the actual query: + +* A connection to a BioMart data set +* The list of attributes available in that data set + +The function `getBM()` is the main `r BiocStyle::Biocpkg("biomaRt")` query +function. +Given a set of filters and corresponding values, it retrieves the attributes +requested by the user from the BioMart data set it is connected to. + +In the example below, we manually create a vector of arbitrary gene identifiers +for our query. +In practice, the query will often originate from an earlier analysis +(e.g., differential gene expression). + +The example below also queries attributes that we have not introduced yet. +In the previous section, we described how one may search the table of attributes +returned by `listAttributes()` to identify attributes to include in their query. 
+ +```{r} +query_gene_ids <- c( + "ENSG00000133101", + "ENSG00000145386", + "ENSG00000134057", + "ENSG00000157456", + "ENSG00000147082" +) +getBM( + attributes = c( + "ensembl_gene_id", + "hgnc_symbol", + "chromosome_name", + "start_position", + "end_position", + "strand" + ), + filters = "ensembl_gene_id", + values = query_gene_ids, + mart = mart +) +``` + +Note that we also included the filtering attribute `ensembl_gene_id` to the +attributes retrieved from the data set. +This is key to reliably match the newly retrieved attributes to those used +in the query. [biocviews]: https://www.bioconductor.org/packages/release/BiocViews.html [biomart-ensembl]: https://www.ensembl.org/biomart/martview From 02989598300ff6673d9fc28f63c39f08c9dfc656 Mon Sep 17 00:00:00 2001 From: Kevin Rue Date: Wed, 31 Jul 2024 09:56:35 +0100 Subject: [PATCH 19/19] tweak --- episodes/08-annotations.Rmd | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/episodes/08-annotations.Rmd b/episodes/08-annotations.Rmd index 55fa989..d499ee0 100644 --- a/episodes/08-annotations.Rmd +++ b/episodes/08-annotations.Rmd @@ -193,7 +193,8 @@ In the output above, the key piece of information is the `dataset` column, which provides the identifier that `r BiocStyle::Biocpkg("biomaRt")` functions will need to access data from the corresponding biomart table. -In this demonstration, we will use the Ensembl gene set for Homo sapiens. +In this demonstration, we will use the Ensembl gene set for Homo sapiens, +which is not visible in the output above. Given the number of data sets available, let us programmatically filter the table of information using pattern matching