Allow incomplete renv.lock (#10)
walkowif authored Nov 16, 2023
1 parent e2dce76 commit dadf06a
Showing 8 changed files with 138 additions and 40 deletions.
68 changes: 49 additions & 19 deletions README.md
@@ -4,11 +4,18 @@

`locksmith` is a utility that generates an `renv.lock` file containing all dependencies of a given set of R packages.

Given the input list of git repositories containing the R packages, as well as a list of R package repositories (e.g. in a package manager, CRAN, BioConductor etc.), `locksmith` will try to determine the list of all dependencies and their versions required to make the input list of packages work. It will then save the result in an `renv.lock`-compatible file.
Given the input list of git repositories containing the R packages, as well as a list of R package
repositories (e.g. in a package manager, CRAN, BioConductor etc.), `locksmith` will try to determine
the list of all dependencies and their versions required to make the input list of packages work.
It will then save the result in an `renv.lock`-compatible file.

For additional information about `renv.lock`, please refer to the [`renv` documentation](https://rstudio.github.io/renv/articles/renv.html).

## Installation

Simply download the project for your distribution from the [releases](https://github.com/insightsengineering/locksmith/releases) page. `locksmith` is distributed as a single binary file and has no additional system requirements.
Simply download the project for your distribution from the
[releases](https://github.com/insightsengineering/locksmith/releases) page. `locksmith` is
distributed as a single binary file and has no additional system requirements.

Alternatively, you can install the latest version by running:

@@ -18,7 +25,8 @@ go install github.com/insightsengineering/locksmith@latest

## Usage

`locksmith` is a command line utility, so after installing the binary in your `PATH`, simply run the following command to view its capabilities:
`locksmith` is a command line utility, so after installing the binary in your `PATH`, simply run the
following command to view its capabilities:

```bash
locksmith --help
@@ -31,13 +39,15 @@ locksmith --logLevel debug --exampleParameter 'exampleValue'
```

Real-life example with multiple input packages and repositories.
Please see below for [an example](#configuration-file) of how to set package and repository lists more easily in a configuration file.
Please see below for [an example](#configuration-file) of how to set package and repository lists more
easily in a configuration file.

```bash
locksmith --inputPackageList https://raw.githubusercontent.com/insightsengineering/formatters/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/rtables/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/scda/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/scda.2022/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/nestcolor/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/tern/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/rlistings/main/DESCRIPTION --inputRepositoryList BioC=https://bioconductor.org/packages/release/bioc,CRAN=https://cran.rstudio.com
locksmith --inputPackageList https://raw.githubusercontent.com/insightsengineering/formatters/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/rtables/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/scda/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/scda.2022/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/nestcolor/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/tern/main/DESCRIPTION,https://raw.githubusercontent.com/insightsengineering/rlistings/main/DESCRIPTION,https://gitlab.example.com/projectgroup/projectsubgroup/projectname/-/raw/main/DESCRIPTION --inputRepositoryList BioC=https://bioconductor.org/packages/release/bioc,CRAN=https://cran.rstudio.com
```

In order to download the packages from GitHub or GitLab repositories, please set the environment variables containing the Personal Access Tokens.
In order to download the packages from non-public GitHub or GitLab repositories, please set the environment
variables containing the Personal Access Tokens.

* For GitHub, set the `LOCKSMITH_GITHUBTOKEN` environment variable.
* For GitLab, set the `LOCKSMITH_GITLABTOKEN` environment variable.
@@ -46,12 +56,15 @@ By default `locksmith` will save the resulting output file to `renv.lock`.

## Configuration file

If you'd like to set the above options in a configuration file, by default `locksmith` checks `~/.locksmith`, `~/.locksmith.yaml` and `~/.locksmith.yml` files.
If you'd like to set the above options in a configuration file, by default `locksmith` checks
`~/.locksmith`, `~/.locksmith.yaml` and `~/.locksmith.yml` files.

If any of these files exist, `locksmith` will use options defined there, unless they are overridden by command line flags or environment variables.
If any of these files exist, `locksmith` will use options defined there, unless they are overridden
by command line flags or environment variables.

You can also specify a custom path to the configuration file with the `--config <your-configuration-file>.yml` command line flag.
When using a custom configuration file, any command line flags you specify still take precedence.
You can also specify a custom path to the configuration file with the `--config <your-configuration-file>.yml`
command line flag. When using a custom configuration file, any command line flags you specify
still take precedence.

Example contents of configuration file:

@@ -62,6 +75,7 @@ inputPackages:
- https://raw.githubusercontent.com/insightsengineering/rtables/main/DESCRIPTION
- https://raw.githubusercontent.com/insightsengineering/scda/main/DESCRIPTION
- https://raw.githubusercontent.com/insightsengineering/scda.2022/main/DESCRIPTION
- https://gitlab.example.com/projectgroup/projectsubgroup/projectname/-/raw/main/DESCRIPTION
inputRepositories:
- Bioconductor.BioCsoft=https://bioconductor.org/packages/release/bioc
- CRAN=https://cran.rstudio.com
@@ -70,11 +84,23 @@ inputRepositories:
The example above shows an alternative way of providing input packages, and input repositories,
as opposed to `inputPackageList` and `inputRepositoryList` CLI flags/YAML keys.

Additionally, `inputPackageList`/`inputRepositoryList` CLI flags take precedence over `inputPackages`/`inputRepositories` YAML keys.
Additionally, `inputPackageList`/`inputRepositoryList` CLI flags take precedence over
`inputPackages`/`inputRepositories` YAML keys.

## Environment variables

`locksmith` reads environment variables with the `LOCKSMITH_` prefix and tries to match them with CLI
flags. For example, setting the following variables will override the respective values from the
configuration file: `LOCKSMITH_LOGLEVEL`, `LOCKSMITH_INPUTPACKAGELIST`, `LOCKSMITH_INPUTREPOSITORYLIST` etc.

The order of precedence is:

CLI flag → environment variable → configuration file → default value.
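
The precedence order above can be illustrated with a small Go sketch. This is not how `locksmith` itself resolves options. The function `resolveOption`, its parameters, and the sample values are hypothetical and exist only to show the lookup order.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveOption returns the effective value of an option, following the order
// CLI flag -> environment variable -> configuration file -> default value.
// All names here are hypothetical and used only for illustration.
func resolveOption(
	name, flagValue string, flagSet bool,
	lookupEnv func(string) (string, bool),
	config map[string]string, defaultValue string,
) string {
	// 1. A flag given on the command line always wins.
	if flagSet {
		return flagValue
	}
	// 2. Next, an environment variable named LOCKSMITH_<UPPERCASED NAME>.
	if v, ok := lookupEnv("LOCKSMITH_" + strings.ToUpper(name)); ok {
		return v
	}
	// 3. Then a value from the configuration file.
	if v, ok := config[name]; ok {
		return v
	}
	// 4. Finally, the built-in default.
	return defaultValue
}

func main() {
	config := map[string]string{"logLevel": "info"}
	// No flag given: the environment (if set) or the configuration file decides.
	fmt.Println(resolveOption("logLevel", "", false, os.LookupEnv, config, "warn"))
	// Flag given: it overrides everything else.
	fmt.Println(resolveOption("logLevel", "trace", true, os.LookupEnv, config, "warn"))
}
```

Passing the environment lookup as a function keeps the sketch easy to exercise without touching the real process environment.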

## Binary dependencies

For `locksmith` to generate an `renv.lock` with binary R packages, you must provide URLs to binary repositories in `inputRepositories`/`inputRepositoryList`.
For `locksmith` to generate an `renv.lock` with binary R packages,
you must provide URLs to binary repositories via `inputRepositories`/`inputRepositoryList`.

Examples illustrating the expected format of URLs to repositories with binary packages:

@@ -113,23 +139,27 @@ As a result, the configuration file could look like this:
- Bioc-Windows=https://www.bioconductor.org/packages/release/bioc/bin/windows/contrib/4.3
```

## Environment variables
## Packages not found in the repositories

`locksmith` reads environment variables with the `LOCKSMITH_` prefix and tries to match them with CLI flags.
For example, setting the following variables will override the respective values from the configuration file:
`LOCKSMITH_LOGLEVEL`, `LOCKSMITH_EXAMPLEPARAMETER` etc.
Some of the dependencies required by the input packages might not be found in any of
the input repositories. By default, `locksmith` fails in such a case and lists the missing dependencies.

The order of precedence is:
However, it is possible to override this behavior by using the `--allowIncompleteRenvLock` flag.
Simply list the types of dependencies which should not cause the `renv.lock` generation to fail:

CLI flag → environment variable → configuration file → default value.
```bash
locksmith --allowIncompleteRenvLock 'Imports,Depends,Suggests,LinkingTo'
```
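
As a rough illustration of what the flag value means, the following Go sketch splits a comma-separated list of dependency types and decides whether a missing dependency should abort the run. The helpers `parseAllowedTypes` and `isFatal` are hypothetical names, and the way `locksmith` actually parses the flag is not shown in this README, so treat this as an assumption-laden sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAllowedTypes splits a comma-separated flag value such as
// "Imports,Depends,Suggests,LinkingTo" into a slice, dropping empty items.
// Hypothetical helper, not part of locksmith.
func parseAllowedTypes(flagValue string) []string {
	var types []string
	for _, t := range strings.Split(flagValue, ",") {
		if t = strings.TrimSpace(t); t != "" {
			types = append(types, t)
		}
	}
	return types
}

// isFatal reports whether a missing dependency of the given type should
// cause the renv.lock generation to fail.
func isFatal(dependencyType string, allowed []string) bool {
	for _, a := range allowed {
		if a == dependencyType {
			return false
		}
	}
	return true
}

func main() {
	allowed := parseAllowedTypes("Suggests,LinkingTo")
	fmt.Println(isFatal("Imports", allowed))   // true: Imports is not in the allowed list
	fmt.Println(isFatal("LinkingTo", allowed)) // false: LinkingTo may be missing
}
```

With an empty flag value the allowed list is empty, so every missing dependency is fatal, which matches the default behavior described above.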

## Development

This project is built with the [Go programming language](https://go.dev/).

### Development Environment

It is recommended to use Go 1.21+ for developing this project. This project uses a pre-commit configuration and it is recommended to [install and use pre-commit](https://pre-commit.com/#install) when you are developing this project.
It is recommended to use Go 1.21+ for developing this project. This project uses a pre-commit
configuration and it is recommended to [install and use pre-commit](https://pre-commit.com/#install)
when you are developing this project.

### Common Commands

28 changes: 21 additions & 7 deletions cmd/construct.go
@@ -24,9 +24,10 @@ import (
// which should be included in the output renv.lock file,
// based on the list of package descriptions, and information contained in the PACKAGES files.
func ConstructOutputPackageList(packages []PackageDescription, packagesFiles map[string]PackagesFile,
repositoryList []string) []PackageDescription {
repositoryList []string, allowedMissingDependencyTypes []string) []PackageDescription {
var outputPackageList []PackageDescription
var fatalErrors string
var nonFatalErrors string
// Add all input packages to output list, as the packages should be downloaded from git repositories.
for _, p := range packages {
outputPackageList = append(outputPackageList, PackageDescription{
@@ -44,7 +45,8 @@ func ConstructOutputPackageList(packages []PackageDescription, packagesFiles map
log.Info(p.Package, " → ", d.DependencyName, " (", d.DependencyType, ")")
ResolveDependenciesRecursively(
&outputPackageList, d.DependencyName, d.VersionOperator,
d.VersionValue, repositoryList, packagesFiles, 1, &fatalErrors,
d.VersionValue, d.DependencyType, allowedMissingDependencyTypes,
repositoryList, packagesFiles, 1, &fatalErrors, &nonFatalErrors,
)
}
}
@@ -53,6 +55,9 @@ func ConstructOutputPackageList(packages []PackageDescription, packagesFiles map
if fatalErrors != "" {
log.Fatal(fatalErrors)
}
if nonFatalErrors != "" {
log.Error(nonFatalErrors)
}
return outputPackageList
}

@@ -61,8 +66,9 @@ func ConstructOutputPackageList(packages []PackageDescription, packagesFiles map
// (later used to generate the renv.lock), or if the dependency should be downloaded from a package repository.
// Repeats the process recursively for all dependencies not yet processed.
func ResolveDependenciesRecursively(outputList *[]PackageDescription, name string, versionOperator string,
versionValue string, repositoryList []string, packagesFiles map[string]PackagesFile, recursionLevel int,
fatalErrors *string) {
versionValue string, dependencyType string, allowedMissingDependencyTypes []string,
repositoryList []string, packagesFiles map[string]PackagesFile, recursionLevel int,
fatalErrors *string, nonFatalErrors *string) {
var indentation string
for i := 0; i < recursionLevel; i++ {
indentation += " "
@@ -103,7 +109,8 @@ func ResolveDependenciesRecursively(outputList *[]PackageDescription, name strin
)
ResolveDependenciesRecursively(
outputList, d.DependencyName, d.VersionOperator, d.VersionValue,
repositoryList, packagesFiles, recursionLevel+1, fatalErrors,
d.DependencyType, allowedMissingDependencyTypes, repositoryList,
packagesFiles, recursionLevel+1, fatalErrors, nonFatalErrors,
)
}
}
@@ -115,9 +122,16 @@ func ResolveDependenciesRecursively(outputList *[]PackageDescription, name strin
}
var versionConstraint string
if versionOperator != "" && versionValue != "" {
versionConstraint = " in version " + versionOperator + " " + versionValue
versionConstraint = " (version " + versionOperator + " " + versionValue + ")"
}
message := "Could not find package " + name + versionConstraint + " in any of the repositories.\n"
if stringInSlice(dependencyType, allowedMissingDependencyTypes) {
log.Warn(indentation + message)
*nonFatalErrors += message
} else {
log.Error(indentation + message)
*fatalErrors += message
}
*fatalErrors += "Could not find package " + name + versionConstraint + " in any of the repositories.\n"
}

// CheckIfBasePackage checks whether the package should be treated as a base R package
9 changes: 9 additions & 0 deletions cmd/construct_test.go
@@ -326,6 +326,12 @@ func Test_ConstructOutputPackageList(t *testing.T) {
"",
"",
},
{
"LinkingTo",
"nonExistentPackage",
"",
"",
},
},
"", "", "", "", "", "", "",
},
@@ -382,6 +388,9 @@ func Test_ConstructOutputPackageList(t *testing.T) {
},
},
packagesFiles, repositoryList,
// Let the generation of renv.lock proceed, despite 'nonExistentPackage'
// (dependency type LinkingTo) not being found in any repository.
[]string{"LinkingTo"},
)
assert.Equal(t, outputPackageList,
[]PackageDescription{
29 changes: 22 additions & 7 deletions cmd/parse.go
@@ -46,14 +46,21 @@ func ParsePackagesFiles(repositoryPackageFiles map[string]string) map[string]Pac
// with those fields/properties that are required for further processing.
func ProcessPackagesFile(content string) PackagesFile {
var allPackages PackagesFile
// PACKAGES files in binary Windows repositories use CRLF line endings.
// Therefore, we first change them to LF line endings.
for _, lineGroup := range strings.Split(strings.ReplaceAll(content, "\r\n", "\n"), "\n\n") {
if lineGroup == "" {
continue
}
// Each lineGroup contains information about one package and is separated by an empty line.
firstLine := strings.Split(lineGroup, "\n")[0]
packageName := strings.ReplaceAll(firstLine, "Package: ", "")
cleaned := CleanDescriptionOrPackagesEntry(lineGroup)
cleaned := CleanDescriptionOrPackagesEntry(lineGroup, false)
if cleaned == "" {
// Package entry pointing to a "Path:" subdirectory encountered.
// Such package entries are skipped altogether.
continue
}
packageMap := make(map[string]string)
err := yaml.Unmarshal([]byte(cleaned), &packageMap)
if err != nil {
@@ -75,7 +82,7 @@ func ProcessPackagesFile(content string) PackagesFile {
// ProcessDescription reads a string containing DESCRIPTION file and returns a structure
// with those fields/properties that are required for further processing.
func ProcessDescription(description DescriptionFile, allPackages *[]PackageDescription) {
cleaned := CleanDescriptionOrPackagesEntry(description.Contents)
cleaned := CleanDescriptionOrPackagesEntry(description.Contents, true)
packageMap := make(map[string]string)
err := yaml.Unmarshal([]byte(cleaned), &packageMap)
checkError(err)
@@ -92,16 +99,24 @@ func ProcessDescription(description DescriptionFile, allPackages *[]PackageDescr
)
}
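
The PACKAGES handling changed in this file (CRLF normalization, splitting into per-package entries on empty lines, and skipping entries with a `Path:` field) can be sketched in isolation as follows. This is a simplified illustration, not the code from this commit: `packageNames` is a hypothetical helper, and `examplePackage` is a made-up entry in the sample input.

```go
package main

import (
	"fmt"
	"strings"
)

// packageNames returns the names of packages listed in PACKAGES-style content,
// skipping entries that carry a "Path:" field. Hypothetical helper for illustration.
func packageNames(content string) []string {
	var names []string
	// Binary Windows repositories use CRLF line endings, so normalize to LF first.
	normalized := strings.ReplaceAll(content, "\r\n", "\n")
	// Entries are separated from each other by an empty line.
	for _, entry := range strings.Split(normalized, "\n\n") {
		name, hasPath := "", false
		for _, line := range strings.Split(entry, "\n") {
			switch {
			case strings.HasPrefix(line, "Package: "):
				name = strings.TrimPrefix(line, "Package: ")
			case strings.HasPrefix(line, "Path:"):
				// The package lives in a subdirectory; skip the whole entry.
				hasPath = true
			}
		}
		if name != "" && !hasPath {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	content := "Package: examplePackage\r\nVersion: 1.0.0\r\n\r\n" +
		"Package: skippedPackage\r\nVersion: 5.0.0\r\nPath: 4.4.0/Recommended\r\n\r\n" +
		"Package: somePackage3\r\nVersion: 0.0.1\r\n"
	fmt.Println(packageNames(content)) // [examplePackage somePackage3]
}
```
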

// CleanDescriptionOrPackagesEntry processes a multiline string representing information about one package
// from PACKAGES file, or the whole contents of DESCRIPTION file. Removes newlines occurring within
// filtered fields (which are predominantly fields containing lists of package dependencies).
// Also removes fields which are not required for further processing.
func CleanDescriptionOrPackagesEntry(description string) string {
// CleanDescriptionOrPackagesEntry processes a multiline string representing information about one
// package from PACKAGES file (if isDescription is false), or the whole contents of DESCRIPTION file
// (if isDescription is true). Removes newlines occurring within filtered fields (which are
// predominantly fields containing lists of package dependencies). Also removes fields which are not
// required for further processing.
func CleanDescriptionOrPackagesEntry(description string, isDescription bool) string {
lines := strings.Split(description, "\n")
filterFields := []string{"Package:", "Version:", "Depends:", "Imports:", "Suggests:", "LinkingTo:"}
outputContent := ""
processingFilteredField := false
for _, line := range lines {
if strings.HasPrefix(line, "Path:") && !isDescription {
// This means that the package is located in a subdirectory mentioned in this field.
// For example "Path: 4.4.0/Recommended" means that the package is located in
// "latest/src/contrib/4.4.0/Recommended/" subdirectory. We want to avoid these kinds of
// packages and prefer to download them from "latest/src/contrib/".
return ""
}
filteredFieldFound := false
// Check if we start processing any of the filtered fields.
for _, field := range filterFields {
11 changes: 9 additions & 2 deletions cmd/root.go
@@ -33,6 +33,7 @@ var logLevel string
var gitHubToken string
var gitLabToken string
var outputRenvLock string
var allowIncompleteRenvLock string

// In case the lists are provided as arrays in YAML configuration file:
var inputPackages []string
@@ -93,13 +94,14 @@ in an renv.lock-compatible file.`,
fmt.Println("inputPackages =", inputPackages)
fmt.Println("inputRepositories =", inputRepositories)
fmt.Println("outputRenvLock =", outputRenvLock)
fmt.Println("allowIncompleteRenvLock =", allowIncompleteRenvLock)

packageDescriptionList, repositoryList, repositoryMap := ParseInput()
packageDescriptionList, repositoryList, repositoryMap, allowedMissingDependencyTypes := ParseInput()
inputDescriptionFiles := DownloadDescriptionFiles(packageDescriptionList, DownloadTextFile)
inputPackages := ParseDescriptionFileList(inputDescriptionFiles)
repositoryPackagesFiles := DownloadPackagesFiles(repositoryList, DownloadTextFile)
packagesFiles := ParsePackagesFiles(repositoryPackagesFiles)
outputPackageList := ConstructOutputPackageList(inputPackages, packagesFiles, repositoryList)
outputPackageList := ConstructOutputPackageList(inputPackages, packagesFiles, repositoryList, allowedMissingDependencyTypes)
renvLock := GenerateRenvLock(outputPackageList, repositoryMap)
writeJSON(outputRenvLock, renvLock)
},
@@ -118,6 +120,10 @@ in an renv.lock-compatible file.`,
"Token to download non-public files from GitLab.")
rootCmd.PersistentFlags().StringVar(&outputRenvLock, "outputRenvLock", "renv.lock",
"File name to save the output renv.lock file.")
rootCmd.PersistentFlags().StringVar(&allowIncompleteRenvLock, "allowIncompleteRenvLock", "",
"Locksmith will fail if any dependencies of the input packages cannot be found in the repositories. "+
"However, it will not fail for comma-separated dependency types listed in this argument, e.g.: "+
"'Imports,Depends,Suggests,LinkingTo'")

// Add version command.
rootCmd.AddCommand(extension.NewVersionCobraCmd())
@@ -173,6 +179,7 @@ func initializeConfig() {
"gitHubToken",
"gitLabToken",
"outputRenvLock",
"allowIncompleteRenvLock",
} {
// If the flag has not been set in newRootCommand() and it has been set in initConfig().
// In other words: if it's not been provided in command line, but has been
9 changes: 9 additions & 0 deletions cmd/testdata/PACKAGES
@@ -13,6 +13,15 @@ License: GPL-3
MD5sum: bbb222333444555666
NeedsCompilation: no

Package: skippedPackage
Version: 5.0.0
Depends: R (>= 3.6.0)
Imports: grDevices, graphics, grid, lattice, stats, utils
License: GPL-3
MD5sum: aaabbbccc999888777
NeedsCompilation: no
Path: 4.4.0/Recommended

Package: somePackage3
Version: 0.0.1
Depends: R (>= 3.1.0)