-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nmr bucketing3 #243
base: master
Are you sure you want to change the base?
Nmr bucketing3 #243
Conversation
… direct bucketing
… the "Input file" parameter
… of a tsv file possibility
Fix a syntax error
remove R deps bump version add log change
Update changelog
…graphical representation modification
…of input files (if zip = printing; else = no printing)
…omed" windows depending on exclusion borders
ping @mtremblayfr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lecorguille and @mtremblayfr.
I have added some comments inline. Greeting from the Australian Cofest!
@@ -0,0 +1,302 @@ | |||
<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3"> | |
<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3" profile="21.05"> | |
@@ -0,0 +1,302 @@ | |||
<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3"> | |||
|
|||
<description> Bucketing and integration of NMR Bruker raw data</description> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<description> Bucketing and integration of NMR Bruker raw data</description> | |
<description>Bucketing and integration of NMR Bruker raw data</description> | |
<stdio> | ||
<exit_code range="1:" level="fatal" /> | ||
</stdio> | ||
|
||
<command> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<stdio> | |
<exit_code range="1:" level="fatal" /> | |
</stdio> | |
<command> | |
<command detect_errors="exit_code"> | |
<inputs> | ||
<conditional name="inputs"> | ||
<param name="input" type="select" label="Choose your inputs method" > | ||
<option value="zip_file" selected="true">Zip file from your history containing your Bruker directories</option> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have now a bruker filetype that can be used. https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/datatypes_conf.xml.sample#L295
</when> | ||
</conditional> | ||
|
||
<param name="bucket_width" label="Bucket width" type="float" value="0.04" help="Default value is 0.04 ppm"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to add a min/max attribute here and below?
|
||
<outputs> | ||
<data format="txt" name="logOut" label="${tool.name}_log" /> | ||
<data format="tabular" name="sampleOut" label="${tool.name}_sampleMetadata" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add the column names as metadata, for example see here: https://github.com/galaxyproject/tools-iuc/blob/689a4aeaf70a8e77be8589b1dcceb190f9df626a/tools/nonpareil/nonpareil.xml#L84
|
||
<tests> | ||
<test> | ||
<param name="inputs|input" value="zip_file" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use the more explicit way of defining nested structures. E.g.
<param name="inputs|input" value="zip_file" /> | |
<conditional name="inputs"> | |
<param name="input" value="zip_file" /> | |
</conditional> |
|
||
.. class:: infomark | ||
|
||
**Authors** Marie Tremblay-Franco ([email protected]), Marion Landi ([email protected]) and Franck Giacomoni ([email protected]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the more explicit schema.org annotations for this: https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-creator-person
The advantage is that Galaxy can nicely render this content and you can access it via the API and such ....
|
||
--------------------------------------------------- | ||
|
||
Changelog/News |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the changelog its already in the Readme file and its hard to keep it in sync.
@@ -0,0 +1,4 @@ | |||
<?xml version="1.0"?> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see that the metabolomics tools are being consolidated in one place! I think this is also a great opportunity for some overall cleanup and refreshing the tools!
long_description: 'Part of the W4M project: http://workflow4metabolomics.org' | ||
name: nmr_bucketing | ||
owner: marie-tremblay-metatoul | ||
remote_repository_url: https://github.com/workflow4metabolomics/nmr_bucketing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be updated to now point to this repository?
@@ -0,0 +1,7 @@ | |||
categories: [Metabolomics] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
categories: [Metabolomics] | |
categories: Metabolomics |
################################################################################################ | ||
# SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES # | ||
# User : Galaxy # | ||
# Original data : -- # | ||
# Starting date : 20-10-2014 # | ||
# Version 1 : 18-12-2014 # | ||
# Version 2 : 07-01-2015 # | ||
# Version 3 : 24-10-2016 # | ||
# # | ||
# Input files : modification on october 2016 # | ||
# - Raw bruker files included in user-defined fileName # | ||
# - Preprocessed files (alignment, ...) included in p x n dataframe # | ||
################################################################################################ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
################################################################################################ | |
# SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES # | |
# User : Galaxy # | |
# Original data : -- # | |
# Starting date : 20-10-2014 # | |
# Version 1 : 18-12-2014 # | |
# Version 2 : 07-01-2015 # | |
# Version 3 : 24-10-2016 # | |
# # | |
# Input files : modification on october 2016 # | |
# - Raw bruker files included in user-defined fileName # | |
# - Preprocessed files (alignment, ...) included in p x n dataframe # | |
################################################################################################ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please run some auto formatting tool on this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could overall be cleaned up into some minimal CRAN package as it has minimal dependencies etc. which would also mkae the Galaxy tool a bit nicer and include all the R functions. That would also make this functionality available to more users or other workflow engines.
if (fileType=="tsv") | ||
{ | ||
FileNames <- colnames(fileName) | ||
n <- length(FileNames) | ||
|
||
for (i in 1:ncol(fileName)) | ||
{ | ||
orderedSpectrum <- cbind(as.numeric(rownames(fileName)),fileName[,i]) | ||
orderedSpectrum <- orderedSpectrum[order(orderedSpectrum[,1],decreasing=T), ] | ||
|
||
truncatedSpectrum <- orderedSpectrum[orderedSpectrum[,1] < leftBorder & orderedSpectrum[,1] > rightBorder, ] | ||
truncatedSpectrum[,1] <- round(truncatedSpectrum[,1],3) | ||
|
||
# Bucketing | ||
spectrum.bucket <- NmrBrucker_bucket(truncatedSpectrum) | ||
ppm <- spectrum.bucket[,1] | ||
|
||
# spectrum Concatenation | ||
if (i == 1) | ||
bucketedSpectra <- spectrum.bucket | ||
if (i > 1) | ||
bucketedSpectra <- cbind(bucketedSpectra,spectrum.bucket[,2]) | ||
colnames(bucketedSpectra)[i+1] <- colnames(fileName)[i] | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for this - would be cool if this was its own function.
runExampleL <- FALSE | ||
|
||
if(runExampleL) { | ||
##------------------------------ | ||
## Example of arguments | ||
##------------------------------ | ||
argLs <- list(StudyDir = "Tlse_BPASourisCerveau", | ||
upper = "10.0", | ||
lower = "0.50", | ||
bucket.width = "0.01", | ||
exclusion = "TRUE", | ||
exclusion.zone = list(c(6.5,4.5)), | ||
graph="Overlay") | ||
|
||
argLs <- c(argLs, | ||
list(dataMatrixOut = paste(directory,"_NmrBucketing_dataMatrix.tsv",sep=""), | ||
sampleMetadataOut = paste(directory,"_NmrBucketing_sampleMetadata.tsv",sep=""), | ||
variableMetadataOut = paste(directory,"_NmrBucketing_variableMetadata.tsv",sep=""), | ||
graphOut = paste(directory,"_NmrBucketing_graph.pdf",sep=""), | ||
logOut = paste(directory,"_NmrBucketing_log.txt",sep=""))) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runExampleL <- FALSE | |
if(runExampleL) { | |
##------------------------------ | |
## Example of arguments | |
##------------------------------ | |
argLs <- list(StudyDir = "Tlse_BPASourisCerveau", | |
upper = "10.0", | |
lower = "0.50", | |
bucket.width = "0.01", | |
exclusion = "TRUE", | |
exclusion.zone = list(c(6.5,4.5)), | |
graph="Overlay") | |
argLs <- c(argLs, | |
list(dataMatrixOut = paste(directory,"_NmrBucketing_dataMatrix.tsv",sep=""), | |
sampleMetadataOut = paste(directory,"_NmrBucketing_sampleMetadata.tsv",sep=""), | |
variableMetadataOut = paste(directory,"_NmrBucketing_variableMetadata.tsv",sep=""), | |
graphOut = paste(directory,"_NmrBucketing_graph.pdf",sep=""), | |
logOut = paste(directory,"_NmrBucketing_log.txt",sep=""))) | |
} |
This should be covered in a test case ideally.
# For parseCommandArgs function | ||
library(batch) | ||
# For cumtrapz function | ||
library(pracma) | ||
|
||
# R script call | ||
source_local <- function(fname) | ||
{ | ||
argv <- commandArgs(trailingOnly = FALSE) | ||
base_dir <- dirname(substring(argv[grep("--file=", argv)], 8)) | ||
source(paste(base_dir, fname, sep="/")) | ||
} | ||
#Import the different functions | ||
source_local("NmrBucketing_script.R") | ||
source_local("DrawSpec.R") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be easier to define one function in a file which takes all arguments, then call source on that file in the galaxy wrapper and then call that function. This would remove the entire dependency and command line passing logic. Another options would be to use configfiles.
##------------------------------ | ||
## Errors ????????????????????? | ||
##------------------------------ | ||
|
||
|
||
##------------------------------ | ||
## Constants | ||
##------------------------------ | ||
topEnvC <- environment() | ||
flagC <- "\n" | ||
|
||
|
||
##------------------------------ | ||
## Script | ||
##------------------------------ | ||
if(!runExampleL) | ||
argLs <- parseCommandArgs(evaluate=FALSE) | ||
|
||
|
||
## Parameters Loading | ||
##------------------- | ||
# Inputs | ||
if (!is.null(argLs[["zipfile"]])){ | ||
fileType="zip" | ||
zipfile= argLs[["zipfile"]] | ||
directory=unzip(zipfile, list=F) | ||
directory=paste(getwd(),strsplit(directory[1],"/")[[1]][2],sep="/") | ||
} else if (!is.null(argLs[["tsvfile"]])){ | ||
fileType="tsv" | ||
directory <- read.table(argLs[["tsvfile"]],check.names=FALSE,header=TRUE,sep="\t") | ||
} | ||
|
||
leftBorder <- argLs[["left_border"]] | ||
rightBorder <- argLs[["right_border"]] | ||
bucketSize <- argLs[["bucket_width"]] | ||
exclusionZones <- argLs[["zone_exclusion_choices.choice"]] | ||
|
||
exclusionZonesBorders <- NULL | ||
if (!is.null(argLs$zone_exclusion_left)) | ||
{ | ||
for(i in which(names(argLs)=="zone_exclusion_left")) | ||
{ | ||
exclusionZonesBorders <- c(exclusionZonesBorders,list(c(argLs[[i]],argLs[[i+1]]))) | ||
} | ||
} | ||
|
||
graphique <- argLs[["graphType"]] | ||
|
||
# Outputs | ||
nomGraphe <- argLs[["graphOut"]] | ||
dataMatrixOut <- argLs[["dataMatrixOut"]] | ||
logFile <- argLs[["logOut"]] | ||
if (fileType=="zip") | ||
{ | ||
sampleMetadataOut <- argLs[["sampleOut"]] | ||
variableMetadataOut <- argLs[["variableOut"]] | ||
} | ||
|
||
|
||
## Checking R packages | ||
##-------------------- | ||
sink(logFile) | ||
cat("\t PACKAGE INFO \n") | ||
pkgs=c("batch","pracma") | ||
for(pkg in pkgs) { | ||
suppressPackageStartupMessages( stopifnot( library(pkg, quietly=TRUE, logical.return=TRUE, character.only=TRUE))) | ||
cat(pkg,"\t",as.character(packageVersion(pkg)),"\n",sep="") | ||
} | ||
cat("\n") | ||
|
||
|
||
## Checking arguments | ||
##------------------- | ||
error.stock <- "\n" | ||
|
||
if(length(error.stock) > 1) | ||
stop(error.stock) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole section could be removed without the command line parsing section and if following the approach recommended above.
data_sample <- outputs[[2]] | ||
data_variable <- outputs[[3]] | ||
ppm <- outputs[[4]] | ||
ppm <- round(ppm,2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be a tool parameter?
Merge the repo https://github.com/workflow4metabolomics/nmr_bucketing to tools-metabolomics
Replace: #187
FOR CONTRIBUTOR: