Nmr bucketing3 #243

Open

lecorguille wants to merge 47 commits into master from nmr_bucketing3

Member

lecorguille commented Jul 26, 2023

Merge the repo https://github.com/workflow4metabolomics/nmr_bucketing to tools-metabolomics

Replace: #187

FOR CONTRIBUTOR:

- I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
- License permits unrestricted use (educational + commercial)
- This PR adds a new tool or tool collection
- This PR updates an existing tool or tool collection
- This PR does something else (explain below)

lecorguille and others added 30 commits

July 26, 2023 17:51


          first commit

a5dc3b5


          add a test data

6bb9413


          planemo tes using conda passed

3bc1583


          README

6b93b24


          add tool_dep for pracma

11081c1


          add .shed.yml

7aec8dc


          add planemo test results ; change version ; add README.md

74ade3c


          planemo shed_test passed

0ea7738


          add info about conda

057db9a


          add travis test

5bbacfd


          Update README.md

f6bb791


          small edit remove IPO mention in Conda

b909b7d


          change output order to propose bucketedData to normalization by default

674a5a7


          Add Metabolomics tag in Galaxy toolshed descriptions.

5204b52


          Update .travis.yml

3f116ed

galaxyproject/planemo#520


          x-axis customization: add chemical shift labels

133f0a6


          fix a bug on an object name

93fee10


          minor

dfd4777


          remove the galaxy dev branch settings

7e495bd


          Update README.rst

ee794a9


          Change of file input type to allow bucketing of preprocessed files

4f8eaf6


          Bucketing depends on file type: if Bruker, read of raw files; if tsv,…

f9f1c1c

… direct bucketing


          Change of x-axis label

dc498c8


          remove r requirement

d5c956b


          Input type change to allow preprocessed files: add of the tsv type in…

5f46387

… the "Input file" parameter


          File input type change to allow preprocessed file to be bucketed: add…

3d7d477

… of a tsv file possibility


          Help change: input type

190a3bd


          Update NmrBucketing_wrapper.R

ebd840a

Fix a syntax error


          Update NmrBucketing_xml.xml

826e69d

remove R deps
bump version
add log change


          Update NmrBucketing_xml.xml

0f24b23

Update changelog

lecorguille and others added 17 commits

July 26, 2023 17:53


          remove library upload

14743d4


          Bucketing modification: if in exclusion zone, integration value = 0; …

960fa82

…graphical representation modification


          SampleMetadata and variableMetadata writing conditionned on the type …

99c57d1

…of input files (if zip = printing; else = no printing)


          Graphical representation: split of the whole spectral window into "zo…

ff53de6

…omed" windows depending on exclusion borders


          Graphical function needed for spectra representation

a56a22c


          update test-data

f8fe31a


          minor change in travis.yml ; removing of deprecated files


          remove the old tool_dependencies

4aa413e


          bump version ; add ChangeLog

fb95c3b


          bump again the version number to 2.0.2

b989550


          Removal of zero's for normalisation

2705e87


          Version change

38e630f


          Enhancement of output information: add of R package version

2b1e2da


          Update .travis.yml

22b42e2


          nmr_bucketin - prepare to merge in tools-metabolomics

4cd940e


          Merge nmr_bucketing repo - step2

74606e1


          nmr_bucketing - NmrBucketing - add zip datatype

bd0bd3c

lecorguille requested a review from mtremblayfr as a code owner

July 26, 2023 16:00

lecorguille mentioned this pull request

Nmr bucketing2 #187

Closed

Member Author

lecorguille commented Jul 26, 2023

ping @mtremblayfr

bgruening reviewed

Contributor

bgruening left a comment

Thanks @lecorguille and @mtremblayfr.

I have added some comments inline. Greetings from the Australian Cofest!

tools/nmr_bucketing/NmrBucketing_xml.xml

		@@ -0,0 +1,302 @@
		<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3">

Contributor

bgruening Jul 27, 2023

Suggested change

      
-    <tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3">
+    <tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3" profile="21.05">

tools/nmr_bucketing/NmrBucketing_xml.xml

		@@ -0,0 +1,302 @@
		<tool id="NmrBucketing" name="NMR_Bucketing" version="2.0.3">

		<description> Bucketing and integration of NMR Bruker raw data</description>

Contributor

bgruening Jul 27, 2023

Suggested change

      
-    <description> Bucketing and integration of NMR Bruker raw data</description>
+    <description>Bucketing and integration of NMR Bruker raw data</description>

tools/nmr_bucketing/NmrBucketing_xml.xml

Comment on lines +10 to +14

+                  <stdio>
+                      <exit_code range="1:" level="fatal" />
+                  </stdio>
+                  <command>

Contributor

bgruening Jul 27, 2023

Suggested change

      
-    <stdio>
-        <exit_code range="1:" level="fatal" />
-    </stdio>
-    <command>
+    <command detect_errors="exit_code">

tools/nmr_bucketing/NmrBucketing_xml.xml

+                  <inputs>
+                      <conditional name="inputs">
+                          <param name="input" type="select" label="Choose your inputs method" >
+                              <option value="zip_file" selected="true">Zip file from your history containing your Bruker directories</option>

Contributor

bgruening Jul 27, 2023

We now have a bruker filetype that can be used: https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/datatypes_conf.xml.sample#L295

tools/nmr_bucketing/NmrBucketing_xml.xml

+                          </when>
+                      </conditional>
+                      <param name="bucket_width" label="Bucket width" type="float" value="0.04" help="Default value is 0.04 ppm"/>

Contributor

bgruening Jul 27, 2023

Would it make sense to add a min/max attribute here and below?

tools/nmr_bucketing/NmrBucketing_xml.xml

+                  <outputs>
+              		<data format="txt" name="logOut" label="${tool.name}_log" />
+                      <data format="tabular" name="sampleOut" label="${tool.name}_sampleMetadata" />

Contributor

bgruening Jul 27, 2023

You could add the column names as metadata, for example see here: https://github.com/galaxyproject/tools-iuc/blob/689a4aeaf70a8e77be8589b1dcceb190f9df626a/tools/nonpareil/nonpareil.xml#L84

tools/nmr_bucketing/NmrBucketing_xml.xml

+                  <tests>
+                      <test>
+                          <param name="inputs|input" value="zip_file" />

Contributor

bgruening Jul 27, 2023

Please use the more explicit way of defining nested structures, e.g.:

Suggested change

      
-    <param name="inputs|input" value="zip_file" />
+    <conditional name="inputs">
+        <param name="input" value="zip_file" />
+    </conditional>

tools/nmr_bucketing/NmrBucketing_xml.xml


		.. class:: infomark

		Authors Marie Tremblay-Franco ([email protected]), Marion Landi ([email protected]) and Franck Giacomoni ([email protected])

Contributor

bgruening Jul 27, 2023

Please use the more explicit schema.org annotations for this: https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-creator-person

The advantage is that Galaxy can nicely render this content and you can access it via the API and such ....

tools/nmr_bucketing/NmrBucketing_xml.xml


		---------------------------------------------------

		Changelog/News

Contributor

bgruening Jul 27, 2023

I would remove the changelog: it's already in the Readme file and it's hard to keep in sync.

tools/nmr_bucketing/repository_dependencies.xml

		@@ -0,0 +1,4 @@
		<?xml version="1.0"?>

Contributor

bgruening Jul 27, 2023

please remove this file

hechth reviewed

Contributor

hechth left a comment

Great to see that the metabolomics tools are being consolidated in one place! I think this is also a great opportunity for some overall cleanup and refreshing the tools!

tools/nmr_bucketing/.shed.yml

+              long_description: 'Part of the W4M project: http://workflow4metabolomics.org'
+              name: nmr_bucketing
+              owner: marie-tremblay-metatoul
+              remote_repository_url: https://github.com/workflow4metabolomics/nmr_bucketing

Contributor

hechth Jul 27, 2023

Shouldn't this be updated to now point to this repository?

tools/nmr_bucketing/.shed.yml

		@@ -0,0 +1,7 @@
		categories: [Metabolomics]

Contributor

hechth Jul 27, 2023

Suggested change

      
-    categories: [Metabolomics]
+    categories: Metabolomics

tools/nmr_bucketing/NmrBucketing_script.R

Comment on lines +1 to +13

+              ################################################################################################
+              # SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES                                      #
+              # User : Galaxy                                                                                #
+              # Original data : --                                                                           #
+              # Starting date : 20-10-2014                                                                   #
+              # Version 1 : 18-12-2014                                                                       #
+              # Version 2 : 07-01-2015                                                                       #
+              # Version 3 : 24-10-2016                                                                       #
+              #                                                                                              #
+              # Input files : modification on october 2016                                                   #
+              #   - Raw bruker files included in user-defined fileName                                      #
+              #   - Preprocessed files (alignment, ...) included in p x n dataframe                          #
+              ################################################################################################

Contributor

hechth Jul 27, 2023

Suggested change

      
-    ################################################################################################
-    # SPECTRA BUCKETING AND INTEGRATION FROM RAW BRUKER FILES                                      #
-    # User : Galaxy                                                                                #
-    # Original data : --                                                                           #
-    # Starting date : 20-10-2014                                                                   #
-    # Version 1 : 18-12-2014                                                                       #
-    # Version 2 : 07-01-2015                                                                       #
-    # Version 3 : 24-10-2016                                                                       #
-    #                                                                                              #
-    # Input files : modification on october 2016                                                   #
-    #   - Raw bruker files included in user-defined fileName                                      #
-    #   - Preprocessed files (alignment, ...) included in p x n dataframe                          #
-    ################################################################################################

tools/nmr_bucketing/NmrBucketing_script.R

Contributor

hechth Jul 27, 2023

Please run some auto formatting tool on this file.

tools/nmr_bucketing/NmrBucketing_script.R

Contributor

hechth Jul 27, 2023

Overall, this could be cleaned up into a minimal CRAN package, since it has minimal dependencies. That would also make the Galaxy tool a bit nicer, keep all the R functions in one place, and make this functionality available to more users and to other workflow engines.

tools/nmr_bucketing/NmrBucketing_script.R

Comment on lines +212 to +236

+                if (fileType=="tsv")
+                {
+                  FileNames <- colnames(fileName)
+                  n <- length(FileNames)
+                  for (i in 1:ncol(fileName))
+                  {
+                    orderedSpectrum <- cbind(as.numeric(rownames(fileName)),fileName[,i])
+                    orderedSpectrum <- orderedSpectrum[order(orderedSpectrum[,1],decreasing=T), ]
+                    truncatedSpectrum <- orderedSpectrum[orderedSpectrum[,1] < leftBorder & orderedSpectrum[,1] > rightBorder, ]
+                    truncatedSpectrum[,1] <- round(truncatedSpectrum[,1],3)
+                    # Bucketing
+                    spectrum.bucket <- NmrBrucker_bucket(truncatedSpectrum)
+                    ppm <- spectrum.bucket[,1]
+                    # spectrum Concatenation
+                    if (i == 1)
+                      bucketedSpectra <- spectrum.bucket
+                    if (i > 1)
+                      bucketedSpectra <- cbind(bucketedSpectra,spectrum.bucket[,2])
+                    colnames(bucketedSpectra)[i+1] <- colnames(fileName)[i]
+                  }
+                }

Contributor

hechth Jul 27, 2023

Same for this - would be cool if this was its own function.
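A minimal sketch of what such a function could look like, assuming the tsv input is a data frame with ppm values as row names and one column per spectrum. The name `bucket_tsv_spectra` and the `bucket_fun` argument are hypothetical; `bucket_fun` stands in for the script's own bucketing routine, whose body is not shown in this thread.

```r
# Hypothetical refactoring of the tsv branch into a standalone function.
# `bucket_fun` is an assumption: it must accept a two-column matrix
# (ppm, intensity) and return a two-column matrix of the same layout.
bucket_tsv_spectra <- function(spectra, left_border, right_border, bucket_fun) {
  stopifnot(ncol(spectra) > 0, left_border > right_border)
  ppm_axis <- as.numeric(rownames(spectra))
  bucketed <- NULL
  for (i in seq_len(ncol(spectra))) {
    spectrum <- cbind(ppm_axis, spectra[, i])
    # Sort by decreasing chemical shift, as in the original loop.
    spectrum <- spectrum[order(spectrum[, 1], decreasing = TRUE), , drop = FALSE]
    # Keep only the points strictly inside the spectral window.
    keep <- spectrum[, 1] < left_border & spectrum[, 1] > right_border
    spectrum <- spectrum[keep, , drop = FALSE]
    spectrum[, 1] <- round(spectrum[, 1], 3)
    spectrum_bucket <- bucket_fun(spectrum)
    # The first spectrum provides the ppm column; every spectrum adds one column.
    if (is.null(bucketed)) {
      bucketed <- spectrum_bucket[, 1, drop = FALSE]
    }
    bucketed <- cbind(bucketed, spectrum_bucket[, 2])
  }
  colnames(bucketed) <- c("ppm", colnames(spectra))
  bucketed
}

# Toy input with an identity "bucketing" function, for illustration only.
spectra <- data.frame(s1 = c(1, 2, 3, 4), s2 = c(5, 6, 7, 8),
                      row.names = c("0.5", "1.5", "2.5", "3.5"))
result <- bucket_tsv_spectra(spectra, left_border = 3, right_border = 1,
                             bucket_fun = function(s) s)
print(result)
```

Passing the bucketing routine as an argument keeps the sketch self-contained and would also make the loop easy to cover with a small unit test.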

tools/nmr_bucketing/NmrBucketing_wrapper.R

Comment on lines +9 to +29

+              runExampleL <- FALSE
+              if(runExampleL) {
+              ##------------------------------
+              ## Example of arguments
+              ##------------------------------
+              argLs <- list(StudyDir = "Tlse_BPASourisCerveau",
+                            upper = "10.0",
+                            lower = "0.50",
+                            bucket.width = "0.01",
+                            exclusion = "TRUE",
+                            exclusion.zone = list(c(6.5,4.5)),
+                            graph="Overlay")
+              argLs <- c(argLs,
+                         list(dataMatrixOut = paste(directory,"_NmrBucketing_dataMatrix.tsv",sep=""),
+                              sampleMetadataOut = paste(directory,"_NmrBucketing_sampleMetadata.tsv",sep=""),
+                              variableMetadataOut = paste(directory,"_NmrBucketing_variableMetadata.tsv",sep=""),
+                              graphOut = paste(directory,"_NmrBucketing_graph.pdf",sep=""),
+                              logOut = paste(directory,"_NmrBucketing_log.txt",sep="")))
+              }

Contributor

hechth Jul 27, 2023

Suggested change

      
-    runExampleL <- FALSE
-    if(runExampleL) {
-    ##------------------------------
-    ## Example of arguments
-    ##------------------------------
-    argLs <- list(StudyDir = "Tlse_BPASourisCerveau",
-                  upper = "10.0",
-                  lower = "0.50",
-                  bucket.width = "0.01",
-                  exclusion = "TRUE",
-                  exclusion.zone = list(c(6.5,4.5)),
-                  graph="Overlay")
-    argLs <- c(argLs,
-               list(dataMatrixOut = paste(directory,"_NmrBucketing_dataMatrix.tsv",sep=""),
-                    sampleMetadataOut = paste(directory,"_NmrBucketing_sampleMetadata.tsv",sep=""),
-                    variableMetadataOut = paste(directory,"_NmrBucketing_variableMetadata.tsv",sep=""),
-                    graphOut = paste(directory,"_NmrBucketing_graph.pdf",sep=""),
-                    logOut = paste(directory,"_NmrBucketing_log.txt",sep="")))
-    }

This should be covered in a test case ideally.

tools/nmr_bucketing/NmrBucketing_wrapper.R

Comment on lines +41 to +55

+              # For parseCommandArgs function
+              library(batch)
+              # For cumtrapz function
+              library(pracma)
+              # R script call
+              source_local <- function(fname)
+              {
+              	argv <- commandArgs(trailingOnly = FALSE)
+              	base_dir <- dirname(substring(argv[grep("--file=", argv)], 8))
+              	source(paste(base_dir, fname, sep="/"))
+              }
+              #Import the different functions
+              source_local("NmrBucketing_script.R")
+              source_local("DrawSpec.R")

Contributor

hechth Jul 27, 2023

I think it would be easier to define one function in a file which takes all arguments, then call source on that file in the Galaxy wrapper and call that function. This would remove the entire dependency and command-line passing logic. Another option would be to use configfiles.
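As a rough illustration of the first option, the sketch below shows a single entry-point function that receives every setting as an ordinary R argument. The function name, argument names, and the validation rules are assumptions made for this example; only the 0.04 ppm default bucket width and the `c(6.5, 4.5)` exclusion zone come from the files under review.

```r
# Hypothetical entry point (function and argument names are assumptions).
# All settings arrive as ordinary R arguments, so no command-line parsing
# package is needed in this file.
run_nmr_bucketing <- function(input_path,
                              file_type = c("zip", "tsv"),
                              left_border,
                              right_border,
                              bucket_width = 0.04,
                              exclusion_zones = list()) {
  file_type <- match.arg(file_type)
  stopifnot(
    is.character(input_path), length(input_path) == 1,
    is.numeric(left_border), is.numeric(right_border),
    left_border > right_border,
    is.numeric(bucket_width), bucket_width > 0
  )
  # Each exclusion zone is a (left, right) pair in ppm, with left > right.
  for (zone in exclusion_zones) {
    stopifnot(is.numeric(zone), length(zone) == 2, zone[1] > zone[2])
  }
  # The real bucketing and plotting calls would go here; this sketch only
  # returns the validated settings.
  list(
    input_path = input_path,
    file_type = file_type,
    left_border = left_border,
    right_border = right_border,
    bucket_width = bucket_width,
    exclusion_zones = exclusion_zones
  )
}

settings <- run_nmr_bucketing(
  input_path = "spectra.tsv",
  file_type = "tsv",
  left_border = 10,
  right_border = 0.5,
  exclusion_zones = list(c(6.5, 4.5))
)
str(settings)
```

The tool's `<command>` could then `source()` this file and call the function directly, for example through `Rscript -e`; the exact invocation is left open here.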

tools/nmr_bucketing/NmrBucketing_wrapper.R

Comment on lines +57 to +134

+              ##------------------------------
+              ## Errors ?????????????????????
+              ##------------------------------
+              ##------------------------------
+              ## Constants
+              ##------------------------------
+              topEnvC <- environment()
+              flagC <- "\n"
+              ##------------------------------
+              ## Script
+              ##------------------------------
+              if(!runExampleL)
+                  argLs <- parseCommandArgs(evaluate=FALSE)
+              ## Parameters Loading
+              ##-------------------
+                # Inputs
+              if (!is.null(argLs[["zipfile"]])){
+              	fileType="zip"
+              	zipfile= argLs[["zipfile"]]
+              	directory=unzip(zipfile, list=F)
+              	directory=paste(getwd(),strsplit(directory[1],"/")[[1]][2],sep="/")
+              } else if (!is.null(argLs[["tsvfile"]])){
+              	fileType="tsv"
+              	directory <- read.table(argLs[["tsvfile"]],check.names=FALSE,header=TRUE,sep="\t")
+              }
+              leftBorder <- argLs[["left_border"]]
+              rightBorder <- argLs[["right_border"]]
+              bucketSize <- argLs[["bucket_width"]]
+              exclusionZones <- argLs[["zone_exclusion_choices.choice"]]
+              exclusionZonesBorders <- NULL
+              if (!is.null(argLs$zone_exclusion_left))
+              {
+                 for(i in which(names(argLs)=="zone_exclusion_left"))
+                 {
+                   exclusionZonesBorders <- c(exclusionZonesBorders,list(c(argLs[[i]],argLs[[i+1]])))
+                 }
+              }
+              graphique <- argLs[["graphType"]]
+                # Outputs
+              nomGraphe <- argLs[["graphOut"]]
+              dataMatrixOut <- argLs[["dataMatrixOut"]]
+              logFile <- argLs[["logOut"]]
+              if (fileType=="zip")
+              {
+                sampleMetadataOut <- argLs[["sampleOut"]]
+                variableMetadataOut <- argLs[["variableOut"]]
+              }
+              ## Checking R packages
+              ##--------------------
+              sink(logFile)
+              cat("\t PACKAGE INFO \n")
+              pkgs=c("batch","pracma")
+              for(pkg in pkgs) {
+                  suppressPackageStartupMessages( stopifnot( library(pkg, quietly=TRUE, logical.return=TRUE, character.only=TRUE)))
+                  cat(pkg,"\t",as.character(packageVersion(pkg)),"\n",sep="")
+              }
+              cat("\n")
+              ## Checking arguments
+              ##-------------------
+              error.stock <- "\n"
+              if(length(error.stock) > 1)
+                stop(error.stock)

Contributor

hechth Jul 27, 2023

This whole section could be removed once the command-line parsing is gone, i.e. when following the approach recommended above.

tools/nmr_bucketing/NmrBucketing_wrapper.R

+              data_sample <- outputs[[2]]
+              data_variable <- outputs[[3]]
+              ppm <- outputs[[4]]
+              ppm <- round(ppm,2)

Contributor

hechth Jul 27, 2023

Maybe this should be a tool parameter?
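One reason to expose the rounding is that a fixed two-decimal label can merge buckets once the bucket width drops below 0.01 ppm. A small sketch, assuming a hypothetical helper `round_ppm` whose `digits` argument would be fed from a new tool parameter; the ppm values are illustrative only:

```r
# Hypothetical helper: `digits` would come from a new tool parameter.
round_ppm <- function(ppm, digits = 2L) {
  stopifnot(is.numeric(ppm), is.numeric(digits), digits >= 0)
  round(ppm, digits)
}

# Bucket centres 0.002 ppm apart (illustrative values).
ppm <- c(9.998, 9.996, 9.994)
anyDuplicated(round_ppm(ppm)) > 0              # two decimals: labels collide
anyDuplicated(round_ppm(ppm, digits = 3)) > 0  # three decimals: labels stay distinct
```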
