Validate YML contrasts #436

atrigila · 2025-02-14T13:38:36Z

This new feature is described in #371.
I am only merging it into dev, as stated in #411 (comment).
I had to adjust which parameter is taken as input (--contrasts_yml instead of --contrasts), but apart from those minor changes, this whole feature was developed by @alanmmobbs93 in #404.

atrigila · 2025-02-18T16:55:14Z

Thank you for your feedback, @suzannejin @pinin4fjords. I'll talk to Alan, who developed this module, about some of the more specific questions on his design. I will also see how I can adapt this to the template.

atrigila · 2025-02-20T18:17:38Z

Hi @suzannejin and @pinin4fjords, I have migrated the module to the template structure and answered the questions. Please feel free to take another look if you can :)

pinin4fjords

Made a few comments. General points:

We need to make sure output files and md5sums are not changed unless there's a good reason.
The R code is a bit scrappy; we should try to make it nicer and more idiomatic. I've attached an AI-assisted version here which I think is closer to the mark and cleaner, though it might need some final debugging.

foo.r.zip

pinin4fjords · 2025-02-24T10:00:36Z

modules/local/validatemodel/tests/nextflow.config

@@ -0,0 +1,5 @@
+process {
+    withName: 'VALIDATE_YML_MODEL' {
+        ext.args = params.module_args


Is there even a module_args parameter? Probably wouldn't make sense if there was.

Changed the test params so that they more accurately reflect the params used in the pipeline.

pinin4fjords

Another comment, but in making it I realised I'd already thought about it when making the VALIDATEFOMCOMPONENTS script.

Sorry to be slow in thinking of this (other things going on), but is there a reason you didn't simply extend that other validation process? Having two validation steps seems confusing for the user/maintainer.

Work was already done to enable yaml contrasts there: pinin4fjords/shinyngs#68

This script uses validation functions built into the parsing of the various objects themselves, and that will probably be a neater way to go rather than duplicating validation logic here.

Sorry to be a pain. I can help with this when I get some time, and we can bake the work you've done here into that process. We probably just need to extend the existing logic that validates contrasts here: https://github.com/pinin4fjords/shinyngs/blob/931d9c39c1f2200c66cc628e1ea8d68e970262ef/R/accessory.R#L908

pinin4fjords · 2025-02-24T16:18:52Z

workflows/differentialabundance.nf

@@ -193,6 +195,15 @@ workflow DIFFERENTIALABUNDANCE {
            }
            .flatten()
            .unique() // Uniquify to keep each contrast variable only once (in case it exists in multiple lines for blocking etc.)
+
+        VALIDATE_YML_MODEL (


Sorry, one last thought.

The issue with doing it this way is that ch_input etc. will proceed through other processes even if there are problems that eventually get flagged by this process. What we really want is for this to stop things before they get that far.

You can use a dummy join to make that happen; I do that in this subworkflow, for example. Or you can just have VALIDATE_YML_MODEL spit the matrix and contrasts back out again.
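
A minimal sketch of the dummy-join idea, not the code from that subworkflow: VALIDATE_STANDIN, its `ok` output, and the literal channel contents are all hypothetical placeholders, and the real module would run the R validation script rather than a shell test. Because `combine` cannot emit anything until the validation process has emitted its value, nothing that consumes `ch_input_gated` starts before validation has passed.

```nextflow
// Hypothetical stand-in for the validation step; the real module would run the R script.
process VALIDATE_STANDIN {
    input:
    val contrasts

    output:
    val contrasts, emit: ok   // only emitted if the script exits with status 0

    script:
    """
    test -n "${contrasts}"
    """
}

workflow {
    // Literal values stand in for the real sample sheet and contrasts channels (assumption).
    ch_input     = Channel.of([ [id: 'study'], 'samplesheet.csv' ])
    ch_contrasts = Channel.of('contrasts.yml')

    VALIDATE_STANDIN(ch_contrasts)

    // Dummy join: pair every input item with the validation signal, then drop the signal.
    ch_input_gated = ch_input
        .combine(VALIDATE_STANDIN.out.ok)
        .map { meta, samplesheet, ok -> [ meta, samplesheet ] }

    // Downstream processes would consume ch_input_gated instead of ch_input.
    ch_input_gated.view()
}
```

The other option mentioned above, having the validation process re-emit its inputs, avoids the extra operators but adds outputs to the module.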

nschcolnicov · 2025-02-24T17:58:13Z

@pinin4fjords I can answer that. We decided to create a new script instead of updating the shinyngs validatefomcomponents module because maintaining a separate tool just to run an R script adds unnecessary complexity and significantly slows down development. Since this process can be handled effectively with a standalone R script, it makes more sense to simplify our workflow rather than having to update and maintain an entire tool every time we need to make changes. Using an R script and module streamlines development, and if we ever need more robustness, we have the option to integrate it into nf-core.

pinin4fjords · 2025-02-25T08:57:25Z

OK, but I disagree with that call.

I'm not asking for the creation of a new script within shinyngs. The fact is, validation functionality is already present and the nf-core module already exists. All this amounts to is some additional logic for that existing validation function. It doesn't make sense to create a new module that partially replicates that logic and adds some new logic of its own.

This code needs to go into the existing shinyngs function, as stated above. That function already has access to the sample sheet and contrasts, so I don't see that being all that difficult. I will assist on PRs and releases to make that happen.

grst · 2025-02-25T09:01:55Z

I'm wondering if it could be a provisional solution to have it as a separate script in the differentialabundance pipeline? We are planning several more iterations on this in #429, #377 and #386, and going through shinyngs releases for each step is really, really cumbersome. I'm all for putting things in their proper place in the end, but this process is really slowing down development for questionable benefit.

pinin4fjords · 2025-02-25T09:28:32Z

The benefits aren't questionable for me; they're tangible, and I did this for a reason. This was a design decision I made when finalising the first versions of this pipeline, as we transitioned from a development to a production mindset. It allows us to share e.g. parsing logic between processes, and keeps scientific logic out of the workflow, which is then just doing orchestration. We also try to avoid local components, as you will have noted, and this facilitates that.

I'm sorry, but I'm not prepared to reverse those benefits just to shorten the development loop. This sort of overhead is not atypical when doing further development on tools in a production state.

But as I say I will help with the shinyngs PRs and releases, as I've done previously.

alanmmobbs93 and others added 30 commits December 13, 2024 19:45

init script

f69703b

add function to check samplesheet

352a6fc

add check model code

afa2472

POC ready

9e42bdd

change function structure

c2f666f

fix bug in module

6614f4d

update yml structure input limited to contrasts only

d9d18c5

Merge branch 'dev' of https://github.com/nf-core/differentialabundance …

887332d

…into issue_370

Updated changelog

d90a576

Update ci.yml and rever changes in nf-core module

34ca6ec

Merge pull request nf-core#382 from nf-core/issue_370

ec554b2

POC contrasts csv -> yaml

Merge branch 'dev_tmp' into newfeature_validate_model

3834865

Update tabulartogseachip

bc3859a

update modules.config

5cc6fd0

Updated snaps

ae39a00

Merge pull request nf-core#412 from nf-core/update_tabulartogseachip

8597c2b

Update tabulartogseachip

Add condition to execute the module and update snaps

6dafd95

Merge branch 'dev_tmp' into newfeature_validate_model

2ee77d0

Merge branch 'dev_tmp' into newfeature_validate_model

d40becf

Merge branch 'newfeature_validate_model' of https://github.com/alanmm…

89cb7bc

…obbs93/differentialabundance into newfeature_validate_model

address PR comments

17e1e37

update snaps

6ed0793

chore: remove dev_tmp from test branches

964bef1

chore: rename validate_model to validate_yml_model

d6885b7

Merge branch 'dev' into feature-validate-model

4bb3a23

chore: separate csv vs yml test

11d558a

refact: move yml validation to params.contrasts_yml

5ab2976

chore: change local for nf-core version

cc3d7a0

fix: update input contrasts file to contrasts_yml params

801f1c5

test: update tests

b0a6fda

atrigila added 7 commits February 19, 2025 17:11

refact: adapt R code to template

bc920c3

refact: adapt to template structure

edc56f1

test: update test and snapshot

f2b97ed

Merge branch 'dev' into feature-validate-model

8977b86

test: add gsea seed and update snapshot

a773933

Merge branch 'dev' into feature-validate-model

2ab6388

test: update snapshots after dev merge

9043cd3

atrigila requested review from suzannejin and pinin4fjords February 20, 2025 18:17

Merge branch 'dev' into feature-validate-model

82ff806

pinin4fjords requested changes Feb 24, 2025

atrigila added 8 commits February 24, 2025 13:31

refact: change code to roxygen style

a946e08

refact: pass more meaningful params

c3d5a6e

docs: add meta.yml to local module

b4aec27

test: update snapshots

84fe7a2

refact: replace module_args for sample_id_col

3b7c470

fix: fix linting issues

3fa6012

docs: remove unused sections, add new info

0be1720

revert: environment.yml equal to nf-core module

4891c6c

atrigila requested a review from pinin4fjords February 24, 2025 15:34

pinin4fjords reviewed Feb 24, 2025

Update modules/local/validatemodel/templates/validate_model.R

4e3c82b

Co-authored-by: Jonathan Manning <[email protected]>

pinin4fjords requested changes Feb 24, 2025

atrigila mentioned this pull request Feb 25, 2025

feat: add new design matrix checks pinin4fjords/shinyngs#70

Open
