From 536ff8c09689250081c976bd2b48bcff13e4921c Mon Sep 17 00:00:00 2001 From: Fattigman Date: Tue, 15 Aug 2023 10:29:00 +0200 Subject: [PATCH 01/25] moved workflow doc into specific folder --- docs/{ => sub-workflows}/scarc-10x.md | 0 docs/{ => sub-workflows}/scatac-10x.md | 0 docs/{ => sub-workflows}/scciteseq-10x.md | 0 docs/{ => sub-workflows}/scflex-10x.md | 0 docs/{ => sub-workflows}/scmulti-10x.md | 0 docs/{ => sub-workflows}/scrna-10x.md | 0 docs/{ => sub-workflows}/scvisium-10x.md | 0 7 files changed, 0 insertions(+), 0 deletions(-) rename docs/{ => sub-workflows}/scarc-10x.md (100%) rename docs/{ => sub-workflows}/scatac-10x.md (100%) rename docs/{ => sub-workflows}/scciteseq-10x.md (100%) rename docs/{ => sub-workflows}/scflex-10x.md (100%) rename docs/{ => sub-workflows}/scmulti-10x.md (100%) rename docs/{ => sub-workflows}/scrna-10x.md (100%) rename docs/{ => sub-workflows}/scvisium-10x.md (100%) diff --git a/docs/scarc-10x.md b/docs/sub-workflows/scarc-10x.md similarity index 100% rename from docs/scarc-10x.md rename to docs/sub-workflows/scarc-10x.md diff --git a/docs/scatac-10x.md b/docs/sub-workflows/scatac-10x.md similarity index 100% rename from docs/scatac-10x.md rename to docs/sub-workflows/scatac-10x.md diff --git a/docs/scciteseq-10x.md b/docs/sub-workflows/scciteseq-10x.md similarity index 100% rename from docs/scciteseq-10x.md rename to docs/sub-workflows/scciteseq-10x.md diff --git a/docs/scflex-10x.md b/docs/sub-workflows/scflex-10x.md similarity index 100% rename from docs/scflex-10x.md rename to docs/sub-workflows/scflex-10x.md diff --git a/docs/scmulti-10x.md b/docs/sub-workflows/scmulti-10x.md similarity index 100% rename from docs/scmulti-10x.md rename to docs/sub-workflows/scmulti-10x.md diff --git a/docs/scrna-10x.md b/docs/sub-workflows/scrna-10x.md similarity index 100% rename from docs/scrna-10x.md rename to docs/sub-workflows/scrna-10x.md diff --git a/docs/scvisium-10x.md b/docs/sub-workflows/scvisium-10x.md similarity index 100% 
rename from docs/scvisium-10x.md rename to docs/sub-workflows/scvisium-10x.md From d413a0825a8b2aee2770eb8ab57d4638d97803ab Mon Sep 17 00:00:00 2001 From: Fattigman Date: Tue, 15 Aug 2023 10:45:07 +0200 Subject: [PATCH 02/25] Added documentation for some workflows --- docs/10X-genomics/citeseq.md | 91 ++++++++++++++++++++++++++++++++++++ docs/10X-genomics/flex.md | 50 ++++++++++++++++++++ docs/10X-genomics/vdj.md | 32 +++++++++++++ docs/10X-genomics/visium.md | 48 +++++++++++++++++++ 4 files changed, 221 insertions(+) create mode 100644 docs/10X-genomics/citeseq.md create mode 100644 docs/10X-genomics/flex.md create mode 100644 docs/10X-genomics/vdj.md create mode 100644 docs/10X-genomics/visium.md diff --git a/docs/10X-genomics/citeseq.md b/docs/10X-genomics/citeseq.md new file mode 100644 index 0000000..a7b0d7b --- /dev/null +++ b/docs/10X-genomics/citeseq.md @@ -0,0 +1,91 @@ +# Introduction +This guide describes how to process data from the CiteSeq protocol. The protocol uses antibodies to tag your cells, which can help the researcher detect things such as cell surface proteins, or pool multiple cell types together (Antibody Derived Tags (ADTs) or HashTag Oligonucleotides (HTOs), respectively). + +At its most complicated it consists of 3 modalities (Gene Expression + ADTs + HTOs). This page aims to describe how we process it here at CTG. + +## Feature reference (ADT) + +The feature reference should be provided by the lab which did the experiments. How a feature reference is constructed is [described here.](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis) + +The feature reference is the file which describes the following things for each ADT: +* id : Unique ID used to track feature counts. May only include ASCII characters and must not use whitespace, slash, quote, or comma characters.
Each ID must be unique and must not collide with a gene identifier from the transcriptome +* name : Human-readable name for this feature. May only include ASCII characters and must not use whitespace, slash, quote, or comma characters. This name will be displayed in the Loupe Browser Active Feature list +* read : Specifies which RNA sequencing read contains the Feature Barcode sequence. Must be R1 or R2. Note: in most cases R2 is the correct read +* pattern : Specifies how to extract the Feature Barcode sequence from the read +* sequence : Nucleotide barcode sequence associated with this feature. E.g., antibody barcode or sgRNA protospacer sequence. +* feature_type : Type of the feature. See the Library/Feature Types section for details on the allowed values for this field. FASTQ data specified in the Library CSV file with a library_type that matches the feature_type will be scanned for occurrences of this feature. Each feature type in the feature reference must match a library_type entry in the Libraries CSV file. This field is case-sensitive. + +Example: + +|id |name |read|pattern|sequence |feature_type | +|-----|-----|----|-------|---------------|--------------------| +|A0001|Ms.CD4|R2 |5P(BC) |AACAAGACCCTTGAG|Antibody Capture | +|A0002|Ms.CD8a|R2 |5P(BC) |TACCCGTAATAGCGT|Antibody Capture | +|A0003|Ms.CD366|R2 |5P(BC) |ATTGGCACTCAGATG|Antibody Capture | +|A0004|Ms.CD279|R2 |5P(BC) |GAAAGTCAAAGCACT|Antibody Capture | +|A0013|Ms.Ly.6C|R2 |5P(BC) |AAGTCGTGAGGCATG|Antibody Capture | +|A0014|HuMs.CD11b|R2 |5P(BC) |TGAAGGCTCATTTGT|Antibody Capture | +|A0015|Ms.Ly.6G|R2 |5P(BC) |ACATTGACGCAACTA|Antibody Capture | +|A0070|HuMs.CD49f|R2 |5P(BC) |TTCCGAGGATGATCT|Antibody Capture | +|A0073|HuMs.CD44|R2 |5P(BC) |TGGCTTCAGGTCCTA|Antibody Capture | + +## CMO-Set (HTOs) +If you want to use cellranger multi to demultiplex the HTOs, write a cmo file like this. Otherwise they can be specified as ADT and demultiplexed with the Seurats HTO demux. 
+CMO_reference file: + +|id |name |read|pattern|sequence |feature_type | +|-----|-----|----|-------|---------------|--------------------| +|HTO1 |HTO1 |R2 |5P(BC) |ACCCACCAGTAAGAC|Multiplexing Capture| +|HTO2 |HTO2 |R2 |5P(BC) |GGTCGAGAGCATTCA|Multiplexing Capture| +|HTO3 |HTO3 |R2 |5P(BC) |CTTGCCGCATGTCAT|Multiplexing Capture| +|HTO6 |HTO6 |R2 |5P(BC) |TATGCTGCCACGGTA|Multiplexing Capture| +|HTO11|HTO11|R2 |5P(BC) |GCTTACCGAATTAAC|Multiplexing Capture| +|HTO12|HTO12|R2 |5P(BC) |CTGCAAATATAACGG|Multiplexing Capture| + +## Config +The csv file which cellranger multi takes as input +```yaml +[gene-expression] +reference,/path/to/refdata-gex-mm10-2020-A +cmo-set,/path/to/CMO_reference.csv + +[feature] +reference,/path/to/HB_feature_ref.csv + +[libraries] +fastq_id,fastqs,feature_types +230316_HB_1,/path/to/0_fastq,Gene Expression +230316_HB_1_ADT,/path/to/0_fastq,Antibody Capture +230316_HB_1_HTO,/path/to/0_fastq,Multiplexing Capture + +[samples] +sample_id,cmo_ids +sample1,HTO1 +sample2,HTO2 +sample3,HTO3 +sample4,HTO6 +sample5,HTO11 +sample6,HTO12 +``` +How to create a CMO-set is described more in [detail here.](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi) +## Cellranger multi for HTO + anything else +`singularity run --bind /root/ /path/to/cellranger.simg cellranger multi --id=230316_BM --csv=/path/to/BM_config.csv` +## ADT + GEX only +libraries.csv: +``` +fastqs,sample,library_type +/path/to/fastq/GEX_Sample,GEX_Sample,Gene Expression +/path/to/fastq/ADT_Sample,GEX_Sample,Antibody Capture +``` +command for running: +``` +singularity run --bind /projects/ /path/to/cellranger.simg cellranger count \ + --id=Ram_014 \ + --libraries=/path/to/libraries.csv \ + --transcriptome=/path/to/refdata-gex-GRCh38-2020-A \ + --feature-ref=/path/to/TotalSeq_A_Human_Universal_Cocktail_V1_399907_Antibody_reference_UMI_counting_CellRanger.csv \ + --localmem=140 \ + --jobmode=local \ + --localcores=24 + ``` + diff --git 
a/docs/10X-genomics/flex.md b/docs/10X-genomics/flex.md new file mode 100644 index 0000000..6940b9d --- /dev/null +++ b/docs/10X-genomics/flex.md @@ -0,0 +1,50 @@ +# Basic run instructions: +The fixation uses cellranger multi. Example: +``` +singularity run --bind /projects/ /path/to/cellranger.simg cellranger multi \ +--id=sample_id \ +--csv=/path/to/config.csv +``` + +# Config files: +For single plexed fixation samples: +config.csv +``` +[gene-expression] +reference,/path/to/refdata-gex-GRCh38-2020-A +probe-set,/path/to/Chromium_Human_Transcriptome_Probe_Set_v1.0_GRCh38-2020-A.csv + +[libraries] +fastq_id,fastqs,feature_types +sample_01_fix,/path/to/fastq,Gene Expression +``` +For multiplexed samples: +config.csv +``` +[gene-expression] +reference,/path/to/refdata-gex-GRCh38-2020-A +probe-set,/path/to/Chromium_Human_Transcriptome_Probe_Set_v1.0_GRCh38-2020-A.csv + +[libraries] +fastq_id,fastqs,feature_types +221117_Milladur,/path/to/fastq/sample_id/,Gene Expression + +[samples] +sample_id,probe_barcode_ids +sample1,BC001 +sample2,BC002 +sample3,BC003 +sample4,BC004 +sample5,BC005 +sample6,BC006 +sample7,BC007 +sample8,BC008 +sample9,BC009 +sample10,BC010 +sample11,BC011 +sample12,BC012 +sample13,BC013 +sample14,BC014 +sample15,BC015 + +``` \ No newline at end of file diff --git a/docs/10X-genomics/vdj.md b/docs/10X-genomics/vdj.md new file mode 100644 index 0000000..6f022e8 --- /dev/null +++ b/docs/10X-genomics/vdj.md @@ -0,0 +1,32 @@ +# VDJ only +``` +cellranger vdj --id=sample345 \ + --reference=/opt/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0 \ + --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \ +``` + + +# VDJ + GEX +You will need to construct a config.csv file like +``` +[gene-expression] +reference,/path/to/transcriptome + +[vdj] +reference,/path/to/vdj_reference + +[libraries] +fastq_id,fastqs,feature_types +GEX_fastqs_id,/path/to/GEX_fastqs,Gene Expression +VDJ_B_fastqs_id,/path/to/vdj_B_fastqs,VDJ-B +VDJ_T_fastqs_id,/path/to/vdj_T_fastqs,VDJ-T 
+``` +Where VDJ-T/B is set manually, as opposed to cellranger vdj, which auto-detects the chain type. +Then run with: +``` +cellranger multi --id= --csv=/path/to/config.csv +``` +# Reference names +I haven't changed the names of the standard 10X references. These are the names: +* Human: refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0 +* Mouse: refdata-cellranger-vdj-GRCm38-alts-ensembl-7.0.0 \ No newline at end of file diff --git a/docs/10X-genomics/visium.md b/docs/10X-genomics/visium.md new file mode 100644 index 0000000..b3c5b97 --- /dev/null +++ b/docs/10X-genomics/visium.md @@ -0,0 +1,48 @@ +# Introduction +This is where I will document how to run spaceranger. It will start very manually and hopefully in the end mature into an automated pipeline. I will make no promises, but I plan to continuously revise this guide as the workflow progresses. + +# Spaceranger Count +**Required input files** +* Fastq-files +* Cytassist image +* Microscope image +* Slide area +* Slide .gpr file (This will need to be downloaded) + + +**Count sbatch script:** +```bash +#!/bin/bash -ue +#SBATCH -c 24 +#SBATCH -t 12:00:00 +#SBATCH --mem 124G +#SBATCH -J sample +#SBATCH -o /path/to/out/sample.out +#SBATCH -e /path/to/err/sample.err + +singularity run --bind /projects/ \ + /path/to/spaceranger.simg \ + spaceranger count \ + --id="Visium_FFPE_Human" \ + --transcriptome=/path/to/refdata-gex-GRCh38-2020-A \ + --probe-set=/path/to/Visium_Human_Transcriptome_Probe_Set_v2.0_GRCh38-2020-A.csv \ + --fastqs=/path/to/sample/fastq \ + --sample=sample \ + --image=/path/to/CytAssist_HighRes_Sample-A-index-D1.tif \ + --slide=V42L13-392 \ + --slidefile=/path/to/V42L13-392.gpr \ + --area=A1 \ + --cytaimage=/path/to/CytAssist_LowRes_Sample-A-Index-D1.tif +``` + +# Glossary +**Visium CytAssist Instrument**: An instrument that mediates the tissue permeabilization to release the ligation products from tissues on standard glass slide for capture by spatially barcoded oligonucleotides within each Capture
Area on the Visium slide. It also captures the image of the tissue section on the Visium slide. + +**CytAssist Captured Image (or CytAssist Image)**: A low resolution brightfield image in TIFF format that is captured by the CytAssist of the eosin stained tissue section on the CytAssist slide inside the instrument. This image contains the fiducial frame. + +**Microscope Image**: A high resolution brightfield or fluorescence image of the tissue section on the standard glass slide captured by a microscope. This image does not contain the fiducial frame. + +**CytAssist Spatial Gene Expression Slide, 6.5 mm**: Visium Spatial Gene Expression slide for use with CytAssist instrument with two capture areas each with dimensions of 6.5 mm by 6.5 mm. The spots within the capture area on these slides contain specialized oligos for capturing poly-adenylated mRNA tags. These slides have serial numbers starting with "V4". + +**CytAssist Spatial Gene Expression Slide, 11 mm**: Visium Spatial Gene Expression slide for use with CytAssist instrument with two capture areas each with dimensions 11 mm by 11 mm. The number of spots within the capture area on these slides are ~3x higher compared to the 6.5 mm capture areas and contain specialized oligos for capturing poly-adenylated mRNA tags. These slides have serial numbers starting with "V5". 
+ From c4793c437cc898d76004cc7db96dfe809df80f0a Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 16 Aug 2023 18:14:48 +0200 Subject: [PATCH 03/25] shortened stub data --- modules/cellranger/multi/main.nf | 57 +------------------------------- 1 file changed, 1 insertion(+), 56 deletions(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index 5549308..ff6050e 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -37,70 +37,15 @@ Cells,VDJ B,,,Cells with productive IGK contig,70.10% Cells,VDJ B,,,Cells with productive IGL contig,6.19% Cells,VDJ B,,,"Cells with productive V-J spanning (IGK, IGH) pair",49.31% Cells,VDJ B,,,"Cells with productive V-J spanning (IGL, IGH) pair",5.50% -Cells,VDJ B,,,Cells with productive V-J spanning pair,54.47% -Cells,VDJ B,,,Estimated number of cells,582 -Cells,VDJ B,,,Median IGH UMIs per Cell,9 -Cells,VDJ B,,,Median IGK UMIs per Cell,10 -Cells,VDJ B,,,Median IGL UMIs per Cell,0 -Cells,VDJ B,,,Number of cells with productive V-J spanning pair,317 Cells,VDJ B,,,Paired clonotype diversity,72.28 Cells,VDJ T,,,Cells with productive TRA contig,79.03% Cells,VDJ T,,,Cells with productive TRB contig,97.58% Cells,VDJ T,,,"Cells with productive V-J spanning (TRA, TRB) pair",76.61% Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% -Cells,VDJ T,,,Estimated number of cells,124 -Cells,VDJ T,,,Median TRA UMIs per Cell,3 -Cells,VDJ T,,,Median TRB UMIs per Cell,7 -Cells,VDJ T,,,Number of cells with productive V-J spanning pair,95 -Cells,VDJ T,,,Paired clonotype diversity,35.11 Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 barcodes,95.8% -Library,Gene Expression,Physical library ID,GEX_1,Confidently 
mapped antisense,2.56% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped reads in cells,84.20% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped to exonic regions,21.93% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped to genome,26.08% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped to intergenic regions,2.50% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped to intronic regions,1.65% -Library,Gene Expression,Physical library ID,GEX_1,Confidently mapped to transcriptome,20.79% -Library,Gene Expression,Physical library ID,GEX_1,Estimated number of cells,"5,727" -Library,Gene Expression,Physical library ID,GEX_1,Mapped to genome,36.49% -Library,Gene Expression,Physical library ID,GEX_1,Mean reads per cell,"30,115" -Library,Gene Expression,Physical library ID,GEX_1,Number of reads,"172,468,563" -Library,Gene Expression,Physical library ID,GEX_1,Number of reads in the library,"172,468,563" -Library,Gene Expression,Physical library ID,GEX_1,Sequencing saturation,50.84% -Library,Gene Expression,Physical library ID,GEX_1,Valid UMIs,99.85% -Library,Gene Expression,Physical library ID,GEX_1,Valid barcodes,92.38% -Library,VDJ B,Fastq ID,1a_522_3wbm_BCR,Number of reads,"30,607,267" -Library,VDJ B,Fastq ID,1a_522_3wbm_BCR,Number of short reads skipped,0 -Library,VDJ B,Fastq ID,1a_522_3wbm_BCR,Q30 RNA read,92.7% -Library,VDJ B,Fastq ID,1a_522_3wbm_BCR,Q30 UMI,94.5% -Library,VDJ B,Fastq ID,1a_522_3wbm_BCR,Q30 barcodes,95.6% -Library,VDJ B,Physical library ID,VDJB_1,Estimated number of cells,582 -Library,VDJ B,Physical library ID,VDJB_1,Fraction reads in cells,83.31% -Library,VDJ B,Physical library ID,VDJB_1,Mean reads per cell,"52,590" -Library,VDJ B,Physical library ID,VDJB_1,Mean used reads per cell,"9,638" -Library,VDJ B,Physical library ID,VDJB_1,Number of reads,"30,607,267" -Library,VDJ B,Physical library ID,VDJB_1,Reads mapped to IGH,22.57% -Library,VDJ B,Physical library 
ID,VDJB_1,Reads mapped to IGK,50.50% -Library,VDJ B,Physical library ID,VDJB_1,Reads mapped to IGL,14.06% -Library,VDJ B,Physical library ID,VDJB_1,Reads mapped to any V(D)J gene,87.14% -Library,VDJ B,Physical library ID,VDJB_1,Valid barcodes,95.44% -Library,VDJ T,Fastq ID,1a_522_3wbm_TCR,Number of reads,"39,674,884" -Library,VDJ T,Fastq ID,1a_522_3wbm_TCR,Number of short reads skipped,0 -Library,VDJ T,Fastq ID,1a_522_3wbm_TCR,Q30 RNA read,91.5% -Library,VDJ T,Fastq ID,1a_522_3wbm_TCR,Q30 UMI,94.7% -Library,VDJ T,Fastq ID,1a_522_3wbm_TCR,Q30 barcodes,95.4% -Library,VDJ T,Physical library ID,VDJT_1,Estimated number of cells,124 -Library,VDJ T,Physical library ID,VDJT_1,Fraction reads in cells,15.60% -Library,VDJ T,Physical library ID,VDJT_1,Mean reads per cell,"319,959" -Library,VDJ T,Physical library ID,VDJT_1,Mean used reads per cell,"35,979" -Library,VDJ T,Physical library ID,VDJT_1,Number of reads,"39,674,884" -Library,VDJ T,Physical library ID,VDJT_1,Reads mapped to TRA,7.91% -Library,VDJ T,Physical library ID,VDJT_1,Reads mapped to TRB,24.14% -Library,VDJ T,Physical library ID,VDJT_1,Reads mapped to any V(D)J gene,32.31% -Library,VDJ T,Physical library ID,VDJT_1,Valid barcodes,79.69%\'\'\' > \$sample_dir/metrics_summary.csv +\'\'\' > \$sample_dir/metrics_summary.csv """ } \ No newline at end of file From 4a05f1e4f7c41b074e0663f8f42ac1b641cec5cb Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 16 Aug 2023 18:17:17 +0200 Subject: [PATCH 04/25] two plex stub data --- modules/cellranger/multi/main.nf | 38 +++++++++++++++++++++++++++----- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index ff6050e..67de7f5 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -21,10 +21,14 @@ process MULTI { """ stub: """ - sample_dir=$sample_id/outs/per_sample_outs/$sample_id - mkdir -p \$sample_dir - touch \$sample_dir/web_summary.html - touch 
\$sample_dir/cloupe.cloupe + sample_dir_1=$sample_id/outs/per_sample_outs/${sample_id}_1 + sample_dir_2=$sample_id/outs/per_sample_outs/${sample_id}_2 + mkdir -p \$sample_dir_1 + mkdir -p \$sample_dir_2 + touch \$sample_dir_1/web_summary.html + touch \$sample_dir_2/web_summary.html + touch \$sample_dir_1/cloupe.cloupe + touch \$sample_dir_2/cloupe.cloupe echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value Cells,Gene Expression,,,Cells,"5,727" Cells,Gene Expression,,,Confidently mapped reads in cells,84.20% @@ -46,6 +50,30 @@ Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -\'\'\' > \$sample_dir/metrics_summary.csv +\'\'\' > \$sample_dir_1/metrics_summary.csv + +echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value +Cells,Gene Expression,,,Cells,"5,727" +Cells,Gene Expression,,,Confidently mapped reads in cells,84.20% +Cells,Gene Expression,,,Median UMI counts per cell,"1,585" +Cells,Gene Expression,,,Median genes per cell,744 +Cells,Gene Expression,,,Median reads per cell,"16,412" +Cells,Gene Expression,,,Total genes detected,"14,371" +Cells,VDJ B,,,Cells with productive IGH contig,78.52% +Cells,VDJ B,,,Cells with productive IGK contig,70.10% +Cells,VDJ B,,,Cells with productive IGL contig,6.19% +Cells,VDJ B,,,"Cells with productive V-J spanning (IGK, IGH) pair",49.31% +Cells,VDJ B,,,"Cells with productive V-J spanning (IGL, IGH) pair",5.50% +Cells,VDJ B,,,Paired clonotype diversity,72.28 +Cells,VDJ T,,,Cells with productive TRA contig,79.03% +Cells,VDJ T,,,Cells with productive TRB contig,97.58% +Cells,VDJ T,,,"Cells with productive V-J spanning (TRA, TRB) pair",76.61% +Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% +Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of 
reads,"172,468,563" +Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% +\'\'\' > \$sample_dir_2/metrics_summary.csv + """ } \ No newline at end of file From 7c73b816dcdc50d6ab110e487ceacc2ffc4df3f7 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 23 Aug 2023 10:02:27 +0200 Subject: [PATCH 05/25] updated vdj reference path --- nextflow.config | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/nextflow.config b/nextflow.config index aaded70..f3c46ad 100755 --- a/nextflow.config +++ b/nextflow.config @@ -18,7 +18,7 @@ params { mixed_genome="$refdir/cellranger/hg38_mm10/refdata-gex-GRCh38-and-mm10-2020-A" // VDJ references - human_vdj="$refdir/cellranger/vdj/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.0.0" + human_vdj="$refdir/cellranger/vdj/refdata-cellranger-vdj-GRCh38-alts-ensembl-7.1.0" mouse_vdj="$refdir/cellranger/vdj/refdata-cellranger-vdj-GRCm38-alts-ensembl-7.0.0" // Probe sets From 5574534cb6ab29c8742e2c14a3a6db5e28fe5187 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 23 Aug 2023 10:02:48 +0200 Subject: [PATCH 06/25] update manifest --- templates/manifest.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/templates/manifest.html b/templates/manifest.html index c229c98..b2fff9c 100644 --- a/templates/manifest.html +++ b/templates/manifest.html @@ -83,7 +83,7 @@

Nextflow manifest

-Pipeline release: 1.4.0
+Pipeline release: 1.4.1

Subworkflow: xxSubWorkflowxx

Maintainer: jacob.karlstrom@med.lu.se

Description: Your data has been processed by the nextflow pipeline singleCellWorkflows. If you want more information on how your data was processed, follow the link below and navigate to your release!

From ae0015fa1c27a65dacc903d41c16e78425d30b6b Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 12:22:42 +0200 Subject: [PATCH 07/25] removed debugging print --- subworkflows/finish.nf | 1 - 1 file changed, 1 deletion(-) diff --git a/subworkflows/finish.nf b/subworkflows/finish.nf index 83b21c1..5cf1a4d 100644 --- a/subworkflows/finish.nf +++ b/subworkflows/finish.nf @@ -24,5 +24,4 @@ workflow FINISH_PROJECTS { md5sum_ch = MD5SUM(publish_ch) deliver_auto_ch = DELIVER_PROJ(md5sum_ch.project_id) } - print params.ctg_mode } \ No newline at end of file From 69c9e8f4bdd460b788dd92b63d4c0817e24fccfb Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 12:39:44 +0200 Subject: [PATCH 08/25] fixed output formatting error --- modules/cellranger/multi/main.nf | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index 67de7f5..e3a4ff5 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -49,8 +49,7 @@ Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -\'\'\' > \$sample_dir_1/metrics_summary.csv +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_1/metrics_summary.csv echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value Cells,Gene Expression,,,Cells,"5,727" @@ -72,8 +71,7 @@ Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 
UMI,94.8% -\'\'\' > \$sample_dir_2/metrics_summary.csv +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_2/metrics_summary.csv """ } \ No newline at end of file From f1ab609950a743f607f5145974c7662b52334276 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 13:17:28 +0200 Subject: [PATCH 09/25] fixed new config header --- modules/split_sheet/main.nf | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/modules/split_sheet/main.nf b/modules/split_sheet/main.nf index c185d91..1424139 100644 --- a/modules/split_sheet/main.nf +++ b/modules/split_sheet/main.nf @@ -5,7 +5,7 @@ process SPLITSHEET { output: path "10X_Data.csv", emit: data path "Data.csv", emit: pipe_data, optional: true - path "10X_Flex_Data.csv", emit: flex, optional: true + path "FlexConfig_Data.csv", emit: flex, optional: true path "FeatureReference_Data.csv", emit: feature_reference, optional: true shell: ''' From b155e45377d80514aa1a4678d7eaa108a1194f49 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 14:05:59 +0200 Subject: [PATCH 10/25] rewrote script to python for readability --- bin/countmetric2mqc.py | 18 ++++++++++++++++++ modules/cellranger2multiqc/count/main.nf | 5 +++-- 2 files changed, 21 insertions(+), 2 deletions(-) create mode 100644 bin/countmetric2mqc.py diff --git a/bin/countmetric2mqc.py b/bin/countmetric2mqc.py new file mode 100644 index 0000000..3757a5b --- /dev/null +++ b/bin/countmetric2mqc.py @@ -0,0 +1,18 @@ +import sys + +# Check if input file name was provided +if len(sys.argv) < 3: + print('Usage: python countmetric2mqc.py file.csv sample_name output_mqc.yaml') + sys.exit(1) + +# Get input file name from command-line argument +input_file = sys.argv[1] +sample_name = sys.argv[2] +mqc_yaml = sys.argv[3] +# Initialize dictionaries for each category +with open(input_file, 'r') as file: + keys, values = file.readline().split(','), file.readline().split(',') + data = {k.strip():v.strip() for (k,v) in 
zip(keys,values)} +# Appends to an already existing mqc.yaml file +with open(mqc_yaml, 'a') as file: + file.write(f' {sample_name}: {data}') \ No newline at end of file diff --git a/modules/cellranger2multiqc/count/main.nf b/modules/cellranger2multiqc/count/main.nf index 9a3a5ff..f731df2 100644 --- a/modules/cellranger2multiqc/count/main.nf +++ b/modules/cellranger2multiqc/count/main.nf @@ -22,10 +22,11 @@ process CELLRANGER_COUNT_TO_MULTIQC{ IFS=', ' read -ra project_array <<< \"\$project_string\" # Checks if mqc_yaml exists, if not create it for i in {0..$number_of_samples}; do + echo \$i if ! [ -f \"$params.outdir/\${project_array[\$i]}/1_qc/multiqc/multiqc_mqc.yaml\" ]; then mkdir -p \"$params.outdir/\${project_array[\$i]}/1_qc/multiqc/\" echo \"\"\"id: \"single_cell_workflows_table\" -section_name : \"Single Cell Workflows Stats\" +section_name : \"Single Cell Workflows Count Stats\" description: \"This table consists of the data gathered from cellranger output \" plot_type: \"table\" pconfig: @@ -34,7 +35,7 @@ pconfig: data:\"\"\" > \"$params.outdir/\${project_array[\$i]}/1_qc/multiqc/multiqc_mqc.yaml\" fi # Extends mqc_yaml file with sample information - echo \" \${sample_array[\$i]}: \$(cat $params.outdir/\${project_array[\$i]}/2_count/\${sample_array[\$i]}/outs/$summary | python -c 'import csv, json, sys; print(json.dumps([dict(r) for r in csv.DictReader(sys.stdin)]))')\" | tr -d '[]' >> \"$params.outdir/\${project_array[\$i]}/1_qc/multiqc/multiqc_mqc.yaml\" + python $projectDir/bin/countmetric2mqc.py $params.outdir/\${project_array[\$i]}/2_count/\${sample_array[\$i]}/outs/$summary \${sample_array[\$i]} $params.outdir/\${project_array[\$i]}/1_qc/multiqc/multiqc_mqc.yaml done """ } \ No newline at end of file From 2946b0c8e409649468104307dc9dd8893a13a0ed Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 14:07:13 +0200 Subject: [PATCH 11/25] removed unecessary part --- examples/CTG_SampleSheet.csv | 10 ++-------- 1 file changed, 2 
insertions(+), 8 deletions(-) diff --git a/examples/CTG_SampleSheet.csv b/examples/CTG_SampleSheet.csv index b1e9edd..2a45cdf 100644 --- a/examples/CTG_SampleSheet.csv +++ b/examples/CTG_SampleSheet.csv @@ -1,13 +1,5 @@ [Header],,,,,,,, IEMFileVersion,4,,,,,,, -[Data],,,,,,,, -Sample_ID,index,index2,Sample_Project,,,,, -sample-1,SI-TT-A1,SI-TT-A1,project1,,,,, -sample-2,SI-TT-A2,SI-TT-A2,project1,,,,, -sample-3,SI-TT-A3,SI-TT-A3,project2,,,,, -sample-4,SI-TT-A4,SI-TT-A4,project3,,,,, -sample-5,SI-TT-A5,SI-TT-A5,project3,,,,, -sample-6,SI-TT-A6,SI-TT-A6,project3,,,,, [10X_Data],,,,,,,, Sample_ID,Sample_Species,Sample_Project,force,agg,sample_pair,libtype,pipeline,cytaimage,darkimage,image,slide,slide_area sample-1,human,project1,n,n,n,gex,scrna-10x,n,n,n,n,n @@ -31,6 +23,8 @@ sample-18,human,project10,n,n,n,gex,scvisium-10x,cytaimage,darkimage,image,slide sample-19,human,project7,n,n,5,gex,scmulti-10x,n,n,n,n,n sample-20,human,project7,n,n,5,tcr,scmulti-10x,n,n,n,n,n sample-21,human,project7,n,n,5,bcr,scmulti-10x,n,n,n,n,n +sample-22,human,project9,n,n,6,atac,scarc-10x,n,n,n,n,n +sample-23,human,project9,n,n,6,gex,scarc-10x,n,n,n,n,n [FlexConfig_Data],,,,,,,, sample_id,probe_barcode,Sample_Source,,,,,, sample1,BC001|BC002,sample_7,,,,,, From 548f695c7f8133130f2422d2eb1cf1c4183293f2 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 14:07:13 +0200 Subject: [PATCH 12/25] removed unecessary part --- bin/countmetric2mqc.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/bin/countmetric2mqc.py b/bin/countmetric2mqc.py index 3757a5b..0a664c3 100644 --- a/bin/countmetric2mqc.py +++ b/bin/countmetric2mqc.py @@ -15,4 +15,4 @@ data = {k.strip():v.strip() for (k,v) in zip(keys,values)} # Appends to an already existing mqc.yaml file with open(mqc_yaml, 'a') as file: - file.write(f' {sample_name}: {data}') \ No newline at end of file + file.write(f' {sample_name}: {data}\n') \ No newline at end of file From c712183a9482f6da863345e52ec8cb3f38e87c19 
Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 15:30:28 +0200 Subject: [PATCH 13/25] update manifest --- templates/manifest.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/templates/manifest.html b/templates/manifest.html index b2fff9c..18465c0 100644 --- a/templates/manifest.html +++ b/templates/manifest.html @@ -83,14 +83,14 @@

Nextflow manifest

-Pipeline release: 1.4.1
+Pipeline release: 1.4.2

Subworkflow: xxSubWorkflowxx

Maintainer: jacob.karlstrom@med.lu.se

Description: Your data has been processed by the nextflow pipeline singleCellWorkflows. If you want more information on how your data was processed, follow the link below and navigate to your release!

- +
From 1effa7d2defddb1b3c62dde88dda6decd9804020 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Tue, 19 Sep 2023 16:36:20 +0200 Subject: [PATCH 14/25] blacked --- bin/countmetric2mqc.py | 12 ++++++------ bin/multimetric2mqc.py | 16 ++++++++-------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/bin/countmetric2mqc.py b/bin/countmetric2mqc.py index 0a664c3..579ad68 100644 --- a/bin/countmetric2mqc.py +++ b/bin/countmetric2mqc.py @@ -2,7 +2,7 @@ # Check if input file name was provided if len(sys.argv) < 3: - print('Usage: python countmetric2mqc.py file.csv sample_name output_mqc.yaml') + print("Usage: python countmetric2mqc.py file.csv sample_name output_mqc.yaml") sys.exit(1) # Get input file name from command-line argument @@ -10,9 +10,9 @@ sample_name = sys.argv[2] mqc_yaml = sys.argv[3] # Initialize dictionaries for each category -with open(input_file, 'r') as file: - keys, values = file.readline().split(','), file.readline().split(',') - data = {k.strip():v.strip() for (k,v) in zip(keys,values)} +with open(input_file, "r") as file: + keys, values = file.readline().split(","), file.readline().split(",") + data = {k.strip(): v.strip() for (k, v) in zip(keys, values)} # Appends to an already existing mqc.yaml file -with open(mqc_yaml, 'a') as file: - file.write(f' {sample_name}: {data}\n') \ No newline at end of file +with open(mqc_yaml, "a") as file: + file.write(f" {sample_name}: {data}\n") diff --git a/bin/multimetric2mqc.py b/bin/multimetric2mqc.py index 9c0cd31..a26b678 100644 --- a/bin/multimetric2mqc.py +++ b/bin/multimetric2mqc.py @@ -4,7 +4,7 @@ # Check if input file name was provided if len(sys.argv) < 3: - print('Usage: python convert.py file.csv sample_name') + print("Usage: python convert.py file.csv sample_name") sys.exit(1) # Get input file name from command-line argument @@ -16,7 +16,7 @@ other = {} # Open CSV file -with open(input_file, 'r') as f: +with open(input_file, "r") as f: # Create CSV reader reader = csv.reader(f) @@ -29,15 
+29,15 @@ category, library_type, grouped_by, group_name, metric_name, metric_value = row # Add metric_name and metric_value to appropriate dictionary based on category - if category == 'Cells': - cells[library_type+'_'+metric_name] = metric_value - elif category == 'Library': - library[library_type+'_'+metric_name] = metric_value + if category == "Cells": + cells[library_type + "_" + metric_name] = metric_value + elif category == "Library": + library[library_type + "_" + metric_name] = metric_value else: other[metric_name] = metric_value # Write dictionaries to JSON files -with open('{}_cells.json'.format(sample_name), 'w') as f: +with open("{}_cells.json".format(sample_name), "w") as f: json.dump(cells, f) -with open('{}_library.json'.format(sample_name), 'w') as f: +with open("{}_library.json".format(sample_name), "w") as f: json.dump(library, f) From 359f8fcf906051a4bbd3132f3f19a99047ff0610 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Tue, 19 Sep 2023 16:51:37 +0200 Subject: [PATCH 15/25] added overarching workflow steps --- docs/sub-workflows/scarc-10x.md | 7 ++++++- docs/sub-workflows/scatac-10x.md | 6 +++++- docs/sub-workflows/scciteseq-10x.md | 13 ++++--------- docs/sub-workflows/scflex-10x.md | 17 ++++------------- docs/sub-workflows/scmulti-10x.md | 6 +++++- docs/sub-workflows/scrna-10x.md | 12 +++--------- docs/sub-workflows/scvisium-10x.md | 11 ++--------- 7 files changed, 29 insertions(+), 43 deletions(-) diff --git a/docs/sub-workflows/scarc-10x.md b/docs/sub-workflows/scarc-10x.md index ff9cb60..edf23c8 100644 --- a/docs/sub-workflows/scarc-10x.md +++ b/docs/sub-workflows/scarc-10x.md @@ -15,4 +15,9 @@ Other than standard parameters, following parameters needs to be defined: * human_atac : Path to the human arc reference genome * mouse_atac : Path to the arc mouse reference genome -* COUNT_ARC : The path to the container which cellranger-arc is called from \ No newline at end of file +* COUNT_ARC : The path to the container which cellranger-arc is 
called from + +# Workflow specific processing steps +* Generate library.csv files as described here https://support.10xgenomics.com/single-cell-multiome-atac-gex/software/pipelines/latest/using/count#library-csv +* `cellranger-arc count` on library files as described here +* Generate multiqc parseable files based on the metrics files generated by cellranger-arc \ No newline at end of file diff --git a/docs/sub-workflows/scatac-10x.md b/docs/sub-workflows/scatac-10x.md index eb7399d..bda40f7 100644 --- a/docs/sub-workflows/scatac-10x.md +++ b/docs/sub-workflows/scatac-10x.md @@ -16,4 +16,8 @@ Other than standard parameters, following parameters needs to be defined: * human_atac : Path to the human arc reference genome * mouse_atac : Path to the arc mouse reference genome -* COUNT_ATAC : The path to the container which cellranger-arc is called from \ No newline at end of file +* COUNT_ATAC : The path to the container which cellranger-arc is called from + +# Workflow specific processing steps +* `cellranger-atac count` on fastq files as described here https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/what-is-cell-ranger-atac +* Generate multiqc parseable files based on the metrics files generated by cellranger-atac \ No newline at end of file diff --git a/docs/sub-workflows/scciteseq-10x.md b/docs/sub-workflows/scciteseq-10x.md index 6151423..66b77fd 100644 --- a/docs/sub-workflows/scciteseq-10x.md +++ b/docs/sub-workflows/scciteseq-10x.md @@ -28,12 +28,7 @@ Explanation of each column: ## 10X_FeatureReference Described [here](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis) by 10X official documentation. -# How to Run - -To execute the workflow, use the following command: - -``` -nextflow run main.nf --samplesheet -``` - -Replace `` with the actual path to your SampleSheet.csv file. 
The `--analysis` option should be set to `scciteseq-10x` to indicate the pipeline to use for analysis. \ No newline at end of file +# Workflow specific processing steps +* Generate library.csv files as described here https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis#libraries-csv +* `cellranger count` on library files as described here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/feature-bc-analysis#overview +* Generate multiqc parseable files based on the metrics files generated by cellranger \ No newline at end of file diff --git a/docs/sub-workflows/scflex-10x.md b/docs/sub-workflows/scflex-10x.md index fb77279..b8cb26d 100644 --- a/docs/sub-workflows/scflex-10x.md +++ b/docs/sub-workflows/scflex-10x.md @@ -28,18 +28,9 @@ Explanation of each column: Described [here](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi-frp#samples) by 10X official documentation. Special case is `Sample_Source` which need to match a sample from the 10X_Data section. If no match is found, pipeline assumes it's a singleplex sample. -# How to Run - -To execute the workflow, use the following command: - -``` -nextflow run main.nf --samplesheet -``` -Or if using custom probes: -``` -nextflow run main.nf --samplesheet --custom_genome --custom_probes -``` -Replace `` with the actual path to your SampleSheet.csv file. The `--analysis` option should be set to `scflex-10x` to indicate the pipeline to use for analysis. - ## Note if using custom probes You will need to construct both a reference genome and a reference probe set. Until I have set up a guide for that, contact 10X for more information. 
+ +# Workflow specific processing steps +* Generation of config.csv files as described here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#examples +* Running of cellranger as described here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#cellranger-multi \ No newline at end of file diff --git a/docs/sub-workflows/scmulti-10x.md b/docs/sub-workflows/scmulti-10x.md index 180eeb7..8269f1c 100644 --- a/docs/sub-workflows/scmulti-10x.md +++ b/docs/sub-workflows/scmulti-10x.md @@ -18,4 +18,8 @@ Other than standard parameters, following parameters needs to be defined: * mouse : Path to the mouse reference transcriptome * human_vdj : Path to the human vdj reference * mouse_vdj : Path to the mouse vdj reference -* COUNT_ARC : The path to the container which cellranger-arc is called from \ No newline at end of file +* COUNT_ARC : The path to the container which cellranger-arc is called from + +# Workflow specific processing steps +* Generation of config.csv files as described here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#examples +* Running of cellranger as described here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/multi#cellranger-multi \ No newline at end of file diff --git a/docs/sub-workflows/scrna-10x.md b/docs/sub-workflows/scrna-10x.md index e2905de..5f6ea05 100644 --- a/docs/sub-workflows/scrna-10x.md +++ b/docs/sub-workflows/scrna-10x.md @@ -20,12 +20,6 @@ Explanation of each column: * **agg**: If you want to aggregate all the processed samples for visualization, set this column accordingly. * **pipeline**: This column specifies the pipeline that should be used for the sample. In this case, the value should be set to `scrna-10x`.
-# How to Run - -To execute the workflow, use the following command: - -``` -nextflow run main.nf --samplesheet -``` - -Replace `` with the actual path to your SampleSheet.csv file. The `--analysis` option should be set to `scrna-10x` to indicate the pipeline to use for analysis. \ No newline at end of file +# Workflow specific processing steps +* Generation of config.csv as described here: https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/using/multi#examples +* Running of cellranger multi on config files as described here https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/using/multi#running-multi \ No newline at end of file diff --git a/docs/sub-workflows/scvisium-10x.md b/docs/sub-workflows/scvisium-10x.md index e5d75d1..3732eda 100644 --- a/docs/sub-workflows/scvisium-10x.md +++ b/docs/sub-workflows/scvisium-10x.md @@ -18,12 +18,5 @@ In the metadata folder you will put all the images that will be used by spaceran You will also need to download the slidefiles which can be found [here](https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/using/slidefile-download) -# How to Run - -To execute the workflow, use the following command: - -``` -nextflow run main.nf --samplesheet -``` - -Replace `` with the actual path to your SampleSheet.csv file. 
\ No newline at end of file +# Workflow specific processing steps +* `spaceranger count` on fastq files as described here: https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/using/count#count \ No newline at end of file From f348bc6a294f1e309222eaafb34df0e352362368 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 16 Aug 2023 18:14:48 +0200 Subject: [PATCH 16/25] shortened stub data --- modules/cellranger/multi/main.nf | 26 ++------------------------ 1 file changed, 2 insertions(+), 24 deletions(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index e3a4ff5..d7a21b3 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -49,29 +49,7 @@ Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_1/metrics_summary.csv - -echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value -Cells,Gene Expression,,,Cells,"5,727" -Cells,Gene Expression,,,Confidently mapped reads in cells,84.20% -Cells,Gene Expression,,,Median UMI counts per cell,"1,585" -Cells,Gene Expression,,,Median genes per cell,744 -Cells,Gene Expression,,,Median reads per cell,"16,412" -Cells,Gene Expression,,,Total genes detected,"14,371" -Cells,VDJ B,,,Cells with productive IGH contig,78.52% -Cells,VDJ B,,,Cells with productive IGK contig,70.10% -Cells,VDJ B,,,Cells with productive IGL contig,6.19% -Cells,VDJ B,,,"Cells with productive V-J spanning (IGK, IGH) pair",49.31% -Cells,VDJ B,,,"Cells with productive V-J spanning (IGL, IGH) pair",5.50% -Cells,VDJ B,,,Paired clonotype diversity,72.28 -Cells,VDJ T,,,Cells with productive TRA contig,79.03% -Cells,VDJ T,,,Cells with productive TRB 
contig,97.58% -Cells,VDJ T,,,"Cells with productive V-J spanning (TRA, TRB) pair",76.61% -Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" -Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_2/metrics_summary.csv - +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% +\'\'\' > \$sample_dir/metrics_summary.csv """ } \ No newline at end of file From 7a6361fe4976fc56d5c54d40e3f23577f00918e4 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 16 Aug 2023 18:17:17 +0200 Subject: [PATCH 17/25] two plex stub data --- modules/cellranger/multi/main.nf | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index d7a21b3..67de7f5 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -50,6 +50,30 @@ Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -\'\'\' > \$sample_dir/metrics_summary.csv +\'\'\' > \$sample_dir_1/metrics_summary.csv + +echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value +Cells,Gene Expression,,,Cells,"5,727" +Cells,Gene Expression,,,Confidently mapped reads in cells,84.20% +Cells,Gene Expression,,,Median UMI counts per cell,"1,585" +Cells,Gene Expression,,,Median genes per cell,744 +Cells,Gene Expression,,,Median reads per cell,"16,412" +Cells,Gene Expression,,,Total genes detected,"14,371" +Cells,VDJ B,,,Cells with productive IGH contig,78.52% +Cells,VDJ B,,,Cells with productive IGK contig,70.10% +Cells,VDJ 
B,,,Cells with productive IGL contig,6.19% +Cells,VDJ B,,,"Cells with productive V-J spanning (IGK, IGH) pair",49.31% +Cells,VDJ B,,,"Cells with productive V-J spanning (IGL, IGH) pair",5.50% +Cells,VDJ B,,,Paired clonotype diversity,72.28 +Cells,VDJ T,,,Cells with productive TRA contig,79.03% +Cells,VDJ T,,,Cells with productive TRB contig,97.58% +Cells,VDJ T,,,"Cells with productive V-J spanning (TRA, TRB) pair",76.61% +Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% +Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" +Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% +\'\'\' > \$sample_dir_2/metrics_summary.csv + """ } \ No newline at end of file From e83e45c985128df358961e9986cdf27384bdb70f Mon Sep 17 00:00:00 2001 From: Fattigman Date: Mon, 28 Aug 2023 12:39:44 +0200 Subject: [PATCH 18/25] fixed output formatting error --- modules/cellranger/multi/main.nf | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/modules/cellranger/multi/main.nf b/modules/cellranger/multi/main.nf index 67de7f5..e3a4ff5 100644 --- a/modules/cellranger/multi/main.nf +++ b/modules/cellranger/multi/main.nf @@ -49,8 +49,7 @@ Cells,VDJ T,,,Cells with productive V-J spanning pair,76.61% Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -\'\'\' > \$sample_dir_1/metrics_summary.csv +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_1/metrics_summary.csv echo \'\'\'Category,Library Type,Grouped By,Group Name,Metric Name,Metric Value Cells,Gene Expression,,,Cells,"5,727" @@ -72,8 +71,7 @@ Cells,VDJ T,,,Cells with productive V-J spanning 
pair,76.61% Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of reads,"172,468,563" Library,Gene Expression,Fastq ID,1a_522_3wbm,Number of short reads skipped,0 Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 RNA read,91.1% -Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8% -\'\'\' > \$sample_dir_2/metrics_summary.csv +Library,Gene Expression,Fastq ID,1a_522_3wbm,Q30 UMI,94.8%\'\'\' > \$sample_dir_2/metrics_summary.csv """ } \ No newline at end of file From 9639485b0ad4ba95d0355fb219ac9978fd48a30f Mon Sep 17 00:00:00 2001 From: Fattigman Date: Tue, 19 Sep 2023 23:30:30 +0200 Subject: [PATCH 19/25] update manifest --- templates/manifest.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/templates/manifest.html b/templates/manifest.html index 18465c0..06eeb83 100644 --- a/templates/manifest.html +++ b/templates/manifest.html @@ -83,7 +83,7 @@

 Nextflow manifest
-Pipeline release: 1.4.2
+Pipeline release: 1.5.0
 Subworkflow: xxSubWorkflowxx
 Maintainer: jacob.karlstrom@med.lu.se
 Description: Your data has been processed by the nextflow pipeline singleCellWorkflows. If you want more information on how your data was processed, follow the link below and navigate to your release!

From fca3090c00378d14db2d3218f387304f553af57c Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 20 Sep 2023 08:40:00 +0200 Subject: [PATCH 20/25] Now uses new delivery script --- modules/deliver/main.nf | 2 +- nextflow.config | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/modules/deliver/main.nf b/modules/deliver/main.nf index 46f053b..00af3db 100644 --- a/modules/deliver/main.nf +++ b/modules/deliver/main.nf @@ -9,7 +9,7 @@ process DELIVER_PROJ { script: """ - bash /projects/fs1/shared/Development_Github/Yggdrasil/bin/delivery.sh $params.outdir/$project_id $project_id jacob.karlstrom@med.lu.se + bash /projects/fs1/shared/Yggdrasil/bin/delivery.sh -d $params.outdir/$project_id -p $project_id -e $params.deliver_to """ stub: """ diff --git a/nextflow.config b/nextflow.config index f3c46ad..edd7700 100755 --- a/nextflow.config +++ b/nextflow.config @@ -44,7 +44,7 @@ params { intron_mode="true" ctg_mode="true" - nextflow_log = "/projects/fs1/shared/Logs/nextflow.log" + deliver_to = "henning.onsbring@med.lu.se" } From 8704a6b7155f5b2d464e8a77adb7e02f75a895f7 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 20 Sep 2023 09:05:16 +0200 Subject: [PATCH 21/25] removed deprecated function --- modules/pack_websummaries/main.nf | 66 ------------------------------- subworkflows/finish.nf | 4 +- 2 files changed, 1 insertion(+), 69 deletions(-) delete mode 100644 modules/pack_websummaries/main.nf diff --git a/modules/pack_websummaries/main.nf b/modules/pack_websummaries/main.nf deleted file mode 100644 index 768fb1f..0000000 --- a/modules/pack_websummaries/main.nf +++ /dev/null @@ -1,66 +0,0 @@ -process PACK_WEBSUMMARIES{ - tag "$project_id" - - publishDir "$params.outdir/$project_id/3_summaries/cellranger", mode: 'move', pattern : "web_summaries.tar" - - input: - val project_id - output: - path "web_summaries.tar", emit: tarball - val project_id, emit: project_id - - shell: - ''' - #!/bin/bash - - # set the folder path - folder_path="$(readlink -f 
!{params.outdir}/!{project_id})" - - # create an array to store the file paths and sample names - file_paths=() - sample_names=() - - # loop through the folder and find all the web_summary.html files - while IFS= read -r -d $'\0' file; do - if [[ "$file" == *"web_summary.html" ]]; then - # append the file path and the sample name to the array - file_paths+=("$file") - dir=$(dirname "$file") - if ! [ $(dirname "$dir") == "per_sample_output" ]; then - dir=$(dirname "$dir") - fi - result=$(basename "$dir") - sample_names+=("$result") - echo $file $result - fi - done < <(find "$folder_path" -name "web_summary.html" -print0) - - # create a temporary directory to store the renamed files - temp_dir=$(mktemp -d) - - # loop through the file paths and rename them with unique names in the temporary directory - for ((i=0; i<${#file_paths[@]}; i++)); do - # generate a unique name for the file - unique_name=${sample_names[$i]}_web_summary.html - echo $unique_name - # copy the file to the temporary directory with the unique name - cp "${file_paths[$i]}" "$temp_dir/$unique_name" - # update the file path in the array - file_paths[$i]="$unique_name" - done - - # create a tarball with the renamed files in the temporary directory - tar_filename="web_summaries.tar" - tar cf "$tar_filename" -C "$temp_dir" "${file_paths[@]}" - - # extract the tarball to the current directory, removing the prefixed folder path - tar xf "$tar_filename" --strip-components=1 - - # remove the temporary directory - rm -rf "$temp_dir" - ''' - stub: - """ - touch web_summaries.tar - """ -} \ No newline at end of file diff --git a/subworkflows/finish.nf b/subworkflows/finish.nf index 5cf1a4d..a8d6cf7 100644 --- a/subworkflows/finish.nf +++ b/subworkflows/finish.nf @@ -16,10 +16,8 @@ workflow FINISH_PROJECTS { if ( params.ctg_mode == 'true'){ SYNC_MULTIQC(multiqc_ch.html_report, multiqc_ch.project_id) } - - webpack_ch = PACK_WEBSUMMARIES(multiqc_ch.project_id) - publish_ch = PUBLISH_MANIFEST(webpack_ch.project_id, 
workflow) + publish_ch = PUBLISH_MANIFEST(multiqc_ch.project_id, workflow) if ( params.ctg_mode == 'true'){ md5sum_ch = MD5SUM(publish_ch) deliver_auto_ch = DELIVER_PROJ(md5sum_ch.project_id) From d27eab371b25bc5a989282e6b7de3b19e97f0f55 Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 20 Sep 2023 09:41:28 +0200 Subject: [PATCH 22/25] removed deprecated import --- subworkflows/finish.nf | 1 - 1 file changed, 1 deletion(-) diff --git a/subworkflows/finish.nf b/subworkflows/finish.nf index a8d6cf7..280b5e7 100644 --- a/subworkflows/finish.nf +++ b/subworkflows/finish.nf @@ -4,7 +4,6 @@ include { SYNC_MULTIQC } from "../modules/ctg/sync_multiqc/main" include { PUBLISH_MANIFEST } from '../modules/publish_manifest/main' include { MULTIQC } from "../modules/multiqc/main" include { MD5SUM } from "../modules/md5sum/main" -include { PACK_WEBSUMMARIES } from "../modules/pack_websummaries/main" workflow FINISH_PROJECTS { take: From 57e210533e9ab9b0a65bfaabf23a180d19bc39bc Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 20 Sep 2023 09:42:34 +0200 Subject: [PATCH 23/25] renamed github workflow --- .github/workflows/main.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index c1cc8ee..4288566 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -4,7 +4,7 @@ on: branches: - master jobs: - example: + stub-run: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 From e5c4e1f52bde1a32af346c4a96025469a3dd534b Mon Sep 17 00:00:00 2001 From: Fattigman Date: Wed, 20 Sep 2023 09:57:20 +0200 Subject: [PATCH 24/25] removed workflow on complete --- main.nf | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/main.nf b/main.nf index b74518a..14fba9a 100755 --- a/main.nf +++ b/main.nf @@ -1,14 +1,5 @@ #!/usr/bin/env nextFlow nextflow.enable.dsl=2 -def writetofile(String text) { - def file = new File(params.nextflow_log) - def lastline = file.readLines().last() - def 
newNumber = lastline.split('#')[1].toInteger() + 1 - file.withWriterAppend { out -> - out.println(text+newNumber) - } -} - // Import modules include { GET_ANALYSISES } from "./modules/get_analysises/main.nf" // Import subworkflows @@ -50,11 +41,3 @@ workflow { VISIUM() } } - -workflow.onComplete { - if (workflow.success) { - writetofile("${new Date()} [Information] singleCellWorkflow $params.samplesheet completed successfully #") - } else { - writetofile("${new Date()} [Critical] singleCellWorkflow failed. $params.samplesheet #") - } -} \ No newline at end of file From 5457d16fb1ef82c86fb42c5eb77dbf939d0dd218 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jacob=20Karlstr=C3=B6m?= <32386766+Fattigman@users.noreply.github.com> Date: Wed, 20 Sep 2023 13:13:24 +0200 Subject: [PATCH 25/25] Update docs/10X-genomics/visium.md Co-authored-by: Henning Onsbring <66672810+henningonsbring@users.noreply.github.com> --- docs/10X-genomics/visium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/10X-genomics/visium.md b/docs/10X-genomics/visium.md index b3c5b97..772507d 100644 --- a/docs/10X-genomics/visium.md +++ b/docs/10X-genomics/visium.md @@ -1,5 +1,5 @@ # Introduction -This is where I will document how to run spaceranger. It will start very manually and hopefully in the end mature into an automated pipeline. I will make no promises, but I plan to continously revise this guide as the workflow progress. +This is a document describing how to run spaceranger. The document will be continuously updated while the pipeline matures. # Spaceranger Count **Required input files**