Bacon, W.A.‡,§,1,2,3, Hamilton, R.S.‡,2,3, Yu, Z.4, Kieckbusch, J.1,2, Hawkes, D.1, Krzak, A.1, Abell, C.4, Colucci, F.1,2 & Charnock-Jones, D.S.§,1,2
‡ Co-first authors,
§ Corresponding authors
1 Department of Obstetrics & Gynaecology,
2 Centre for Trophoblast Research,
3 Department of Physiology, Development, & Neuroscience, University of Cambridge, Downing Site, Cambridge, CB2 3DY,
4 Department of Chemistry
Bacon, W.A., Hamilton, R.S., Yu, Z., Kieckbusch, J., Hawkes, D., Krzak, A., Abell, C., Colucci, F. & Charnock-Jones, D.S. (2018) Single-Cell Analysis Identifies Thymic Maturation Delay in Growth-Restricted Neonatal Mice. Front. Immunol., 9:2523 [DOI]
Preprint Version:
Bacon, W.A., Hamilton, R.S., Yu, Z., Kieckbusch, J., Hawkes, D., Krzak, A., Abell, C., Colucci, F. & Charnock-Jones, D.S. (2018) Single-Cell Analysis Identifies Thymic Maturation Delay in Growth-Restricted Neonatal Mice. bioRxiv, 372862 [DOI]
If you have any questions or suggestions, please submit an issue using the GitHub menu above; for bioinformatics-related queries, contact rsh46 -at- cam.ac.uk.
All custom analysis scripts are freely available from https://github.com/CTR-BFX/2018_Bacon_Charnock-Jones, including code to recreate the figures relating to the single-cell sequencing.
Note: the provided R scripts all assume that the script is placed in a directory containing the DEMs and/or Robjects. Each script can be run interactively in RStudio or as a batch job using Rscript. Some of the figures in the manuscript have had label positions moved manually to prevent overlaps. R package versions are listed in the sessionInfo() output below.
Raw FASTQ files were demultiplexed (dropseq_demultiplex.sh) using the Nextera indices and then converted to uBAM using PicardTools:FastqToSam (v2.9.0). Quality control, alignment (STAR v020201), gene quantification and final matrix generation were performed using DropSeqTools (v1.12, http://mccarrolllab.com/dropseq/). Alignments were performed against the mouse reference genome (mm10, available from http://mccarrolllab.com/dropseq/). The resulting digital expression matrix (DEM) was imported into Seurat (Butler et al, 2018) (v2.3.0) for downstream analysis. Two initial thresholds were applied: each cell must contain at least 200 genes, and each gene must be present in at least 3 cells.
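The two initial thresholds can be illustrated with a minimal sketch. This is not the repository's code (the actual filtering is done in R by Seurat); it is a standard-library Python illustration on a toy matrix. The function name `filter_dem`, the toy data, the inclusive comparisons, and the order of the two filters (cells first, then genes) are all assumptions of this sketch.

```python
# Hypothetical illustration of the DEM thresholds; not the repository's R code.
# The matrix is a list of rows, one row per gene, one column per cell (UMI counts).

def filter_dem(counts, genes, cells, min_genes=200, min_cells=3):
    """Drop cells with too few detected genes, then genes seen in too few cells.

    A gene counts as "detected" in a cell when its count is above zero.
    The filter order and the inclusive (>=) comparisons are assumptions.
    """
    # Step 1: keep cells in which at least `min_genes` genes are detected.
    kept_cells = [
        j for j in range(len(cells))
        if sum(1 for row in counts if row[j] > 0) >= min_genes
    ]
    # Step 2: among the remaining cells, keep genes detected in >= `min_cells` cells.
    kept_genes = [
        i for i, row in enumerate(counts)
        if sum(1 for j in kept_cells if row[j] > 0) >= min_cells
    ]
    filtered = [[counts[i][j] for j in kept_cells] for i in kept_genes]
    return filtered, [genes[i] for i in kept_genes], [cells[j] for j in kept_cells]


# Toy example with small thresholds so the effect is visible by eye.
toy_counts = [
    [1, 0, 2, 0],  # GeneA
    [3, 1, 0, 0],  # GeneB
    [0, 0, 0, 5],  # GeneC
    [2, 4, 1, 0],  # GeneD
]
toy_genes = ["GeneA", "GeneB", "GeneC", "GeneD"]
toy_cells = ["cell1", "cell2", "cell3", "cell4"]

matrix, kept_gene_names, kept_cell_names = filter_dem(
    toy_counts, toy_genes, toy_cells, min_genes=2, min_cells=2
)
print(kept_gene_names, kept_cell_names)
```

With the defaults (`min_genes=200`, `min_cells=3`) the same function applies the thresholds stated above; the toy call uses smaller values only because the example matrix is tiny.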
All downstream analysis is performed using Seurat and has been split into two scripts. The first, dropseq_seurat_splitDEMs.R, performs the more computationally intensive tasks and is intended to be run on a high performance computer. It saves the Seurat object in Robj format, which is then imported into the second script, dropseq_seurat_splitDEMs_Plots.R, for plotting and figure creation. This second script is intended for running on a laptop.
Here we outline the Seurat pipeline used for the more computationally demanding steps in dropseq_seurat_splitDEMs.R.
Two separate DEMs were calculated: one for the WT samples only and one for the combined WT and P0 samples. The WT-only samples were used to calculate variable genes (FindVariableGenes), which were then used as input to generate the PCA (RunPCA), find clusters (FindClusters) and produce a tSNE (t-distributed stochastic neighbor embedding) visualisation (RunTSNE) from the combined WT and P0 sample DEM. FindClusters is run across multiple resolutions (0.2, 0.4, 0.6, 0.8 and 1.0), with each result stored in the Seurat object. Normalisation (NormalizeData), UMI and MT regression (FilterCells) were performed using Seurat. Cell cycle assignments were performed using SCRAN (Lun et al, 2016) (v1.6.9) on the combined WT+P0 DEM, using an intermediate SingleCellExperiment (v1.0.0) data structure, and then added back to the Seurat object. Cell cycle effects were regressed out using the per-cell difference between the S and G2M cell cycle scores (S minus G2M). The resulting Seurat data object is saved as an RObj for input into the plotting and differential analysis part of the pipeline.
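The cell cycle regression step can be sketched as follows. This is a simplified, standard-library Python illustration and not the repository's R code: the function names `cc_difference` and `regress_out`, the toy scores, and the use of a simple one-covariate least-squares fit for a single gene are all assumptions made for illustration. The idea is that each cell gets a single covariate (S score minus G2M score), and the part of a gene's expression explained by that covariate is removed, leaving the residuals.

```python
# Hypothetical illustration of regressing out the S minus G2M difference;
# not the repository's R code.
from statistics import mean


def cc_difference(s_scores, g2m_scores):
    """Per-cell cell cycle covariate: S score minus G2M score."""
    return [s - g for s, g in zip(s_scores, g2m_scores)]


def regress_out(expression, covariate):
    """Residuals of a simple least-squares fit: expression ~ covariate."""
    x_bar, y_bar = mean(covariate), mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    if sxx == 0:
        # Constant covariate: nothing to fit beyond the mean.
        return [y - y_bar for y in expression]
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, expression))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(covariate, expression)]


# Toy scores for four cells (made-up values).
s_scores = [0.5, 0.1, 0.9, 0.3]
g2m_scores = [0.1, 0.4, 0.2, 0.3]
diff = cc_difference(s_scores, g2m_scores)

# A made-up gene whose expression partly tracks the covariate.
gene_expression = [2.1, 0.2, 2.3, 1.0]
residuals = regress_out(gene_expression, diff)
print([round(r, 3) for r in residuals])
```

Using the difference rather than the two scores separately removes the contrast between S and G2M phase cells while leaving the distinction between cycling and non-cycling cells in the data.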
Precalculated DEMs
Description | File Name |
---|---|
WT & KO | SLX-7632.XXXXXXXXXX.dge.txt.gz |
WT Only | SLX-7632.XXXXXXXXXX.WT.dge.txt.gz |
KO Only | SLX-7632.XXXXXXXXXX.KO.dge.txt.gz |
Here we outline the Seurat pipeline used for plotting and differential transcript identification in dropseq_seurat_splitDEMs_Plots.R.
The RObj generated by dropseq_seurat_splitDEMs.R is used to extract (e.g. with GetCellEmbeddings) the data required for each plot in the figures. Custom tSNE plots were generated using ggplot2. Transcript abundance dotplots were generated with ggplot2 from values extracted from the Seurat object using AverageExpression. Cluster trees were generated using clustree (Zappia & Oshlack, 2018). Differential transcript analysis was performed by comparing each cluster against all remaining cells (FindAllMarkers) and using an adjusted p value < 0.01. The heatmap (pheatmap) used the same threshold, with only the top 20 genes for each cluster selected.
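The marker selection used for the heatmap can be sketched as below. This is a standard-library Python illustration, not the repository's R code. The function name `top_markers` and the toy rows are hypothetical, the field names (`gene`, `cluster`, `avg_logFC`, `p_val_adj`) are assumed to mirror the columns of a marker table, and ranking the "top" genes by fold change is an assumption, since the text above does not state the ranking criterion.

```python
# Hypothetical illustration of marker filtering; not the repository's R code.
from collections import defaultdict


def top_markers(rows, alpha=0.01, n_top=20, rank_key="avg_logFC"):
    """Return up to `n_top` gene names per cluster with adjusted p value < alpha.

    Ranking by `rank_key` (largest first) is an assumption of this sketch.
    """
    by_cluster = defaultdict(list)
    for row in rows:
        if row["p_val_adj"] < alpha:
            by_cluster[row["cluster"]].append(row)
    return {
        cluster: [
            r["gene"]
            for r in sorted(hits, key=lambda r: r[rank_key], reverse=True)[:n_top]
        ]
        for cluster, hits in by_cluster.items()
    }


# Toy marker table with made-up statistics.
toy_rows = [
    {"gene": "Cd4", "cluster": 0, "avg_logFC": 1.2, "p_val_adj": 0.001},
    {"gene": "Cd8a", "cluster": 0, "avg_logFC": 2.0, "p_val_adj": 0.0005},
    {"gene": "Rag1", "cluster": 0, "avg_logFC": 0.8, "p_val_adj": 0.002},
    {"gene": "Actb", "cluster": 0, "avg_logFC": 3.0, "p_val_adj": 0.2},
    {"gene": "Hba-a1", "cluster": 1, "avg_logFC": 4.0, "p_val_adj": 1e-10},
]

# n_top=2 keeps the toy output short; the pipeline described above uses 20.
print(top_markers(toy_rows, n_top=2))
```

A gene that fails the significance threshold is dropped before ranking, so a large fold change alone (as for the made-up `Actb` row) does not place it in the heatmap selection.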
Figure | Output Filename | Description |
---|---|---|
2 | T-Cell.Figure.2.pdf | tSNE, classical cell markers, clustree and summary tSNE |
3 | T-Cell.Figure.3.pdf | Heatmap unsupervised clustering of top 20 genes per cluster |
4A | T-Cell.Figure.4A.pdf | tSNE with cells coloured by cell cycle assignments |
5B | T-Cell.Figure.5B.pdf | tSNE with cells coloured by averaged ribosomal protein transcript levels |
6A | T-Cell.Figure.6A.pdf | tSNE with cells coloured by genotype (WT and P0) |
7A | T-Cell.Figure.7A.pdf | Differential transcript levels by cell type |
7B | T-Cell.Figure.7B.pdf | Differential transcript levels WT/P0 per cell type cluster |
7C | T-Cell.Figure.7C.pdf | Comparison of WT/P0 per cluster transcripts |
Supp Fig 4 | T-Cell.Figure.Supp4.pdf | Sequencing coverage for the Igf2 genomic region |
Supp Fig 5 | T-Cell.Figure.Supp5.pdf | Cluster Mapping with and without cell cycle regression |
Supp Fig 6 | T-Cell.Figure.Supp6.pdf | Resolution Effects on Cluster Calling (tSNE & clustree) |
Supp Fig 7 | T-Cell.Figure.Supp7.pdf | Differential transcript levels by cluster size |
Supp Fig 8 | T-Cell.Figure.Supp8.pdf | Differential transcript levels WT/P0 per cluster (by size) |
Supp Fig 9 | T-Cell.Figure.Supp9.pdf | Gender assignment |
Supp Table 1 | T-Cell.Table.Supp1.xlsx | Table of Sequencing Metrics |
Supp Table 2 | T-Cell.SupplementalFiles.zip | Zip file of differential transcript level files |
Supp Movie 1 | T-Cell.RibosomeMovie.mpg | Ribosome structure with identified genes mapped on surface |
A custom tool was created to classify whether ribosomal proteins are exposed on the surface or are internal to the ribosome. See https://github.com/darogan/Ribosomal-Protein for more details. The output includes Pymol commands to render the structure showing the ribosomal proteins of interest.
ArrayExpress accession: E-MTAB-6945
Seq ID | Index | Experiment | #Cells |
---|---|---|---|
SLX-7632 | N701 | WT | 1220 |
SLX-7632 | N704 | WT | 1341 |
SLX-7632 | N705 | WT | 813 |
SLX-7632 | N706 | WT | 1305 |
SLX-7632 | N702 | P0 | 1300 |
SLX-7632 | N703 | P0 | 675 |
SLX-7632 | N707 | P0 | 610 |
Expt | IDX | Genotype | Total Reads | Unique (#) | Unique (%) | Multi (#) | Multi (%) | Many (#) | Many (%) | Unmapped (#) | Unmapped (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
SLX-7632 | N701 | WT | 32468733 | 26055586 | 80.2 | 3351483 | 10.3 | 346732 | 1.1 | 2714932 | 8.4 |
SLX-7632 | N702 | P0 | 20962102 | 16847146 | 80.4 | 1817857 | 8.7 | 214847 | 1.0 | 2082252 | 9.9 |
SLX-7632 | N703 | P0 | 11749198 | 9081883 | 77.3 | 1357304 | 11.6 | 121328 | 1.0 | 1188683 | 10.1 |
SLX-7632 | N704 | WT | 32297193 | 25720288 | 79.6 | 3029288 | 9.4 | 274716 | 0.9 | 3272901 | 10.1 |
SLX-7632 | N705 | WT | 77791423 | 56701446 | 72.9 | 11964824 | 15.4 | 672228 | 0.9 | 8452925 | 10.9 |
SLX-7632 | N706 | WT | 26006979 | 18412761 | 70.8 | 4057400 | 15.6 | 399552 | 1.5 | 3137266 | 12.1 |
SLX-7632 | N707 | P0 | 26807105 | 18534932 | 69.1 | 5136585 | 19.2 | 235983 | 0.9 | 2899605 | 10.8 |
Cluster By Size | Cluster Paper Number | WT | P0 | Cell Type |
---|---|---|---|---|
0 | 2 | 964 | 745 | DP |
1 | 3 | 730 | 743 | DP |
2 | 6 | 961 | 231 | TMat |
3 | 4 | 974 | 48 | DP |
4 | 1 | 448 | 456 | DN |
5 | 5 | 554 | 293 | DP |
6 | 8 | 28 | 57 | RBC |
7 | 7 | 20 | 12 | Macrophage |
Description | URL |
---|---|
Publications | bioRxiv Frontiers In Immunology |
Raw Data | ArrayExpress EMBL-EBI E-MTAB-6945 |
Resource | URL |
---|---|
DropSeqTools | Link |
Mouse Genome | Link |
FastQC | Link |
MultiQC | DOI |
BBMap | Link |
STAR | Link |
clustree | Link |
RibosomeStructure.pl | GitHub |
Butler A, Hoffman P, Smibert P, Papalexi E & Satija R (2018) Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nature Biotechnology 36, 411–420. DOI
Lun ATL, McCarthy DJ & Marioni JC (2016) A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5, 2122. DOI
Zappia L & Oshlack A (2018) Clustering trees: a visualisation for evaluating clusterings at multiple resolutions. GigaScience, giy083. DOI
Details for the R version and packages used to create all figures
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidyr_0.8.0 pheatmap_1.0.8 clustree_0.1.2 viridis_0.5.1 viridisLite_0.3.0 scater_1.6.3 scran_1.6.9
[8] SingleCellExperiment_1.0.0 SummarizedExperiment_1.8.1 DelayedArray_0.4.1 matrixStats_0.53.1 Biobase_2.38.0 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0
[15] IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0 BiocParallel_1.12.0 biomaRt_2.34.2 reshape2_1.4.3 useful_1.2.3
[22] Seurat_2.3.0 Matrix_1.2-14 cowplot_0.9.2 ggraph_1.0.1 ggplot2_2.2.1 dplyr_0.7.4
loaded via a namespace (and not attached):
[1] shinydashboard_0.7.0 R.utils_2.6.0 tidyselect_0.2.4 RSQLite_2.1.0 AnnotationDbi_1.40.0 htmlwidgets_1.2 grid_3.4.4 trimcluster_0.1-2
[9] ranger_0.9.0 Rtsne_0.13 munsell_0.4.3 codetools_0.2-15 ica_1.0-1 units_0.5-1 DT_0.4 statmod_1.4.30
[17] withr_2.1.2 colorspace_1.3-2 knitr_1.20 rstudioapi_0.7 geometry_0.3-6 ROCR_1.0-7 robustbase_0.92-8 dtw_1.18-1
[25] dimRed_0.1.0 lars_1.2 tximport_1.6.0 GenomeInfoDbData_1.0.0 mnormt_1.5-5 bit64_0.9-7 rhdf5_2.22.0 ipred_0.9-6
[33] diptest_0.75-7 R6_2.2.2 ggbeeswarm_0.6.0 VGAM_1.0-5 locfit_1.5-9.1 flexmix_2.3-14 DRR_0.0.3 bitops_1.0-6
[41] assertthat_0.2.0 promises_1.0.1 SDMTools_1.1-221 scales_0.5.0 nnet_7.3-12 beeswarm_0.2.3 gtable_0.2.0 ddalpha_1.3.2
[49] timeDate_3043.102 rlang_0.2.0 CVST_0.2-1 scatterplot3d_0.3-41 RcppRoll_0.2.2 splines_3.4.4 lazyeval_0.2.1 ModelMetrics_1.1.0
[57] acepack_1.4.1 broom_0.4.4 checkmate_1.8.5 yaml_2.1.18 abind_1.4-5 backports_1.1.2 httpuv_1.4.0 Hmisc_4.1-1
[65] caret_6.0-79 tools_3.4.4 lava_1.6.1 psych_1.8.3.3 gplots_3.0.1 RColorBrewer_1.1-2 proxy_0.4-22 dynamicTreeCut_1.63-1
[73] ggridges_0.5.0 Rcpp_0.12.16 plyr_1.8.4 base64enc_0.1-3 progress_1.1.2 zlibbioc_1.24.0 purrr_0.2.4 RCurl_1.95-4.10
[81] prettyunits_1.0.2 rpart_4.1-13 pbapply_1.3-4 zoo_1.8-1 sfsmisc_1.1-2 ggrepel_0.7.0 cluster_2.0.7-1 magrittr_1.5
[89] data.table_1.10.4-3 lmtest_0.9-36 RANN_2.5.1 mvtnorm_1.0-7 fitdistrplus_1.0-9 xtable_1.8-2 mime_0.5 XML_3.98-1.11
[97] mclust_5.4 gridExtra_2.3 compiler_3.4.4 tibble_1.4.2 KernSmooth_2.23-15 R.oo_1.21.0 htmltools_0.3.6 later_0.7.1
[105] segmented_0.5-3.0 Formula_1.2-2 snow_0.4-2 udunits2_0.13 tclust_1.3-1 lubridate_1.7.4 DBI_0.8 diffusionMap_1.1-0
[113] tweenr_0.1.5 magic_1.5-8 MASS_7.3-49 fpc_2.1-11 R.methodsS3_1.7.1 gdata_2.18.0 metap_0.8 bindr_0.1.1
[121] gower_0.1.2 igraph_1.2.1 pkgconfig_2.0.1 sn_1.5-1 numDeriv_2016.8-1 foreign_0.8-69 recipes_0.1.2 foreach_1.4.4
[129] vipor_0.4.5 XVector_0.18.0 prodlim_2018.04.18 stringr_1.3.0 digest_0.6.15 tsne_0.1-3 htmlTable_1.11.2 edgeR_3.20.9
[137] kernlab_0.9-25 shiny_1.0.5 gtools_3.5.0 modeltools_0.2-21 rjson_0.2.15 nlme_3.1-137 bindrcpp_0.2.2 limma_3.34.9
[145] pillar_1.2.1 lattice_0.20-35 httr_1.3.1 DEoptimR_1.0-8 survival_2.42-3 glue_1.2.0 FNN_1.1 png_0.1-7
[153] prabclus_2.2-6 iterators_1.0.9 bit_1.1-12 ggforce_0.1.1 class_7.3-14 stringi_1.1.7 mixtools_1.1.0 blob_1.1.1
[161] doSNOW_1.0.16 latticeExtra_0.6-28 caTools_1.17.1 memoise_1.1.0 irlba_2.3.2 ape_5.1