Error after running parallel on HPC #7

Open
saeedfc opened this issue Dec 18, 2022 · 1 comment

saeedfc commented Dec 18, 2022

Hi, I am trying to run GSVA on a Seurat object with a gene set database assembled from multiple sources. I have already tested it with a subset of the Seurat object and a subset of the database.
For instance, the following works on my local machine:

> head(gene_annodf)

  gene_symbol gene_symbol               gs_name
1       ABCA1       ABCA1 HALLMARK_ADIPOGENESIS
2       ABCB8       ABCB8 HALLMARK_ADIPOGENESIS
3       ACAA2       ACAA2 HALLMARK_ADIPOGENESIS
4       ACADL       ACADL HALLMARK_ADIPOGENESIS
5       ACADM       ACADM HALLMARK_ADIPOGENESIS
6       ACADS       ACADS HALLMARK_ADIPOGENESIS
## subset of annotations
> table(gene_annodf[1:300,]$gs_name)

       HALLMARK_ADIPOGENESIS HALLMARK_ALLOGRAFT_REJECTION 
                         210                           90 

> gsva_res <- scgsva(
    fib[,1:50], ## SUBSET OF SEUOBJECT
    annot = gene_annodf[1:300,], ## SUBSET OF ANNOTATIONS
    kcdf = "Poisson",
    abs.ranking = FALSE,
    min.sz = 1,
    max.sz = 500,
    mx.diff = TRUE,
    method = "gsva",
    useTerm = TRUE,
    cores = 10,
    verbose = TRUE
)
> dim(gsva_res@gsva)
[1] 50  2
> head(gsva_res@gsva)
                             HALLMARK_ADIPOGENESIS HALLMARK_ALLOGRAFT_REJECTION
CC.C_1_Prox_AAAGAACGTGGATCGA           -0.18195536                   -0.2752013
CC.C_1_Prox_AAAGTCCAGAGTTGAT           -0.14447515                   -0.3477315
CC.C_1_Prox_AAAGTCCGTATCGAGG           -0.17582418                   -0.4758325
CC.C_1_Prox_AACAACCCAATTGAGA           -0.24332880                   -0.3421558
CC.C_1_Prox_AACACACCACAGTCGC           -0.05431398                   -0.4594946
CC.C_1_Prox_AACCACATCCAGCACG           -0.04031415                   -0.3908996

(Apparently, a two-column annotation data frame does not work, contrary to what was suggested in #6.) I then ran this on the full Seurat object and the full annotation data frame on the HPC with 35 cores (21 GB of memory per core). It ran for 23 hours and then threw an error that I do not understand. Any help is greatly appreciated.

Input details

> dim(gene_annodf)
[1] 1823001       3
> length(unique(gene_annodf$gs_name))
[1] 18134
> head(gene_annodf)

  gene_symbol gene_symbol               gs_name
1       ABCA1       ABCA1 HALLMARK_ADIPOGENESIS
2       ABCB8       ABCB8 HALLMARK_ADIPOGENESIS
3       ACAA2       ACAA2 HALLMARK_ADIPOGENESIS
4       ACADL       ACADL HALLMARK_ADIPOGENESIS
5       ACADM       ACADM HALLMARK_ADIPOGENESIS
6       ACADS       ACADS HALLMARK_ADIPOGENESIS


> fib

An object of class Seurat 
33538 features across 20140 samples within 1 assay 
Active assay: RNA (33538 features, 0 variable features)
 1 dimensional reduction calculated: umap

> gsva_res <- scgsva(
    fib,
    annot = gene_annodf,
    kcdf = "Poisson",
    abs.ranking = FALSE,
    min.sz = 1,
    max.sz = 500,
    mx.diff = TRUE,
    method = "gsva",
    useTerm = TRUE,
    cores = 35,
    verbose = TRUE
)
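
A side note on the gene set counts: the annotation has 18134 unique `gs_name` values, while the log further down reports scores for 18127 gene sets. One possible explanation is that GSVA keeps only the genes of each set that are present in the expression matrix and then drops sets whose size falls outside `min.sz`/`max.sz`. The sketch below mimics that filter in base R on made-up toy data; `toy_annot`, `expr_genes`, and the `min_sz = 2` threshold are illustrative assumptions, not values from this run.

```r
# Toy annotation with a gene column and a gene set column (values are made up).
toy_annot <- data.frame(
  gene_symbol = c("ABCA1", "ABCB8", "ACAA2", "CD3E", "CD4"),
  gs_name = c(rep("HALLMARK_ADIPOGENESIS", 3),
              rep("HALLMARK_ALLOGRAFT_REJECTION", 2)),
  stringsAsFactors = FALSE
)

# Hypothetical row names of the expression matrix.
expr_genes <- c("ABCA1", "ABCB8", "CD4", "ACTB")

# Split into a named list of gene sets and keep only genes present in the data.
gene_sets <- split(toy_annot$gene_symbol, toy_annot$gs_name)
gene_sets <- lapply(gene_sets, function(g) intersect(unique(g), expr_genes))

# Apply a size filter analogous to min.sz / max.sz.
set_sizes <- lengths(gene_sets)
min_sz <- 2
max_sz <- 500
kept <- names(set_sizes)[set_sizes >= min_sz & set_sizes <= max_sz]

print(set_sizes)
print(kept)
```

Running the same size summary on the real annotation and `rownames(fib)` would show which sets end up empty or oversized before a 23-hour job is submitted.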

Error!!

Attaching SeuratObject
iteration:  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
Error in reducer$value.cache[[as.character(idx)]] <- values : 
  wrong args for environment subassignment
Calls: scgsva -> .sgsva
In addition: Warning message:
In asMethod(object) :
  sparse->dense coercion: allocating vector of size 4.1 GiB
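
For scale on the sparse->dense warning above, here is a minimal sketch, assuming the dense matrix holds 8-byte doubles, of what the full 33538 x 20140 object from this thread would need if densified. The helper name `dense_gib` is made up for illustration. The 4.1 GiB in the warning is smaller than this figure, so the coerced matrix was presumably not the full one (for example, some genes may have been dropped first); that is a guess, not something the log confirms.

```r
# Memory needed for a dense numeric matrix, in GiB (8 bytes per double).
dense_gib <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 2^30
}

# Dimensions of the Seurat object reported in this thread.
full_gib <- dense_gib(33538, 20140)
print(round(full_gib, 2))  # about 5.03 GiB for one dense copy
```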

Details of the error (traceback written to the HPC output file)

Setting parallel calculations through a MulticoreParam back-end
with workers=35 and tasks=100.
Estimating GSVA scores for 18127 gene sets.
Estimating ECDFs with Poisson kernels
Estimating ECDFs in parallel on 35 cores

  |                                                                            
  |                                                                      |   0%
19: (function () 
    {
        traceback(2, max.lines = 100)
        if (!interactive()) 
            quit(save = "no", status = 1, runLast = T)
    })()
18: .reducer_add(reducer, njob, value)
17: .reducer_add(reducer, njob, value)
16: .collect_result(manager, reducer, progress, BPPARAM)
15: .bploop_impl(ITER = ITER, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, 
        BPOPTIONS = BPOPTIONS, BPREDO = BPREDO, reducer = reducer, 
        progress.length = length(redo_index))
14: bploop.lapply(manager, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, 
        ...)
13: bploop(manager, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, ...)
12: .bpinit(manager = manager, X = X, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, 
        BPOPTIONS = BPOPTIONS, BPREDO = BPREDO)
11: bplapply(gset.idx.list, ks_test_m, gene.density = rank.scores, 
        sort.idxs = sort.sgn.idxs, mx.diff = mx.diff, abs.ranking = abs.ranking, 
        tau = tau, verbose = verbose, BPPARAM = BPPARAM)
10: bplapply(gset.idx.list, ks_test_m, gene.density = rank.scores, 
        sort.idxs = sort.sgn.idxs, mx.diff = mx.diff, abs.ranking = abs.ranking, 
        tau = tau, verbose = verbose, BPPARAM = BPPARAM)
9: compute.geneset.es(expr, gset.idx.list, 1:n.samples, rnaseq = rnaseq, 
       abs.ranking = abs.ranking, parallel.sz = parallel.sz, mx.diff = mx.diff, 
       tau = tau, kernel = kernel, verbose = verbose, BPPARAM = BPPARAM)
8: .gsva(expr, mapped.gset.idx.list, method, kcdf, rnaseq, abs.ranking, 
       parallel.sz, mx.diff, tau, kernel, ssgsea.norm, verbose, 
       BPPARAM)
7: .local(expr, gset.idx.list, ...)
6: gsva(input, annotation, method = method, kcdf = kcdf, tau = tau, 
       ssgsea.norm = ssgsea.norm, parallel.sz = cores, BPPARAM = SerialParam(progressbar = verbose))
5: gsva(input, annotation, method = method, kcdf = kcdf, tau = tau, 
       ssgsea.norm = ssgsea.norm, parallel.sz = cores, BPPARAM = SerialParam(progressbar = verbose))
4: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
3: suppressWarnings(gsva(input, annotation, method = method, kcdf = kcdf, 
       tau = tau, ssgsea.norm = ssgsea.norm, parallel.sz = cores, 
       BPPARAM = SerialParam(progressbar = verbose)))
2: .sgsva(input = input, annotation = annotation, method = method, 
       kcdf = kcdf, abs.ranking = abs.ranking, min.sz = min.sz, 
       max.sz = max.sz, cores = cores, tau = tau, ssgsea.norm = ssgsea.norm, 
       verbose = verbose)
1: scgsva(fib, annot = gene_annodf, kcdf = "Poisson", abs.ranking = FALSE, 
       min.sz = 1, max.sz = 500, mx.diff = TRUE, method = "gsva", 
       useTerm = TRUE, cores = 35, verbose = TRUE)
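
For readers unfamiliar with the message itself: in base R, `[[<-` on an environment requires the index to be a single character string, and "wrong args for environment subassignment" is what R raises otherwise. The sketch below reproduces the message outside BiocParallel; `reducer_cache` and `try_assign` are made-up names. It only suggests that `idx` inside `.reducer_add` was not a single value at that point; it does not explain why.

```r
# An environment used as a cache, similar in spirit to reducer$value.cache.
reducer_cache <- new.env()

# Try the same kind of assignment as in the error and report what happens.
try_assign <- function(idx) {
  tryCatch({
    reducer_cache[[as.character(idx)]] <- "value"
    "assigned"
  }, error = function(e) conditionMessage(e))
}

msg_single <- try_assign(3)           # one index: works
msg_empty  <- try_assign(integer(0))  # zero-length index: fails
msg_multi  <- try_assign(1:2)         # more than one index: fails

print(c(single = msg_single, empty = msg_empty, multi = msg_multi))
```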

guokai8 (Owner) commented Dec 18, 2022

Hi @saeedfc,
Please try method = "ssgsea" and see if it works.
K
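
A sketch of what that rerun might look like, reusing the arguments from the failing call above and changing only `method`. The package name `scGSVA` and the guard around the call are assumptions added so the snippet is harmless when the package or the objects `fib` and `gene_annodf` are not available; whether all remaining arguments are still used under `method = "ssgsea"` is not confirmed in this thread.

```r
# Arguments copied from the failing call; only `method` is different.
ssgsea_args <- list(
  kcdf = "Poisson",
  abs.ranking = FALSE,
  min.sz = 1,
  max.sz = 500,
  mx.diff = TRUE,
  method = "ssgsea",
  useTerm = TRUE,
  cores = 35,
  verbose = TRUE
)

# Run only when the package and the objects from this thread exist.
if (requireNamespace("scGSVA", quietly = TRUE) &&
    exists("fib") && exists("gene_annodf")) {
  gsva_res <- do.call(
    scGSVA::scgsva,
    c(list(fib, annot = gene_annodf), ssgsea_args)
  )
}
```

Trying it first on the small subset that already worked (`fib[, 1:50]` and `gene_annodf[1:300, ]`) would confirm the call is accepted before committing HPC time.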
