Normalization issues with data imported from bam files #193

Open
perinom opened this issue Sep 21, 2021 · 4 comments

perinom commented Sep 21, 2021

Installed smoothly on 21/09/2021 after #192 had been addressed.

Unfortunately, 'exp <- normalize_counts(exp, data_type='tss', method="DESeq2")' fails with:

Warning in `[.data.table`(x, , !c("normalized_score")) : 
column(s) not removed because not found: [normalized_score]

exp <- normalize_counts(exp, data_type='tss', method="edgeR") fails with:

Aggregate function missing, defaulting to 'length'
Warning in `[.data.table`(x, , !c("normalized_score")) :  column(s) not removed because not found: [normalized_score]

exp <- normalize_counts(exp, data_type='tss', method="CPM") fails with:

Aggregate function missing, defaulting to 'length'

In all cases the slot with normalised data is missing; only the raw counts are available in the exp object.

Using the included sample BAM, method="DESeq2" and method="edgeR" fail with:

Error: count_matrix is not a matrix

method="CPM" fails with:

Warning in eval(jsub, SDenv, parent.frame()) :
  NAs introduced by coercion
Warning in `[.data.table`(x, , !c("normalized_score")) :
  column(s) not removed because not found: [normalized_score]

gzentner commented Nov 8, 2021

Hi there, sorry for the long silence (we both have new jobs and have been quite busy!). I ran through the workflow using some of our in-house BAMs, and while I do get

Warning in `[.data.table`(x, , !c("normalized_score")) : column(s) not removed because not found: [normalized_score]

when normalizing, the normalized counts are there. There isn't a specific slot for the normalized counts; rather, they are included in exp@counts$TSSs$raw to avoid duplicating all the information associated with the raw counts. We will discuss renaming that slot to avoid confusion.

I do get the same errors when working through the data from the vignette; we will look into that. Thanks for your patience!
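
For reference, the warning quoted above is the message data.table emits when a negated column selection names a column that is absent; the call still returns the table. The following is a minimal sketch on a toy table (the table, its column `TSS`, and its values are made up for illustration and are not TSRexploreR output). It also shows that filtering on an absent column raises an error rather than a warning, which is the kind of failure reported later in this thread.

```r
library(data.table)

# Toy table standing in for a counts table; names and values are hypothetical.
dt <- data.table(TSS = c(100L, 250L), score = c(3, 7))

# Dropping a column that is absent only warns; all existing columns are kept.
kept <- withCallingHandlers(
  dt[, !c("normalized_score")],
  warning = function(w) {
    message("warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# Filtering on an absent column is a hard error, not a warning.
filter_error <- tryCatch(
  dt[normalized_score >= 5],
  error = function(e) conditionMessage(e)
)
```

Whether this is what actually happens inside normalize_counts() or apply_threshold() has not been confirmed in this thread; the sketch only illustrates the difference between the two kinds of message.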

perinom commented Nov 11, 2021

Alright, thanks for the clarification, I'll double check the values before and after.

I was confused because many downstream functions have an argument normalized, which led me to expect the raw data to be stored next to the normalised data so that the functions could choose between them depending on the call.

Since the raw data are overwritten, should I expect downstream functions to use normalized counts even with normalized = FALSE, which is the default in all of them?

perinom commented Nov 11, 2021

I do see the normalized_score column, indeed.

However

exp <- apply_threshold(exp,
                       threshold = 5,
                       n_samples = 1,
                       use_normalized = TRUE) # default FALSE

results in
Error in eval(jsub, SDenv, parent.frame()) : object 'normalized_score' not found

which, combined with the error from normalize_counts() in the original message, led me to think the issue was with the normalization function.

use_normalized = FALSE runs without issues, but I'm a bit hesitant to proceed as I'm not sure what's being used here.

Please let me know if you prefer to have this as a separate issue.

gzentner commented

Both "score" and "normalized_score" are present in the exp@counts$TSSs$raw slot, so the raw data isn't overwritten; however, I do get the same issue when attempting to apply the threshold. We'll sort it out.
