undefined columns selected and ‘round’ not meaningful for factors #102

Open
antonkratz opened this issue Mar 13, 2024 · 4 comments
Labels
bug Something isn't working

Comments


antonkratz commented Mar 13, 2024

I can run ggpicrust2 with the provided example data (although I cannot plot it; I opened a separate issue for that, #101), but I am struggling to get it to work with my own data.

Here is my code:

library(readr)
library(ggpicrust2)
library(tibble)
library(tidyverse)
library(ggprism)
library(patchwork)

metadata <- read_delim("/home/kratz/my_meta.tsv",
    delim = "\t",
    escape_double = FALSE,
    trim_ws = TRUE)

abundance_data <- read_delim("/home/kratz/path_abun_unstrat.tsv",
    delim = "\t",
    col_names = TRUE,
    trim_ws = TRUE)

results_file_input <- ggpicrust2(data = abundance_data,
                                 metadata = metadata,
                                 group = "biological_sex",
                                 pathway = "MetaCyc",
                                 daa_method = "LinDA",
                                 ko_to_kegg = FALSE,
                                 order = "pathway_class",
                                 p_values_bar = TRUE,
                                 x_lab = "pathway_name")

However, this fails with "Error in [.data.frame(daa_results_df, , x_lab) : undefined columns selected". The full output is:

Starting the ggpicrust2 analysis...

Reading input file or using provided data...

Performing pathway differential abundance analysis...

Sample names extracted.
Identifying matching columns in metadata...
Matching columns identified: sample_name . This is important for ensuring data consistency.
Using all columns in abundance.
Converting abundance to a matrix...
Reordering metadata...
Converting metadata to a matrix and data frame...
Extracting group information...
Running LinDA analysis...
Performing LinDA analysis...
0  features are filtered!
The filtered data has  118  samples and  389  features will be tested!
Pseudo-count approach is used.
Fit linear models ...
Completed.
Processing LinDA results...
LinDA analysis is complete.
Annotating pathways...

Starting pathway annotation...
DAA results data frame is not null. Proceeding...
KO to KEGG is set to FALSE. Proceeding with standard workflow...
Loading MetaCyc reference data...
Returning DAA results data frame...
Creating pathway error bar plots...

Error in `[.data.frame`(daa_results_df, , x_lab) : 
  undefined columns selected
In addition: Warning message:
In MicrobiomeStat::linda(abundance, LinDA_metadata_df, formula = "~Group_group_nonsense_",  :
  Some features have less than 3 nonzero values! 
                                                They have virtually no statistical power. You may consider filtering them in the analysis!
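
For readers unfamiliar with this message: "undefined columns selected" is the base R error raised when a data frame is indexed by a column name it does not contain. The sketch below reproduces it on a toy data frame. The data frame contents and its column names are made up for illustration and are not the real ggpicrust2 output; only `daa_results_df` and `x_lab` are taken from the error message above.

```r
# Toy stand-in for the annotated DAA results (illustrative columns only).
daa_results_df <- data.frame(
  feature = c("path_a", "path_b"),
  p_adjust = c(0.01, 0.20),
  stringsAsFactors = FALSE
)

x_lab <- "pathway_name"

# Selecting a column that is not present raises the base R error;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(
  daa_results_df[, x_lab],
  error = function(e) conditionMessage(e)
)
print(msg)

# Checking for the column first turns the failure into a simple diagnostic.
has_x_lab <- x_lab %in% colnames(daa_results_df)
print(has_x_lab)
```

If the same check on the real results returns FALSE, that would suggest the annotation step did not add the column named by `x_lab`, though this sketch cannot confirm why.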

Therefore, I tried the step-by-step approach instead: I started a new R session, loaded the same libraries, and then ran:

kegg_abundance <- ko2kegg_abundance("/home/kratz/path_abun_unstrat.tsv")
metadata <- read_delim("/home/kratz/my_meta.tsv", delim = "\t", escape_double = FALSE, trim_ws = TRUE)
daa_results_df <- pathway_daa(abundance = kegg_abundance, metadata = metadata, group = "biological_sex", daa_method = "ALDEx2", select = NULL, reference = NULL)

Which results in:

Sample names extracted.
Identifying matching columns in metadata...
Matching columns identified: sample_name . This is important for ensuring data consistency.
Using all columns in abundance.
Converting abundance to a matrix...
Reordering metadata...
Converting metadata to a matrix and data frame...
Extracting group information...
Running ALDEx2 with two groups. Performing t-test...
Error in Math.factor(c(2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L,  :
  ‘round’ not meaningful for factors
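
As background, this error is what base R raises whenever `round()` is applied to a factor, and the integer codes shown in the failing call (`c(2L, 1L, 2L, ...)`) look like those of a two-level factor. The sketch below reproduces the message on a toy vector. The values are invented, and it does not show where inside ggpicrust2 or ALDEx2 the factor is created.

```r
# Toy grouping variable stored as a factor (values are made up).
group <- factor(c("male", "female", "male"))

# Arithmetic-style functions such as round() are not defined for factors.
msg <- tryCatch(
  round(group),
  error = function(e) conditionMessage(e)
)
print(msg)

# The underlying integer codes follow the alphabetical level order.
codes <- as.integer(group)
print(codes)

# Converting to character recovers the original labels.
labels <- as.character(group)
print(labels)
```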

I made extra sure that the entries in the first column of the metadata precisely match the column names of the abundance data frame.
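
A check of this kind can be sketched as follows. The two data frames are small invented stand-ins; `sample_name` and `biological_sex` are the column names that appear in the log and the code above, and the assumption that the first abundance column holds the pathway identifier is mine.

```r
# Invented stand-ins for the abundance table and the metadata.
abundance <- data.frame(
  pathway = c("path_a", "path_b"),
  S1 = c(10, 0),
  S2 = c(3, 7),
  S3 = c(5, 5),
  stringsAsFactors = FALSE
)
metadata <- data.frame(
  sample_name = c("S1", "S2", "S3"),
  biological_sex = c("male", "female", "male"),
  stringsAsFactors = FALSE
)

# Assumption: the first abundance column is the pathway ID, the rest are samples.
abundance_samples <- colnames(abundance)[-1]

# Samples present on one side but not the other (both should be empty).
missing_in_metadata <- setdiff(abundance_samples, metadata$sample_name)
missing_in_abundance <- setdiff(metadata$sample_name, abundance_samples)
same_set <- setequal(abundance_samples, metadata$sample_name)
print(same_set)

# It can also help to confirm how the grouping column is stored.
group_class <- class(metadata$biological_sex)
print(group_class)
```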

Please help.

I am using R version 4.3.1

@antonkratz antonkratz added the bug Something isn't working label Mar 13, 2024
@cafferychen777
Owner

Dear Anton,

Thank you for reaching out and providing detailed information about the issues you're encountering with ggpicrust2. It seems like there might be some inconsistencies or specific characteristics in your data that are causing these errors.

To better assist you, would it be possible for you to send your data (both the metadata and abundance data) to my email at [email protected]? This will allow me to take a closer look and potentially identify the root cause of the problems.

Please ensure that any sensitive information is removed or anonymized before sending the data. I appreciate your cooperation and look forward to helping you resolve these issues.

Best regards,
Caffery Yang

@cafferychen777
Owner

cafferychen777 commented Mar 21, 2024

Dear @antonkratz,

Thank you for your patience. After reviewing your data and the errors you encountered, I have identified a solution for the issues you reported with ggpicrust2.

Regarding the "‘round’ not meaningful for factors" error, it seems like this issue is related to the ALDEx2 method. As a temporary workaround, you could try using a different differential abundance analysis (DAA) method, such as "DESeq2" or "edgeR", to see if the issue persists. Alternatively, you could try the solution provided by another user, which involves installing an older version of ALDEx2 (v.1.28) from the Bioconductor archive.
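
One way to try several DAA methods in turn is sketched below. `first_working_method` and `fake_run` are hypothetical helpers written for this illustration and are not part of ggpicrust2; `fake_run` only mimics a failure for "ALDEx2" so that the sketch runs without the package or any data.

```r
# Hypothetical helper: call run(method) for each method and return the
# first result that completes without an error.
first_working_method <- function(methods, run) {
  for (m in methods) {
    result <- tryCatch(run(m), error = function(e) e)
    if (!inherits(result, "error")) {
      return(list(method = m, result = result))
    }
    message("Method ", m, " failed: ", conditionMessage(result))
  }
  stop("No method completed successfully")
}

# Stand-in runner (an assumption, not real package behavior): it fails
# for "ALDEx2" and returns a placeholder string for anything else.
fake_run <- function(method) {
  if (method == "ALDEx2") {
    stop("'round' not meaningful for factors")
  }
  paste("results from", method)
}

out <- first_working_method(c("ALDEx2", "DESeq2", "edgeR"), fake_run)
print(out$method)
```

With ggpicrust2 loaded, the runner could instead wrap the call from the original post, for example `function(m) pathway_daa(abundance = kegg_abundance, metadata = metadata, group = "biological_sex", daa_method = m)`.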

Please try these suggestions and let me know if they resolve the issues. If you continue to encounter problems, feel free to reach out again, and I'll be happy to assist further.

Best regards,
Caffery Yang

@jrhaulung

Thank you for your quick reply!

The error occurs with ALDEx2_1.34.0 but also with ALDEx2_1.28.0.

@jrdickey9

Howdy all, I've been trying to resolve the "‘round’ not meaningful for factors" error myself and have had no luck. My group column is indeed character, not factor. The issue still arises with ALDEx2 v1.37.0 and v1.36.0. Interestingly, when I ran an ALDEx2 analysis outside of ggpicrust2, I did not get this error. I'm going to ditch ALDEx2 here (bummer), as there isn't a solid solution, and move on to DESeq2 (given #56, #67, #107). If I'm reading the comment above correctly, v1.28.0 still produces the error... is this correct? I would rather make progress with the suggested ggpicrust2 pipeline than troubleshoot version errors.
