Use of FlashWeave on meta-omic protein data? #40

Open
Rridley7 opened this issue Jun 3, 2024 · 1 comment

Comments

@Rridley7

Rridley7 commented Jun 3, 2024

Hello, thanks for developing this tool! I was curious about your thoughts on using it on meta-omic datasets that comprise genetic information (e.g. metatranscriptomics, gene-level metagenomics). My initial questions would be:

  • These datasets are obviously sparser and potentially higher-variance, but could they still fall under the same statistical framework used here?
  • Does the code base scale to data of this size on an HPC system (20+ million genes, several hundred samples)?
@jtackm
Member

jtackm commented Jun 5, 2024

Hi,

Yes, the framework is applicable to many types of data, though it has only been properly benchmarked on OTU counts plus meta variables. Scalability should in principle not be an issue: I've run it on tables as large as 1 million samples x 100k variables. One thing to consider: the high dimensionality of your data combined with the relatively low sample count could lead to power issues when running with default parameters (in particular, max_k=3 would have to be lowered to 2 or even 1).

The main consideration would be normalization, since different types of data may require different normalization methods. I suggest manually normalizing your tables as appropriate and specifying normalize=false in learn_network. FlashWeave already includes a (poorly documented) feature for providing several independently normalized tables, which should be useful here, e.g.: learn_network([<norm_omic1_data_path>, <norm_omic2_data_path>], meta_data_path; normalize=false, <kwargs...>)
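A minimal sketch of what the manual normalization step could look like, using only the Julia standard library. Everything in it is an assumption made for illustration: the centered log-ratio (CLR) transform with a pseudo-count is just one possible choice and not a method recommended in this thread, the toy count matrices and file names are made up, and the tables are assumed to have samples in rows and features in columns. The helper names (clr, clr_rows, write_table) are hypothetical and not part of FlashWeave.

```julia
using Statistics      # mean
using DelimitedFiles  # writedlm

# CLR transform of one sample (a vector of non-negative counts).
# The pseudo-count avoids log(0); its value is an arbitrary illustrative choice.
function clr(counts::AbstractVector{<:Real}; pseudo_count::Real=1.0)
    logs = log.(counts .+ pseudo_count)
    return logs .- mean(logs)
end

# Apply the CLR transform to every row of a samples x features matrix.
function clr_rows(table::AbstractMatrix{<:Real}; pseudo_count::Real=1.0)
    out = Matrix{Float64}(undef, size(table))
    for i in axes(table, 1)
        out[i, :] = clr(view(table, i, :); pseudo_count=pseudo_count)
    end
    return out
end

# Write a table as TSV with a header row and a leading sample-ID column.
function write_table(path, sample_ids, feature_ids, values)
    open(path, "w") do io
        writedlm(io, permutedims(vcat("sample", feature_ids)), '\t')
        writedlm(io, hcat(sample_ids, values), '\t')
    end
end

# Toy data (made up): 2 samples, two omics layers with 3 features each.
sample_ids = ["s1", "s2"]
genes = [10 0 5; 3 7 0]
transcripts = [0 2 8; 4 4 1]

# Normalize each layer independently, as suggested above.
norm_genes = clr_rows(genes)
norm_transcripts = clr_rows(transcripts)

out_dir = mktempdir()
gene_path = joinpath(out_dir, "norm_genes.tsv")
transcript_path = joinpath(out_dir, "norm_transcripts.tsv")
write_table(gene_path, sample_ids, ["g1", "g2", "g3"], norm_genes)
write_table(transcript_path, sample_ids, ["t1", "t2", "t3"], norm_transcripts)

# With FlashWeave installed, the two files could then be passed as in the call
# quoted above (meta_data_path is a placeholder for your own metadata file):
#   learn_network([gene_path, transcript_path], meta_data_path; normalize=false)
```

Each layer is transformed on its own, so a different function could be substituted per table if, say, transcript abundances call for a different treatment than gene counts. How zeros should be handled after a custom normalization is not covered in this thread and would need to be checked against the FlashWeave documentation.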
