UnPaSt

UnPaSt is a novel method for identification of differentially expressed biclusters.

Requirements:

Python (version 3.8.16):
    fisher==0.1.9
    jenkspy==0.2.0
    pandas==1.3.5
    python-louvain==0.15
    matplotlib==3.7.1
    seaborn==0.11.1
    numba==0.51.2
    numpy==1.22.3
    scikit-learn==1.2.2
    scikit-network==0.24.0
    scipy==1.7.1
    statsmodels==0.13.2
    lifelines==0.27.4

R (version 4.3.1):
    WGCNA==1.70-3
    limma==3.42.2

Installation tips

It is recommended to use "BiocManager" for the installation of WGCNA:

install.packages("BiocManager")
library(BiocManager)
BiocManager::install("WGCNA")

Examples

UnPaSt requires a tab-separated file with features (e.g. genes) in rows, and samples in columns. Feature and sample names must be unique.
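The input requirements above can be checked before a run. The following is a minimal sketch, not part of UnPaSt: the function name `check_expression_table` is hypothetical, and the code assumes the first header cell labels the feature column, so sample names start in the second cell.

```python
import csv
import io


def check_expression_table(handle):
    """Return (n_features, n_samples); raise ValueError if names are not unique."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Assumption: the first header cell belongs to the feature-name column.
    samples = header[1:]
    if len(set(samples)) != len(samples):
        raise ValueError("sample names are not unique")
    seen = set()
    for row in reader:
        if not row:
            continue  # skip blank lines
        feature = row[0]
        if feature in seen:
            raise ValueError(f"duplicate feature name: {feature}")
        if len(row) - 1 != len(samples):
            raise ValueError(f"row {feature} has {len(row) - 1} values, "
                             f"expected {len(samples)}")
        seen.add(feature)
    return len(seen), len(samples)


# Tiny made-up table: two features (rows) and three samples (columns).
demo = "gene\tS1\tS2\tS3\nTP53\t1.2\t0.4\t2.2\nBRCA1\t0.1\t0.3\t0.2\n"
print(check_expression_table(io.StringIO(demo)))
```

For a gzipped file such as the example data, the handle can be opened with `gzip.open(path, "rt")` from the standard library.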

cd test;
mkdir -p results;

# running UnPaSt with default parameters and example data
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500

# with different binarization and clustering methods
python ../run_unpast.py --exprs scenario_B500.exprs.tsv.gz --basename results/scenario_B500 --binarization ward --clustering Louvain

# help
python ../run_unpast.py -h

Outputs

<basename>.[parameters].biclusters.tsv - a .tsv table with the biclusters found, where
- the first line starts with '#' and stores the parameters
- each following line represents a bicluster
- the "SNR" column contains the SNR of a bicluster
- columns "n_genes" and "n_samples" provide the numbers of genes and samples, respectively
- "gene" and "sample" contain gene and sample names, respectively
- "gene_indexes" and "sample_indexes" contain 0-based gene and sample indexes in the input matrix.

Binarized expressions, background distributions of SNR for each bicluster size, and binarization statistics are also saved if clustering is WGCNA or the '--save_binary' flag is added.
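The bicluster table described above can be read with the standard library. This is a minimal sketch under stated assumptions: `read_biclusters` is a hypothetical helper, the demo content is made up, and both the leading "id" column and the space-separated names inside cells are guesses about the layout, not something this README specifies.

```python
import csv
import io


def read_biclusters(handle):
    """Return (parameter_line, list of bicluster dicts) from a biclusters table."""
    first = handle.readline()
    if not first.startswith("#"):
        raise ValueError("expected the first line to start with '#'")
    params = first.lstrip("#").strip()
    rows = []
    # The line after the parameter line is treated as the column header.
    for row in csv.DictReader(handle, delimiter="\t"):
        row["SNR"] = float(row["SNR"])
        row["n_genes"] = int(row["n_genes"])
        row["n_samples"] = int(row["n_samples"])
        rows.append(row)
    return params, rows


# Made-up example content; the "id" column and cell separators are assumptions.
demo = (
    "#hypothetical parameter string\n"
    "id\tSNR\tn_genes\tn_samples\tgene\tsample\tgene_indexes\tsample_indexes\n"
    "0\t2.5\t2\t3\tG1 G2\tS1 S2 S3\t0 1\t0 1 2\n"
    "1\t1.8\t1\t2\tG7\tS2 S5\t6\t1 4\n"
)
params, biclusters = read_biclusters(io.StringIO(demo))
best = max(biclusters, key=lambda b: b["SNR"])
print(params, best["n_genes"], best["n_samples"])
```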

About

UnPaSt is an unconstrained version of the DESMOND method (repository, publication).

Major modifications:

- it does not require a network of feature interactions
- it clusters individual features instead of pairs of features
- it uses 2-means, hierarchical clustering or GMM for binarization of individual gene expressions
- the SNR threshold for feature selection is determined automatically; it depends on the bicluster size in samples and a user-defined p-value cutoff
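To illustrate what binarizing a single feature with 2-means could look like, here is a minimal pure-Python sketch. It is not UnPaSt's implementation: the function names are hypothetical, and the SNR formula used here (absolute difference of group means divided by the sum of group standard deviations) is an assumed, commonly used definition that this README does not confirm. The automatic threshold selection is not shown.

```python
from statistics import mean, stdev


def two_means_1d(values, n_iter=100):
    """Split 1-D values into two groups (Lloyd's algorithm); True marks the upper group."""
    lo, hi = min(values), max(values)  # initial centers: the extremes
    labels = [False] * len(values)
    for _ in range(n_iter):
        # Assign each value to the nearer center.
        labels = [abs(v - hi) < abs(v - lo) for v in values]
        upper = [v for v, is_up in zip(values, labels) if is_up]
        lower = [v for v, is_up in zip(values, labels) if not is_up]
        if not upper or not lower:
            break  # degenerate split, e.g. all values identical
        new_lo, new_hi = mean(lower), mean(upper)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels


def snr(values, labels):
    """Assumed SNR definition: |mean_in - mean_out| / (sd_in + sd_out)."""
    inside = [v for v, is_in in zip(values, labels) if is_in]
    outside = [v for v, is_in in zip(values, labels) if not is_in]
    if len(inside) < 2 or len(outside) < 2:
        raise ValueError("each group needs at least two samples")
    return abs(mean(inside) - mean(outside)) / (stdev(inside) + stdev(outside))


# Made-up expression values of one gene across seven samples.
expr = [0.1, 0.3, 0.2, 0.0, 2.9, 3.1, 3.0]
labels = two_means_1d(expr)
print(labels, round(snr(expr, labels), 2))
```

In this sketch, samples labeled True form the candidate sample set for the feature, and a feature would be kept only if its SNR exceeded a threshold chosen for that sample-set size.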
