diff --git a/docs/src/contents.rst b/docs/src/contents.rst index 1153937..98c70f5 100644 --- a/docs/src/contents.rst +++ b/docs/src/contents.rst @@ -33,6 +33,7 @@ NetworkCommons: Table of Contents :caption: Vignettes vignettes/1_quickstart + vignettes/2_multiple_methods vignettes/2_moon vignettes/3_evaluation_decryptm vignettes/4_cptac_phosphoactivity \ No newline at end of file
diff --git a/docs/src/vignettes/2_multiple_methods.ipynb b/docs/src/vignettes/2_multiple_methods.ipynb new file mode 100644 index 0000000..508d677 --- /dev/null +++ b/docs/src/vignettes/2_multiple_methods.ipynb @@ -0,0 +1,3521 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Vignette 2: Using multiple methods to infer the networks" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Welcome to our second vignette! This vignette assumes you already know the basics of `NetworkCommons`. If you don't, please start with the [introduction vignette](1_quickstart.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this vignette, we expand on the first one and perform network inference using several of the methods included in this package. We start with the simplest, topological methods and then move on to more complex ones, such as diffusion-like and integer linear programming (ILP)-based methods. A more detailed description of each method is available in the [dedicated section](../methods.html)." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import networkcommons as nc\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Loading preprocessed transcriptomics data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As in the previous vignette, we will use a specific contrast from the [PANACEA](../datasets.html#panacea) dataset (Afatinib versus DMSO in the ASPC cell line) to extract the transcription factors (TFs) that are dysregulated in this scenario." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "dc_estimates = nc.data.omics.panacea_tables(cell_line='ASPC', drug='AFATINIB', type='TF_scores')\n", + "dc_estimates.set_index('items', inplace=True)\n", + "measurements = nc.utils.targetlayer_formatter(dc_estimates, act_col='act')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We now have our set of TF measurements, which encodes the effects of the drug-induced perturbation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Afatinib inhibits EGFR, so we create a source dictionary containing the origin of the perturbation, EGFR, and its sign (negative, therefore -1)." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "source_df = pd.DataFrame({'source': ['EGFR'], \n", + " 'sign': [-1]}, columns=['source', 'sign'])\n", + "source_df.set_index('source', inplace=True)\n", + "sources = source_df['sign'].to_dict()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Network inference" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will use these TF measurements as footprints of the perturbation induced by Afatinib to contextualise a general protein-protein interaction (PPI) network retrieved from OmniPath."
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "network = nc.data.network.get_omnipath()\n", + "graph = nc.utils.network_from_df(network)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.1 Topological methods: shortest paths, all paths" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "NetworkCommons includes several topological methods. They rest on very simple assumptions and are therefore well suited as baselines against which to compare more advanced methods." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Shortest paths**\n", + "\n", + "This method retrieves the shortest path between source and target nodes. When several shortest paths have the same length, all of them are retrieved."
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "shortest_path_network, shortest_paths_list = nc.methods.run_shortest_paths(graph, sources, measurements)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[SVG output: shortest-path network from EGFR to the measured TFs]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(shortest_path_network)\n", + "visualizer.visualize_network(sources, measurements)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can use the signs of the network edges to add a further constraint to the problem by removing paths that are not sign-consistent. The algorithm computes the overall sign of a path by multiplying the signs of its edges and then checks whether `perturbation_sign * path_sign = measurement_sign`. If the equality does not hold, the path is discarded."
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "shortest_sc_network, shortest_sc_list = nc.methods.run_sign_consistency(shortest_path_network, shortest_paths_list, sources, measurements)" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[SVG output: sign-consistent shortest-path network from EGFR to the measured TFs]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(shortest_sc_network)\n", + "visualizer.visualize_network(sources, measurements, network_type='sign_consistent')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "So far, we used the measurements to decide which paths to keep and which to discard as inconsistent with the observations. When the sign of a measurement is unknown, it can be inferred from how the paths agree on the sign of that downstream node. To do so, call `run_sign_consistency()` without a target dictionary: the function then infers the signs of the downstream layer and returns them as an additional dictionary."
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "No target sign provided. Inferring target signs by majority consensus.\n" + ] + } + ], + "source": [ + "shortest_sc_network_inferred, shortest_sc_list_inferred, inferred_signs = nc.methods.run_sign_consistency(shortest_path_network, shortest_paths_list, sources)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[SVG output: sign-consistent shortest-path network with inferred target signs]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(shortest_sc_network_inferred)\n", + "visualizer.visualize_network(sources, inferred_signs, network_type='sign_consistent')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**All paths**\n", + "\n", + "As an alternative, we can retrieve all possible paths between the source and measurement layers (up to a limit, due to computational constraints). This may capture additional biological information that more restrictive methods, such as shortest paths, would miss."
+ ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "all_paths_network, all_paths_list = nc.methods.run_all_paths(graph, sources, measurements, depth_cutoff=3)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[SVG output: all-paths network (depth_cutoff=3) from EGFR to the measured TFs]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = 
nc.visual.NetworkXVisualizer(all_paths_network)\n", + "visualizer.visualize_network(sources, measurements)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As in the previous case, we can apply an additional constraint by removing paths that are not sign-consistent." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "allpaths_sc_network, allpaths_sc_list = nc.methods.run_sign_consistency(all_paths_network, all_paths_list, sources, measurements)" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[SVG output: sign-consistent all-paths network from EGFR to the measured TFs]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(allpaths_sc_network)\n", + "visualizer.visualize_network(sources, measurements, network_type='sign_consistent')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.2 Diffusion-like methods: Personalised PageRank" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Several other methods follow a heat-diffusion philosophy: \"heat\" propagates along the network topology, so the \"hottest\" nodes are those in the perturbation and measurement layers, and nodes get cooler the further they are from these layers.\n", + "\n", + "Here we use personalised PageRank (PPR), in which the restart probabilities of the random walker are biased towards a chosen set of nodes, as a computationally inexpensive implementation of this idea.\n", + "\n", + "We compute PPR values starting from both the perturbation and the measurement layers, and then apply a threshold (the top percentage of nodes ranked by PPR value) to keep only the part of the network that is most reachable from both layers." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "ppr_network = nc.methods.add_pagerank_scores(graph, sources, measurements, personalize_for='source')\n", + "ppr_network = nc.methods.add_pagerank_scores(ppr_network, sources, measurements, personalize_for='target')\n", + "\n", + "ppr_network = nc.methods.compute_ppr_overlap(ppr_network, percentage=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can use a path-recovery method, such as shortest paths or all paths, to retrieve a smaller subnetwork."
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "shortest_ppr_network, shortest_ppr_list = nc.methods.run_shortest_paths(ppr_network, sources, measurements)" + ] + },
+ { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[Figure: shortest-path network on the PPR-filtered graph. Edges: EGFR->MAP2K1, EGFR->ERBB2, EGFR->PTPN1, EGFR->PIK3R1, EGFR->PRKDC, EGFR->E2F1, EGFR->STAT1, EGFR->ESR1, MAP2K1->MAPK3, MAP2K1->MAPK1, MAP2K1->GSK3B, ERBB2->CDK1, PTPN1->ABL1, PIK3R1->AKT1, PRKDC->AKT1, E2F1->HIC1, E2F1->TP73, STAT1->HIF1A, ESR1->RXRA, MAPK3->SP3, MAPK1->SP3, MAPK1->CSNK2A1, GSK3B->SFPQ, GSK3B->NFKB1, CDK1->CSNK2A1, ABL1->ATR, ABL1->HIPK2, AKT1->SMARCC1, AKT1->PHF20, RXRA->NR1H4, CSNK2A1->FOSB, ATR->KMT2A, HIPK2->MECP2, KMT2A->FOXC2]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(shortest_ppr_network)\n", + "visualizer.visualize_network(sources, measurements)" + ] + },
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also add sign consistency checks:" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [], + "source": [ + "shortest_sc_ppr_network, shortest_sc_ppr_list = nc.methods.run_sign_consistency(shortest_ppr_network, shortest_ppr_list, sources, measurements)" + ] + },
+ { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[Figure: sign-consistent shortest-path network on the PPR-filtered graph. Edges: EGFR->ERBB2, EGFR->PTPN1, EGFR->PIK3R1, EGFR->PRKDC, EGFR->E2F1, EGFR->STAT1, ERBB2->CDK1, PTPN1->ABL1, PIK3R1->AKT1, PRKDC->AKT1, E2F1->HIC1, STAT1->HIF1A, CDK1->CSNK2A1, ABL1->ATR, AKT1->SMARCC1, AKT1->PHF20, CSNK2A1->FOSB, ATR->KMT2A, KMT2A->FOXC2]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(shortest_sc_ppr_network)\n", + "visualizer.visualize_network(sources, measurements, network_type='sign_consistent')" + ] + },
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 2.3 Integer linear programming: CORNETO" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Lastly, we will use Carnival via CORNETO, which formulates network contextualisation as an integer linear programming problem. However, it is the most computationally expensive of all the methods included in `Networkcommons`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "\n", + "**Warning**\n", + " \n", + "This section of the tutorial uses CORNETO, a package specialised in integer linear programming for network inference. Some CORNETO methods (such as Carnival) depend on GUROBI, a third-party solver distributed under a commercial license. Therefore, in order to run this part of the code, you will have to install a license on your system. Please check the GUROBI home page for more information.\n", + "\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "corneto_network = nc.methods.run_corneto_carnival(graph, sources, measurements, betaWeight=0.01, solver='GUROBI')" + ] + },
+ { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "[Figure: network inferred by Carnival via CORNETO. Edges: AKT1->SMARCC1, AKT1->GSK3B, AKT1->PHF20, GSK3B->NFKB1, GSK3B->PHLPP1, GSK3B->SFPQ, PHLPP1->PRKCA, PRKCA->NR1H4, PRKCA->MAPK1, PRKCA->ABL1, MAPK1->CSNK2A1, MAPK1->SP3, MAPK1->HIF1A, ABL1->ATR, ABL1->TP73, CSNK2A1->FOSB, ATR->KMT2A, KMT2A->FOXC2, ATM->AKT1, ATM->HIPK2, HIPK2->MECP2, XPC->ATM, E2F1->XPC, E2F1->HIC1, EGFR->E2F1]" + ], + "text/plain": [ + ">" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "visualizer = nc.visual.NetworkXVisualizer(corneto_network)\n", + "visualizer.visualize_network(sources, measurements)" + ] + },
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And we're done! As you can see, the networks look quite different depending on the method used to contextualise the PKN. But how do we know which one is the most \"correct\"? In the next vignette, we will start exploring the last module in `Networkcommons` that we have not yet covered: **Evaluation**, where we will try out some strategies to evaluate the performance of the methods. See you soon!"
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "networkcommons-DX9y6Uxu-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}