Update dev with changes in main #63

Merged
merged 5 commits into from
Nov 20, 2024
14 changes: 14 additions & 0 deletions docs/src/api.rst
@@ -195,6 +195,20 @@ NCI60
data.omics.nci60_datatypes
data.omics.nci60_table

.. _api-phosphoegf:

Phospho-EGF meta-analysis
~~~~~~~~~~~~~~~~~~~~~~~~~
.. module:: networkcommons.data.omics
.. currentmodule:: networkcommons

.. autosummary::
    :toctree: api
    :recursive:

    data.omics.phospho_egf_datatypes
    data.omics.phospho_egf_tables

.. _api-eval:

Evaluation and description
19 changes: 19 additions & 0 deletions docs/src/datasets.rst
@@ -92,6 +92,25 @@ NCI60

.. _details-phosphoegf:


Phosphoproteomics in response to EGF
------------------------------------

**Alias:** PhosphoEGF

**Description:** A meta-analysis of phosphoproteomics data in response to EGF stimulation

**Publication Link:** `Garrido-Rodriguez et al. Evaluating signaling pathway inference from kinase-substrate interactions and phosphoproteomics data. bioRxiv (2024). <https://www.biorxiv.org/content/10.1101/2024.10.21.619348v1>`_

**Data location:** `Supplementary Data files of the manuscript <https://www.biorxiv.org/content/10.1101/2024.10.21.619348v1.supplementary-material>`_

**Detailed Description:** This dataset contains the results of a meta-analysis of phosphoproteomics data in response to EGF stimulation across different labs and stimulation times. The data is available at two levels. First, the phosphosite differential abundance is provided for every combination of study and treatment time. In the table, 'This study' refers to the data generated in the manuscript. Second, we offer access to the kinase-level activities inferred using decoupleR and the different kinase-substrate networks described in the paper. Briefly, four networks were employed: one based on literature (literature), one based on kinase-substrate interaction prediction via protein language models (phosformer), one based on positional peptide array screening (kinlibrary), and a combination of all of them (combined).

**Functions:** See API documentation for :ref:`Phospho-EGF meta-analysis<api-phosphoegf>`.

.. _details-pk:


---------------
Prior Knowledge
---------------
6 changes: 6 additions & 0 deletions networkcommons/data/datasets.yaml
@@ -47,3 +47,9 @@ omics:
It includes three files: TF activities from transcriptomics data,
metabolite abundances and gene reads.
path: NCI60/{cell_line}/{cell_line}__{data_type}.tsv
phosphoegf:
name: PhosphoEGF
description: Phosphoproteomics meta-analysis of the response to EGF stimulus
publication_link: https://www.biorxiv.org/content/10.1101/2024.10.21.619348v1
detailed_description: >-
This dataset contains phosphoproteomics data after EGF stimulus gathered and preprocessed from multiple studies.
1 change: 1 addition & 0 deletions networkcommons/data/omics/__init__.py
@@ -24,3 +24,4 @@
from ._scperturb import *
from ._nci60 import *
from ._cptac import *
from ._phosphoegf import *
58 changes: 58 additions & 0 deletions networkcommons/data/omics/_phosphoegf.py
@@ -0,0 +1,58 @@
#!/usr/bin/env python

#
# This file is part of the `networkcommons` Python module
#
# Copyright 2024
# Heidelberg University Hospital
#
# File author(s): Saez Lab ([email protected])
#
# Distributed under the GPLv3 license
# See the file `LICENSE` or read a copy at
# https://www.gnu.org/licenses/gpl-3.0.txt
#

"""
Meta-analysis of phosphoproteomics response to EGF stimulus.
"""

import pandas as pd
import warnings

def phospho_egf_datatypes() -> pd.DataFrame:
    """
    Table describing the available data types in the Phospho EGF dataset.

    Returns:
        DataFrame with all data types.
    """

    return pd.DataFrame({
        'type': ['phosphosite', 'kinase'],
        'description': [
            'Differential phosphoproteomics at the site level for all studies in the meta-analysis',
            'Kinase activities obtained using each of the kinase-substrate prior knowledge resources',
        ],
    })


def phospho_egf_tables(type='phosphosite'):
    """
    A table with the corresponding data type for the Phospho EGF dataset.

    Args:
        type:
            Either 'phosphosite' or 'kinase', as listed by
            `phospho_egf_datatypes`.

    Returns:
        A DataFrame with the corresponding data, or None (after a warning)
        if the data type is unknown.
    """

    # The default and the documented values now match the branches below;
    # the previous default ('diffabundance') always fell through to the warning.
    if type == 'phosphosite':
        out_table = pd.read_csv('https://www.biorxiv.org/content/biorxiv/early/2024/10/22/2024.10.21.619348/DC3/embed/media-3.gz', compression='gzip', low_memory=False)
    elif type == 'kinase':
        out_table = pd.read_csv('https://www.biorxiv.org/content/biorxiv/early/2024/10/22/2024.10.21.619348/DC4/embed/media-4.gz', compression='gzip', low_memory=False)
    else:
        warnings.warn(f'Unknown data type "{type}"')
        return None
    return out_table
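
Review note: the branching in `phospho_egf_tables` could also be written as a lookup table, which keeps the accepted data types and their URLs in one place. The following is only a minimal sketch of that idea, not the module's actual code. The names `_PHOSPHO_EGF_URLS`, `phospho_egf_url` and `load_phospho_egf` are hypothetical; the URLs and the `pd.read_csv` arguments are copied from the diff above. pandas is imported lazily so that the URL lookup can be checked without pandas or network access.

```python
import warnings
from typing import Optional

# Hypothetical lookup table; keys match the 'type' column returned by
# phospho_egf_datatypes(), URLs are copied from the diff above.
_PHOSPHO_EGF_URLS = {
    'phosphosite': (
        'https://www.biorxiv.org/content/biorxiv/early/2024/10/22/'
        '2024.10.21.619348/DC3/embed/media-3.gz'
    ),
    'kinase': (
        'https://www.biorxiv.org/content/biorxiv/early/2024/10/22/'
        '2024.10.21.619348/DC4/embed/media-4.gz'
    ),
}


def phospho_egf_url(data_type: str) -> Optional[str]:
    """Return the download URL for a data type, or None after a warning."""
    url = _PHOSPHO_EGF_URLS.get(data_type)
    if url is None:
        valid = ', '.join(sorted(_PHOSPHO_EGF_URLS))
        warnings.warn(f'Unknown data type "{data_type}"; expected one of: {valid}')
    return url


def load_phospho_egf(data_type: str = 'phosphosite'):
    """Download the table for a data type (hypothetical helper)."""
    import pandas as pd  # lazy import: only needed for the actual download

    url = phospho_egf_url(data_type)
    if url is None:
        return None
    return pd.read_csv(url, compression='gzip', low_memory=False)
```

With this layout, adding a further data type would only need a new dictionary entry, and the warning can list the valid values for the caller.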
