Commit

update README.md
sky1ove committed Feb 23, 2024
1 parent d9daef1 commit d4c738f
Showing 1 changed file with 37 additions and 70 deletions.
107 changes: 37 additions & 70 deletions README.md
@@ -1,9 +1,38 @@
# Katlas
# KATLAS


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<img alt="Katlas logo" width="500" caption="Katlas logo" src="dataset/images/logo.png" id="logo"/>
<img alt="Katlas logo" width="1600" caption="Katlas logo" src="dataset/images/logo.png" id="logo"/>

KATLAS is a repository containing Python tools to predict kinases given
a substrate sequence. It also contains data: phosphorylation sites
and human phosphoproteomics.

***References***: Please cite the appropriate papers if KATLAS is
helpful to your research.

- KATLAS was initially described in the paper \[Decoding Human Kinome
Specificities through a Computational Data-Driven Approach
(manuscript)\]

- The position-specific scoring matrices (PSSMs) of the human kinome derived
from positional scanning peptide arrays (PSPA) are based on the paper
[An atlas of substrate specificities for the human serine/threonine
kinome](https://www.nature.com/articles/s41586-022-05575-3)

- The kinase substrate datasets used for generating PSSMs are derived
from
[PhosphoSitePlus](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245126/)
and the paper [Large-scale Discovery of Substrates of the Human
Kinome](https://www.nature.com/articles/s41598-019-46385-4)

- Phosphorylation sites are acquired from
[PhosphoSitePlus](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245126/),
the paper [The functional landscape of the human
phosphoproteome](https://www.nature.com/articles/s41587-019-0344-3),
and [CPTAC](https://pdc.cancer.gov/pdc/cptac-pancancer) /
[LinkedOmics](https://academic.oup.com/nar/article/46/D1/D956/4607804)
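
As a rough illustration of the PSSM idea mentioned above, the sketch below scores a 15-residue site against a toy matrix by summing per-position weights. Everything here is hypothetical: `score_site`, `toy_pssm`, and the weights are invented for illustration and are not part of KATLAS, and the additive scoring is only one common way to apply a PSSM, not necessarily the one KATLAS uses.

```python
from typing import Dict

# Hypothetical toy PSSM: offset from the central residue -> amino acid -> weight.
# The weights are made up for illustration and do not come from KATLAS.
ToyPSSM = Dict[int, Dict[str, float]]

toy_pssm: ToyPSSM = {
    -3: {"R": 1.0},
    -2: {"R": 0.5},
    0: {"S": 0.25, "T": 0.125},
    1: {"P": -0.5},
}


def score_site(site_seq: str, pssm: ToyPSSM, default: float = 0.0) -> float:
    """Sum the weight of each residue at its offset from the central position."""
    center = len(site_seq) // 2  # index 7 for a 15-mer
    total = 0.0
    for index, residue in enumerate(site_seq):
        offset = index - center
        # Positions or residues missing from the toy matrix contribute `default`.
        total += pssm.get(offset, {}).get(residue, default)
    return total


print(score_site("AAAARRASPAAAAAA", toy_pssm))  # 1.25
```

A real matrix would cover every position and residue; the sparse dictionary simply keeps the example short.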

## Install

@@ -52,21 +81,6 @@ df = Data.get_ochoa_site().head()
df.iloc[:,-2:]
```

| | site_seq | gene_site |
|-----|-----------------|----------------|
| 0 | VDDEKGDSNDDYDSA | A0A075B6Q4_S24 |
@@ -75,9 +89,7 @@ df.iloc[:,-2:]
| 3 | KSRFTEYSMTSSVMR | A0A075B6Q4_S68 |
| 4 | FTEYSMTSSVMRRNE | A0A075B6Q4_S71 |
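
In the rows shown above, each `gene_site` value appears to follow an `<identifier>_<residue><position>` pattern, and the residue sits at the middle of the 15-residue `site_seq`. The sketch below checks that reading on the three visible rows. The helper names (`parse_gene_site`, `center_matches`) are hypothetical and not part of KATLAS, and the pattern is an assumption inferred from these rows only.

```python
import re
from typing import NamedTuple


class Site(NamedTuple):
    identifier: str  # prefix before the underscore, e.g. "A0A075B6Q4"
    residue: str     # phospho-acceptor residue: S, T, or Y
    position: int    # 1-based position in the protein


# Assumed pattern, inferred from the rows shown in the table.
_GENE_SITE = re.compile(r"^(?P<identifier>.+)_(?P<residue>[STY])(?P<position>\d+)$")


def parse_gene_site(gene_site: str) -> Site:
    """Split a value such as 'A0A075B6Q4_S24' into its three parts."""
    match = _GENE_SITE.match(gene_site)
    if match is None:
        raise ValueError(f"unexpected gene_site format: {gene_site!r}")
    return Site(match["identifier"], match["residue"], int(match["position"]))


def center_matches(site_seq: str, gene_site: str) -> bool:
    """Check that the middle residue of site_seq is the annotated residue."""
    center = site_seq[len(site_seq) // 2]
    return center.upper() == parse_gene_site(gene_site).residue


rows = [
    ("VDDEKGDSNDDYDSA", "A0A075B6Q4_S24"),
    ("KSRFTEYSMTSSVMR", "A0A075B6Q4_S68"),
    ("FTEYSMTSSVMRRNE", "A0A075B6Q4_S71"),
]
print(all(center_matches(seq, site) for seq, site in rows))  # True
```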

``` python
results = predict_kinase_df(df,'site_seq',**param4)
@@ -91,20 +103,7 @@ results

100%|██████████| 289/289 [00:00<00:00, 6727.01it/s]

| kinase | SRC | EPHA3 | FES | NTRK3 | ALK | EPHA8 | ABL1 | FLT3 | EPHB2 | FYN | ... | MEK5 | PKN2 | MAP2K7 | MRCKB | HIPK3 | CDK8 | BUB1 | MEKK3 | MAP2K3 | GRK1 |
|--------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|-----|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
@@ -114,10 +113,8 @@ results
| 3 | 0.803826 | 0.836527 | 0.800759 | 0.894570 | 0.839905 | 0.781001 | 0.847847 | 0.807040 | 0.805877 | 0.801402 | ... | 1.110307 | 1.703637 | 1.795092 | 1.469653 | 1.549936 | 1.491344 | 1.446922 | 1.055452 | 1.534895 | 1.741090 |
| 4 | 0.822793 | 0.796532 | 0.792343 | 0.839882 | 0.810122 | 0.781420 | 0.805251 | 0.795022 | 0.790380 | 0.864538 | ... | 1.062617 | 1.357689 | 1.485945 | 1.249266 | 1.456078 | 1.422782 | 1.376471 | 1.089629 | 1.121309 | 1.697524 |

5 rows × 289 columns
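
With 289 score columns per site, a typical next step is to rank kinases for each row. The sketch below does this for a handful of the row 3 values shown above, using only the standard library. It assumes that a higher score means a stronger predicted match, which this README excerpt does not state, and `top_kinases` is a hypothetical helper, not a KATLAS function.

```python
from typing import Dict, List, Tuple


def top_kinases(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (kinase, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# A subset of the row 3 scores from the table above (the full row has 289 columns).
row3 = {
    "SRC": 0.803826,
    "NTRK3": 0.894570,
    "MEK5": 1.110307,
    "PKN2": 1.703637,
    "MAP2K7": 1.795092,
    "CDK8": 1.491344,
    "GRK1": 1.741090,
}

print([name for name, _ in top_kinases(row3, k=3)])  # ['MAP2K7', 'GRK1', 'PKN2']
```

If `results` is a pandas DataFrame as in the README, `results.iloc[3].nlargest(3)` gives the same kind of ranking over all columns.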

### Input sequences are in phosphorylated status

@@ -157,20 +154,7 @@ results

100%|██████████| 289/289 [00:00<00:00, 6732.43it/s]

| kinase | SRC | EPHA3 | FES | NTRK3 | ALK | EPHA8 | ABL1 | FLT3 | EPHB2 | FYN | ... | MEK5 | PKN2 | MAP2K7 | MRCKB | HIPK3 | CDK8 | BUB1 | MEKK3 | MAP2K3 | GRK1 |
|--------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|-----|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
@@ -180,10 +164,8 @@ results
| 3 | 0.646383 | 0.701740 | 0.676063 | 0.782448 | 0.729290 | 0.648563 | 0.697185 | 0.678297 | 0.687545 | 0.651732 | ... | 1.030239 | 1.558569 | 1.619387 | 1.321234 | 1.394759 | 1.366344 | 1.306156 | 0.971825 | 1.384048 | 1.692299 |
| 4 | 0.695749 | 0.706868 | 0.697352 | 0.766069 | 0.739252 | 0.682100 | 0.703865 | 0.712811 | 0.703658 | 0.744865 | ... | 1.039890 | 1.324106 | 1.344163 | 1.215167 | 1.363295 | 1.400055 | 1.283447 | 1.042570 | 1.098709 | 1.661380 |

5 rows × 289 columns

### To replicate the results from PSPA

@@ -226,20 +208,7 @@ results

100%|██████████| 303/303 [00:00<00:00, 6683.64it/s]

| kinase | AAK1 | ACVR2A | ACVR2B | AKT1 | AKT2 | AKT3 | ALK2 | ALK4 | ALPHAK3 | AMPKA1 | ... | VRK1 | VRK2 | WNK1 | WNK3 | WNK4 | YANK2 | YANK3 | YSK1 | YSK4 | ZAK |
|--------|------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
@@ -249,10 +218,8 @@ results
| 3 | -4.849113 | 2.271636 | 2.057240 | -2.886034 | -2.379830 | -3.634907 | 1.547144 | 2.735341 | -2.825794 | -1.696886 | ... | -2.757951 | -1.699232 | -1.725384 | -0.091196 | -0.672545 | 0.313278 | -0.207212 | -2.315848 | -0.053572 | -1.117657 |
| 4 | -6.596842 | -1.387696 | -0.956218 | -2.834231 | -3.794276 | -4.968521 | -1.862002 | -1.717226 | -2.653170 | -3.514512 | ... | -1.546328 | -1.457323 | -1.277532 | 0.510635 | -1.045845 | -0.314193 | -1.023331 | -2.482345 | -2.227114 | -1.592725 |

5 rows × 303 columns

## Datasets of phosphorylation sites
