Update readme #140

Merged: 3 commits, Oct 24, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ Try to use the following format:
- Fixed wrong models when chromosome X was named `chrX` and not `X`
- Added GitHub Actions workflows for automatic publishing to PyPI on release, and keep a changelog reminder ([#136](https://github.com/Clinical-Genomics/genmod/pull/136))
- Optional user defined threshold and penalty for compound scoring
- Update README with current github.io docs page

## [3.8.3]

2 changes: 1 addition & 1 deletion CITATION.cff
@@ -13,4 +13,4 @@ identifiers:
type: doi
value: "10.5281/zenodo.3841142"
license: MIT Licence
repository-code: "https://github.com/moonso/genmod"
repository-code: "https://github.com/Clinical-Genomics/genmod"
90 changes: 23 additions & 67 deletions README.md
@@ -3,7 +3,7 @@


[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3841142.svg)](https://doi.org/10.5281/zenodo.3841142)
[![Build Status](https://travis-ci.org/moonso/vcf_parser.svg)](https://travis-ci.org/moonso/genmod)
![Build Status - GitHub][actions-build-status]


**GENMOD** is a simple to use command line tool for annotating and analyzing genomic variations in the [VCF](http://samtools.github.io/hts-specs/VCFv4.1.pdf) file format.
@@ -17,26 +17,24 @@ The tools in the genmod suite are:
- **genmod score**, Score the variants of a vcf based on their annotation
- **genmod filter**, Filter the variants of a vcf based on their annotation

##Installation:##
## Installation

**GENMOD**

pip install genmod

or

git clone https://github.com/moonso/genmod.git
git clone https://github.com/Clinical-Genomics/genmod.git
cd genmod
python setup.py install


## USAGE: ##
## Usage

<!-- TODO change documentation link -->
*This is an overview, for more in depth documentation see [documentation](http://moonso.github.io/genmod/)*
*This is an overview, for more in depth documentation see [documentation](https://Clinical-Genomics.github.io/genmod)*


### Example: ###
### Example


The following command should work when installed successfully. The files are distributed with the package.
@@ -139,30 +137,25 @@ $genmod models <vcf_file> -f/--family_file <family.ped>

```

<!-- ###genmod annotate###

#### genmod annotate

```
genmod annotate variant_file.vcf
```

This will print a new vcf to standard out with all variants annotated according to the statements below.
All individuals described in the [ped](http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped) file must be present in the vcf file

See examples in the folder ```genmod/examples```.

**From version 1.9 genmod can split multiallelic calls in vcf:s, use flag -split/--split_variants.**
**From version 1.9 genmod can split multiallelic calls in VCFs: use flag `-split/--split_variants`.**

To get an example of how splitting variants works, run genmod on the file ```examples/multi_allele_example.vcf``` with the dominant trio.
That is:
```genmod annotate examples/multi_allele_example.vcf -f examples/dominant_trio.ped -split```

Compare the result when not using the ```-split``` flag.

Genmod is distributed with a annotation database that is built from the refGene data.
If the user wants to build a new annotation set use the command below:

genmod build_annotation [--type] annotation_file


Each variant in the VCF-file will be annotated with the genetic models that are followed in the family if a family file
(ped file) is provided.

@@ -184,7 +177,7 @@ It is possible to run without a family file, in this case all variants will be a

[Variant Effect Predictor](http://www.ensembl.org/info/docs/tools/vep/index.html)(vep) annotations are supported, use the ```--vep```-flag if variants are already annotated with vep.

**GENMOD** will add entrys to the INFO column for the given VCF file depending on what information is given.
**GENMOD** will add entries to the INFO column for the given VCF file depending on what information is given.

If ```--vep``` is NOT provided:

@@ -222,36 +215,7 @@ All annotations will be present only if they have a value.
- If you want canonical splice site region to be bigger than 2 base pairs on each side of the exons, use `-splice/--splice_padding <integer>`
- The `-strict/--strict` flag tells **genmod** to only annotate genetic models if they are proved by the data. If a variant is not called in a family member it will not be annotated.


###genmod build_annotation###

genmod build_annotation [--type] [-o/--outdir] annotation_file

The following file formats are supported for building new annotations:

- bed
- ccds
- gtf
- gene_pred

The user can also specify the amount of positions around exon boundaries that should be considered as splice sites. Use

```--splice_padding INTEGER```

###genmod analyze###

From version 1.6 there is also a tool for analyzing the variants annotated by **genmod**. This tool will look at all variants in a vcf and do an analysis based on which inheritance patterns they follow. The variants are then ranked based on the cadd scores, the highest ranked variants for each category is printed to screen and the full list for each category is printed to new vcf files.
Run with:

genmod analyze path/to/file.vcf

For more information do

genmod analyze --help


### genmod sort ###

#### genmod sort

Sort a VCF file based on Rank Score.

@@ -269,18 +233,9 @@ Options:
--help Show this message and exit.
```

###genmod summarize###

Tool to get basic statistics of the annotated in a vcf file.
Run

genmod summarize --help
## Conditions for Genetic Models

for more information.

## Conditions for Genetic Models ##

### Short explanation of genotype calls in VCF format:###
### Short explanation of genotype calls in VCF format

Since we only look at humans, who are diploid, the genotypes represent what we see on both alleles in a single position.
0 represents the reference sequence, 1 is the first of the alternative alleles, 2 second alternative and so on.
@@ -290,8 +245,7 @@ Some chromosomes are only present in one copy in humans, here it is allowed to o

If phasing has been done the pairs are not unordered anymore and the delimiter is then changed to '|', so one can be heterozygote in two ways; 0|1 or 1|0.
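
The genotype notation described above can be illustrated with a small Python sketch. It is not taken from the genmod source; the function name `parse_genotype` is hypothetical, and the sketch assumes a well-formed `GT` string that uses a single delimiter.

```python
from typing import List, Optional, Tuple


def parse_genotype(gt: str) -> Tuple[List[Optional[int]], bool]:
    """Split a VCF GT string into allele indices and a phased flag.

    Hypothetical helper for illustration only (not part of genmod).
    0 is the reference allele, 1 the first alternative, and so on.
    An uncalled allele ('.') is returned as None.
    """
    phased = "|" in gt                 # '|' means the pair is ordered
    delimiter = "|" if phased else "/"
    alleles = [None if a == "." else int(a) for a in gt.split(delimiter)]
    return alleles, phased


print(parse_genotype("0/1"))   # unphased heterozygote
print(parse_genotype("1|0"))   # phased heterozygote, alternative allele first
print(parse_genotype("1"))     # single allele, e.g. a one-copy chromosome
```

A call such as `./.` yields two `None` alleles, which makes it easy to tell uncalled individuals apart from reference calls.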


### Autosomal Recessive ###
### Autosomal Recessive

For this model individuals can be carriers so healthy individuals can be heterozygous. Both alleles need to have the variant for an individual to be sick so a healthy individual can not be homozygous alternative and a sick individual *has* to be homozygous alternative.
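
As a rough illustration of the rule in this paragraph, the sketch below checks whether a single biallelic variant is compatible with the recessive pattern. It is not genmod's implementation: the name `follows_recessive` is hypothetical, genotypes are assumed to be pairs of allele indices (0 or 1, `None` when uncalled), and skipping uncalled individuals is an assumption made here for simplicity.

```python
from typing import Dict, Optional, Set, Tuple

Genotype = Tuple[Optional[int], Optional[int]]


def follows_recessive(genotypes: Dict[str, Genotype], affected: Set[str]) -> bool:
    """Return True if the variant is compatible with autosomal recessive.

    Hypothetical sketch: affected individuals must be homozygous
    alternative, healthy individuals must not be.
    """
    for individual, gt in genotypes.items():
        if None in gt:
            continue  # assumption: an uncalled genotype does not rule the model out
        hom_alt = gt == (1, 1)
        if individual in affected and not hom_alt:
            return False
        if individual not in affected and hom_alt:
            return False
    return True


trio = {"father": (0, 1), "mother": (0, 1), "child": (1, 1)}
print(follows_recessive(trio, affected={"child"}))
```

Here both parents are healthy carriers and the affected child is homozygous alternative, so the pattern is followed.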

@@ -300,14 +254,13 @@ For this model individuals can be carriers so healthy individuals can be heteroz
* Variant is considered _de novo_ if both parents are genotyped and do not carry the variant


### Autosomal Dominant ###
### Autosomal Dominant

* Affected individuals have to be heterozygous (het.)
* Healthy individuals cannot have the alternative variant
* Variant is considered _de novo_ if both parents are genotyped and do not carry the variant
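
The three conditions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the genmod code: the names `follows_dominant` and `is_de_novo` are hypothetical, variants are assumed biallelic (alleles 0 or 1, `None` when uncalled), and uncalled individuals are simply skipped in the model check.

```python
from typing import Dict, Optional, Set, Tuple

Genotype = Tuple[Optional[int], Optional[int]]


def follows_dominant(genotypes: Dict[str, Genotype], affected: Set[str]) -> bool:
    """Affected individuals must be heterozygous; healthy ones carry no alternative allele."""
    for individual, gt in genotypes.items():
        if None in gt:
            continue  # assumption: uncalled genotypes are ignored
        heterozygous = sorted(gt) == [0, 1]
        carries_variant = 1 in gt
        if individual in affected and not heterozygous:
            return False
        if individual not in affected and carries_variant:
            return False
    return True


def is_de_novo(father: Genotype, mother: Genotype) -> bool:
    """De novo only if both parents are genotyped and neither carries the variant."""
    return all(None not in gt and 1 not in gt for gt in (father, mother))


trio = {"father": (0, 0), "mother": (0, 0), "child": (0, 1)}
print(follows_dominant(trio, affected={"child"}))
print(is_de_novo(trio["father"], trio["mother"]))
```

In this trio the affected child is heterozygous and both genotyped parents are homozygous reference, so the variant is compatible with the dominant model and counts as _de novo_.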


### Autosomal Compound Heterozygote ###
### Autosomal Compound Heterozygote

This model includes pairs of exonic variants that are present within the same gene.
**The default behaviour of GENMOD is to look for compounds only in exonic/canonical splice sites**.
@@ -326,7 +279,7 @@ If the user wants all variants in genes checked use the flag -gene/--whole_gene.
* If only one or no variant is found in parents it is considered _de novo_


### X-Linked Dominant###
### X-Linked Dominant

These traits are inherited on the x-chromosome, of which men have one allele and women have two.

@@ -338,7 +291,7 @@ These traits are inherited on the x-chromosome, of which men have one allele and
* If sex is female variant is considered _de novo_ if none of the parents carry the variant


### X Linked Recessive ###
### X Linked Recessive

* Variant has to be on chromosome X
* Affected males have to be het. or hom. alt. (het is theoretically not possible in males, but can occur due to Pseudo Autosomal Regions).
@@ -347,4 +300,7 @@ These traits are inherited on the x-chromosome, of which men have one allele and
* Healthy males cannot carry the variant
* If sex is male the variant is considered _de novo_ if mother is genotyped and does not carry the variant
* If sex is female variant is considered _de novo_ if not both parents carry the variant
-->



[actions-build-status]: https://github.com/Clinical-Genomics/genmod/actions/workflows/build_and_publish.yml/badge.svg
14 changes: 7 additions & 7 deletions docs/index.md
@@ -1,8 +1,8 @@
# Genmod #

<p align="center">
<a href="https://github.com/moonso/genmod">
<img src="https://github.com/moonso/genmod/raw/master/artwork/tree_man.JPG"/>
<a href="https://github.com/Clinical-Genomics/genmod">
<img src="https://github.com/Clinical-Genomics/genmod/raw/master/artwork/tree_man.JPG"/>
</a>
</p>

@@ -24,15 +24,15 @@ whole exome data and whole genome data.
## Installation ##

**GENMOD**

```bash
pip install genmod

```
or

git clone https://github.com/moonso/genmod.git
```bash
git clone https://github.com/Clinical-Genomics/genmod.git
cd genmod
python setup.py install

```


### Example: ###
18 changes: 9 additions & 9 deletions examples/readme.md
@@ -12,9 +12,9 @@ If the user want to build own annotations please use **genmod build_annotation**

##Annotate variants for Recessive Family##


```
genmod annotate examples/test_vcf.vcf -f examples/recessive_trio.ped -o examples/test_vcf_recessive_annotated.vcf

```
The VCF file has a couple of made-up variants so that it is easy to understand how the genetic inheritance patterns are annotated.

With the basic command listed above the output should look like the variants in ```examples/test_vcf_recessive_annotated.vcf```
@@ -30,24 +30,24 @@ The following variants are to show how the ``-strict`` flag affects the analysis


##Annotate variants for Dominant Family##

```
genmod annotate test_data/test_vcf.vcf -f test_data/dominant_trio.ped -o examples/test_vcf_dominant_annotated.vcf

```
We can now see how the conditions change when one of the parents is affected. For example, the recessive pattern for the first variant is not followed, since all affected individuals need to be homozygous alternative if the variant is to follow the Autosomal Recessive pattern.


##Annotate variants for Multiple Families##

```
genmod annotate test_data/test_vcf.vcf -f test_data/multi_family.ped -o examples/test_vcf_multi_annotated.vcf

```
We can now see how the conditions change when one of the parents is affected. For example, the recessive pattern for the first variant is not followed, since all affected individuals need to be homozygous alternative if the variant is to follow the Autosomal Recessive pattern.


##Annotate variants with CADD scores and population frequencies##

This is another example of how one can annotate with genmod:

```
genmod annotate examples/test_vcf.vcf --cadd_file examples/small_CADD.tsv.gz --thousand_g examples/small_1000G.vcf.gz
```


Please post issues on http://github.com/moonso/genmod if any problems.
Please post issues on http://github.com/Clinical-Genomics/genmod in case of any problems.
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -3,7 +3,7 @@ site_name: genmod docs
site_description: Project documentation with Markdown
site_author: Måns Magnusson

repo_url: https://github.com/moonso/genmod
repo_url: https://github.com/Clinical-Genomics/genmod

pages:
- Home: index.md
2 changes: 1 addition & 1 deletion setup.py
@@ -24,7 +24,7 @@
description='Annotate genetic inheritance models in variant files',
author = 'Mans Magnusson',
author_email = '[email protected]',
url = 'http://github.com/moonso/genmod',
url = 'http://github.com/Clinical-Genomics/genmod',
license = 'MIT License',
python_requires="~=3.8.0",
install_requires=[