Some GFF files contain features that cover most of a sequence track.
For example: GCF_000247795.1
In the corresponding GFF file (attached), there is a feature of type "match" that fully overlaps the first chromosome:
NC_032650.1 RefSeq region 1 161108492 . + . ID=NC_032650.1:1..161108492;Dbxref=taxon:9915;Name=1;breed=Nelore;chromosome=1;country=Brazil;gb-synonym=Bos taurus indicus;gbkey=Src;genome=chromosome;isolate=QUIL7308;mol_type=genomic DNA;note=animal owned by Agropecuaria Quilombo Inc.;sex=male;tissue-type=peripheral blood mononuclear cells
Line 37235:
NC_032650.1 RefSeq match 1 161108492 . + . ID=aln0;Target=NC_032650.1 1 161108492 +;gap_count=0;num_mismatch=0;pct_coverage=100;pct_identity_gap=100
As a consequence, orfget cannot define any purely intergenic ORF:
NC_032650.1
ORF type                  Quantity   Average length (aa)
c_CDS                     7649       100.45
nc_ovp_opp-CDS            19987      58.68
nc_ovp_opp-cDNA_match     201        39.65
nc_ovp_opp-match          1983772    46.8
nc_ovp_same-CDS           11740      52.03
nc_ovp_same-cDNA_match    713        39.64
nc_ovp_same-lnc_RNA       15831      42.05
nc_ovp_same-mRNA          439133     44.33
nc_ovp_same-match         2449854    46.35
nc_ovp_same-pseudogene    10750      48.33
nc_ovp_same-tRNA          16         68.0
nc_ovp_same-transcript    281        65.47
Would it be possible, as a preliminary step in orftrack, to exclude features that cover more than, say, 90% of their sequence, to avoid this behavior?
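To make the idea concrete, here is a rough sketch of what such a pre-filter could look like. It is not based on the actual orftrack code: the function names `exceeds_coverage` and `filter_gff_lines`, the `max_fraction` parameter, and the `seq_lengths` dictionary (sequence ID to length, e.g. taken from the FASTA file) are all hypothetical.

```python
def exceeds_coverage(start, end, seq_len, max_fraction=0.9):
    """Return True if the 1-based inclusive span [start, end] covers
    more than max_fraction of a sequence of length seq_len."""
    return (end - start + 1) / seq_len > max_fraction


def filter_gff_lines(lines, seq_lengths, max_fraction=0.9):
    """Yield GFF feature lines whose span does not exceed max_fraction
    of their sequence. Comment, blank, and malformed lines are skipped.

    seq_lengths is a hypothetical dict {seqid: length}; features on
    sequences of unknown length are kept unchanged.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a valid 9-column GFF feature line
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        seq_len = seq_lengths.get(seqid)
        if seq_len and exceeds_coverage(start, end, seq_len, max_fraction):
            continue  # feature spans (almost) the whole sequence: drop it
        yield line
```

One caveat with a pure coverage threshold: on a short scaffold, a genuine gene could cover more than 90% of the sequence and would be dropped too, so the threshold may need to be combined with a check on the feature type.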
Meanwhile, since the only 6 genomes with this error that I have identified so far all contain a 'match' feature, I suggest simply adding 'match' to line 597 of gff_parser.py:
if element_type not in ['chromosome', 'region', 'match']: