The column names are not correct #16

Open
petrokvitka opened this issue Nov 14, 2018 · 0 comments

petrokvitka commented Nov 14, 2018

I recently tried gtfparse and ran into a problem. I am not sure whether I am using it incorrectly or whether there is a small bug in gtfparse. I would be thankful for any help!

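my code is:
from gtfparse import read_gtf

df = read_gtf(gtf_genes)
# combine both conditions into one boolean mask instead of chaining two filters
is_wanted_gene = (df["feature"] == "gene") & (df["gene_name"] == genes_of_interest[0])
df_genes = df[is_wanted_gene]
print(df_genes)
print(df_genes["seqname"], df_genes["source"], df_genes["feature"], df_genes["start"], df_genes["end"])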

and the output is:
Extracted GTF attributes: ['gene_id', 'gene_type', 'gene_name', 'level', 'tag', 'havana_gene']
seqname source feature start ... gene_name level tag havana_gene
0 chr14 HAVANA gene 19062316 ... DUXAP9 1 pseudo_consens OTTHUMG00000188246.3
[1 rows x 14 columns]
0 chr14
Name: seqname, dtype: object 0 HAVANA
Name: source, dtype: object 0 gene
Name: feature, dtype: object 0 19062316
Name: start, dtype: int64 0 19115270
Name: end, dtype: int64

while I would expect to see something like this:

Name: seqname, dtype: object 0 chr14
Name: source, dtype: object 0 HAVANA
Name: feature, dtype: object 0 gene
Name: start, dtype: int64 0 19062316
Name: end, dtype: int64 19115270

It seems to me that the column names are shifted to the right, and I don't understand how to handle it.
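
One possible explanation, offered as a guess rather than a confirmed diagnosis: each pandas Series prints as its values followed by a "Name: ..., dtype: ..." footer line, and a single print call with several arguments joins them with a space. The footer of one Series would then land on the same line as the first value of the next, which looks like shifted names even though the columns are fine. The sketch below reproduces that layout with a hypothetical one-row DataFrame built by hand from the values in the output above (it does not call gtfparse), and shows two ways to print the same data without the overlap.

```python
import pandas as pd

# Hypothetical one-row frame standing in for the filtered GTF result;
# the values are copied from the output shown in this issue.
df_genes = pd.DataFrame(
    {
        "seqname": ["chr14"],
        "source": ["HAVANA"],
        "feature": ["gene"],
        "start": [19062316],
        "end": [19115270],
    }
)

columns = ["seqname", "source", "feature", "start", "end"]

# print(a, b, ...) writes str(a), a space, str(b), and so on. Each Series
# string ends with a "Name: ..." footer line, so that footer shares a line
# with the first value of the next Series.
joined = " ".join(str(df_genes[c]) for c in columns)

# Putting each Series on its own lines keeps every value above its footer.
separate = "\n".join(str(df_genes[c]) for c in columns)

# Selecting the columns as a DataFrame prints one table with no footers.
table = df_genes[columns].to_string()

print(joined)
print(separate)
print(table)
```

If this is what is happening, passing sep="\n" to the original print call, or printing df_genes[["seqname", "source", "feature", "start", "end"]] as a single table, should give the expected layout.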
