Builds a set of reference mRNAs: the major isoforms of non-redundant transcripts that are not NMD candidates and have no invalid ORF.
python3 mRNA-filter.py --refflat REFFLAT --fasta-directory FA_DIR --outfile OUT_FILE
- NCBI RefFlat file, downloaded from UCSC Table Browser
- Directory containing the genome assembly sequence as FASTA files, one file per chromosome, downloaded from the UCSC Genome Browser
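
As a rough illustration of the per-chromosome FASTA input, the sketch below reads one chromosome into a single string. The function name `read_chromosome` and the `<chrom>.fa` file naming (for example `chr1.fa`) are assumptions made for this example; they are not taken from mRNA-filter.py, which may load sequences differently.

```python
import os


def read_chromosome(fasta_dir, chrom):
    """Return the sequence of one chromosome as a single string.

    Assumes (hypothetically) that the file is named '<chrom>.fa' and
    holds exactly one FASTA record.
    """
    path = os.path.join(fasta_dir, chrom + ".fa")
    pieces = []
    with open(path) as handle:
        for line in handle:
            # Skip the '>' header line; keep sequence lines only.
            if not line.startswith(">"):
                pieces.append(line.strip())
    return "".join(pieces)
```

Reading one chromosome at a time keeps memory use bounded by the largest chromosome instead of the whole assembly.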
Tab-separated file in RefFlat format, extended with four extra columns: tx_Size, 5'UTR_Size, ORF_Size, 3'UTR_Size
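
The four size columns can be derived from the standard RefFlat fields (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds). The following is a minimal sketch of one way to compute them; the function name `region_sizes` is hypothetical, and the actual calculation in mRNA-filter.py is not shown here and may differ.

```python
def region_sizes(refflat_line):
    """Return (tx_size, utr5_size, orf_size, utr3_size) for one RefFlat record."""
    fields = refflat_line.rstrip("\n").split("\t")
    strand = fields[3]
    cds_start, cds_end = int(fields[6]), int(fields[7])
    # exonStarts / exonEnds are comma-separated and end with a trailing comma.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]

    tx_size = left = orf = right = 0
    for start, end in zip(starts, ends):
        tx_size += end - start
        # Exonic bases on the genomic left of the CDS.
        left += max(0, min(end, cds_start) - start)
        # Exonic bases inside the CDS.
        orf += max(0, min(end, cds_end) - max(start, cds_start))
        # Exonic bases on the genomic right of the CDS.
        right += max(0, end - max(start, cds_end))

    # On the minus strand the genomic right side is the 5' end.
    utr5, utr3 = (left, right) if strand == "+" else (right, left)
    return tx_size, utr5, orf, utr3
```

The coordinates are treated as 0-based, half-open intervals, as in UCSC tables, so each exon contributes `end - start` bases and the three regions always sum to tx_Size.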
- Python3
- Python package tqdm
mRNA-filter is an ongoing project. This repository may be edited or removed without notice. It has only been tested with hg38 RefSeq.