Question regarding Hyb output #10

Open
dstrib opened this issue Dec 12, 2023 · 8 comments

dstrib commented Dec 12, 2023

Hi! I am creating a new reference database for Hyb for use with my project, and I have a question about hybrids that are reported with the original hOH7 database but are missing from the output with my new one.

For hOH7, the ".blast" output from the hyb pipeline contains the following records (and others):

405_1    ENSG00000055609_ENST00000262189_MLL3_mRNA   100.00  32  0   0   44  75  1174    1205    4.3e-09 60.2
405_1    MIMAT0000244_MirBase_miR-30c_microRNA   100.00  22  0   0   23  44  1   22  0.0016  41.7

Result:
405_1 AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA . MIMAT0000244_MirBase_miR-30c_microRNA 23 44 1 22 0.0016 ENSG00000055609_ENST00000262189_MLL3_mRNA 44 75 1174 1205 4.3e-09

For the updated database, the ".blast" output from the hyb pipeline contains the following records (and others):

405_1    ENSG00000055609_ENST00000262189_KMT2C_mRNA  100.00  32  0   0   44  75  1172    1203    1.2e-08 60.2
405_1    MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA    100.00  22  0   0   23  44  1   22  0.0044  41.7

Result: no hybrid 405_1 is included in the output.

To make sure the e-value threshold is not the problem, I set hval=100.0 when running hyb.
I would expect the second database to also yield a hybrid for this sequence, given that there are compatible blast results in both cases. Are there other selection criteria I am missing that would prevent a record from being output in the second case? If not, why should I not expect a hybrid here?

Thanks much!
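To pin down what actually changed between the two databases for this read, the four records above can be compared column by column. This is only a sketch: it assumes the ".blast" file uses the standard 12-column BLAST tabular layout, and the column names and the helpers `parse_hit` and `differing_fields` are illustrative, not part of hyb.

```python
# Column names assume the standard 12-column BLAST tabular layout.
FIELDS = ["read", "target", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hit(line):
    """Split one whitespace-delimited .blast record into a field dict."""
    return dict(zip(FIELDS, line.split()))


def differing_fields(a, b):
    """Return the names of the columns whose text differs between two records."""
    return [name for name in FIELDS if a[name] != b[name]]


# Records copied from the hOH7 run and from the updated-database run.
old_mrna = parse_hit("405_1 ENSG00000055609_ENST00000262189_MLL3_mRNA "
                     "100.00 32 0 0 44 75 1174 1205 4.3e-09 60.2")
new_mrna = parse_hit("405_1 ENSG00000055609_ENST00000262189_KMT2C_mRNA "
                     "100.00 32 0 0 44 75 1172 1203 1.2e-08 60.2")
old_mirna = parse_hit("405_1 MIMAT0000244_MirBase_miR-30c_microRNA "
                      "100.00 22 0 0 23 44 1 22 0.0016 41.7")
new_mirna = parse_hit("405_1 MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA "
                      "100.00 22 0 0 23 44 1 22 0.0044 41.7")

print("mRNA hit: ", differing_fields(old_mrna, new_mrna))
print("miRNA hit:", differing_fields(old_mirna, new_mirna))
```

For both hits the read coordinates (qstart, qend) and the bit score are unchanged; only the target name, the e-value and, for the mRNA, the target coordinates differ.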

gkudla (Owner) commented Dec 12, 2023 via email

dstrib (Author) commented Dec 12, 2023

Thanks very much for the reply, that would make sense.
I presume the alignment quality is ranked by e-value. Scanning the other blast results for the second database, none has an e-value better than 1.2e-08, and all those with an equivalent score have a SAM CIGAR string of 43S32M for the alignment (and there aren't any better alignments).
So as far as I can tell, there isn't a better alignment that is "overriding" the potential hybrid. Do you have any other thoughts?
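As a cross-check between the SAM and blast views, a CIGAR string can be converted into 1-based read coordinates. The helper below is a hypothetical sketch, not hyb code; it assumes forward-strand alignments and treats only a leading soft clip (S) as an offset, which is all that appears in the records in this thread.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume read bases inside the aligned region.
READ_CONSUMING = set("MI=X")


def cigar_to_read_interval(cigar):
    """Return (start, end) of the aligned part of the read, 1-based inclusive.

    Assumes a forward-strand alignment; leading soft clipping shifts the start.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    lead_clip = ops[0][0] if ops and ops[0][1] == "S" else 0
    aligned = sum(n for n, op in ops if op in READ_CONSUMING)
    return lead_clip + 1, lead_clip + aligned


for cigar in ("43S32M", "22S22M31S", "22M53S"):
    print(cigar, cigar_to_read_interval(cigar))
```

With this convention 43S32M corresponds to read positions 44 to 75, which matches the qstart/qend columns of the 1.2e-08 blast hits.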

gkudla (Owner) commented Dec 12, 2023 via email

dstrib (Author) commented Dec 12, 2023

There is an additional, distant alignment to an mRNA that has the same score as the microRNA alignment and does not overlap it:

(CIGAR: 22M53S) 405_1    ENSG00000184992_ENST00000341446_BRI3BP_mRNA 100.00  22  0   0   1   22  5012    5033    0.0044  41.7
(CIGAR: 22S22M31S) 405_1    MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA    100.00  22  0   0   23  44  1   22  0.0044  41.7

However, given that this was run in "mim" mode, I wouldn't expect hyb to prefer the former. Is that correct?
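To make the geometry of the three candidate segments explicit, the sketch below computes the overlap and the gap, in read bases, for each pair. The intervals are the qstart/qend values from the blast records in this thread; the function name is illustrative, and nothing here asserts which pair hyb actually selects.

```python
from itertools import combinations

# Read intervals (1-based, inclusive) taken from the blast records above.
segments = {
    "BRI3BP_mRNA": (1, 22),
    "hsa-miR-30c-5p": (23, 44),
    "KMT2C_mRNA": (44, 75),
}


def overlap_and_gap(a, b):
    """Return (overlap, gap) in read bases between two inclusive intervals."""
    first, second = sorted([a, b])
    overlap = max(0, min(first[1], second[1]) - second[0] + 1)
    gap = max(0, second[0] - first[1] - 1)
    return overlap, gap


for (name_a, a), (name_b, b) in combinations(segments.items(), 2):
    overlap, gap = overlap_and_gap(a, b)
    print(f"{name_a} + {name_b}: overlap={overlap}, gap={gap}")
```

The microRNA and KMT2C segments share one read base (position 44), while the BRI3BP and KMT2C segments leave 21 read bases (positions 23 to 43) uncovered between them.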

dstrib (Author) commented Dec 12, 2023

Here are all relevant sam/blast entries:

file.sam:405_1  0   ENSG00000055609_ENST00000679882_KMT2C_mRNA  1105    0   43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000682283_KMT2C_mRNA  955 255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000684550_KMT2C_mRNA  1315    255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000683616_KMT2C_mRNA  1017    255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000290523_ENST00000470054_UNKGENE_lncRNA  430 255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000681082_KMT2C_mRNA  1175    255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000262189_KMT2C_mRNA  1172    255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000682916_KMT2C_mRNA  106 255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000055609_ENST00000683490_KMT2C_mRNA  1172    255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000187172_ENST00000496773_BAGE2_transcribed-unprocessed-pseudogene    106 255 43S32M  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:32 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:32 YT:Z:UU
file.sam:405_1  256 ENSG00000184992_ENST00000341446_BRI3BP_mRNA 5012    255 22M53S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:22 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:22 YT:Z:UU
file.sam:405_1  256 MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA    1   255 22S22M31S   *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:22 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:22 YT:Z:UU
file.sam:405_1  256 ENSG00000197536_ENST00000337752_IRF1-AS1_lncRNA 2077    255 20M55S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000165813_ENST00000369287_CCDC186_mRNA    7089    255 6S20M49S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000286449_ENST00000668926_UNKGENE_lncRNA  3274    255 20M55S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000165813_ENST00000648613_CCDC186_mRNA    7243    255 6S20M49S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000167554_ENST00000601151_ZNF610_mRNA 2497    255 2S20M53S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000153930_ENST00000682825_ANKFN1_mRNA 8550    255 20M55S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000167554_ENST00000403906_ZNF610_mRNA 2695    255 2S20M53S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:20 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:20 YT:Z:UU
file.sam:405_1  256 ENSG00000253897_ENST00000517788_UNKGENE_transcribed-unprocessed-pseudogene  58019   255 6S19M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:19 YT:Z:UU
file.sam:405_1  256 ENSG00000231752_ENST00000458200_EMBP1_transcribed-unprocessed-pseudogene    35362   255 6S19M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:19 YT:Z:UU
file.sam:405_1  256 ENSG00000186591_ENST00000649897_UBE2H_mRNA  1531    255 6S19M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:19 YT:Z:UU
file.sam:405_1  256 ENSG00000269821_ENST00000597346_KCNQ1OT1_lncRNA 72241   255 22M53S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:4T17   YT:Z:UU
file.sam:405_1  256 ENSG00000231752_ENST00000458200_EMBP1_transcribed-unprocessed-pseudogene    36333   255 6S19M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:19 YT:Z:UU
file.sam:405_1  256 ENSG00000186591_ENST00000355621_UBE2H_mRNA  1878    255 6S19M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:19 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:19 YT:Z:UU
file.sam:405_1  256 ENSG00000276521_ENST00000615334_UNKGENE_unprocessed-pseudogene  777 255 1S21M53S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:3G17   YT:Z:UU
file.sam:405_1  256 ENSG00000240438_ENST00000447585_OFD1P5Y_unprocessed-pseudogene  12245   255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000242153_ENST00000451061_OFD1P6Y_transcribed-unprocessed-pseudogene  29535   255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000271519_ENST00000603371_UNKGENE_processed-pseudogene    4271    255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000233321_ENST00000685074_LINC02669_lncRNA    1632    255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 MIMAT0000420_miRBase_hsa-miR-30b-5p_microRNA    1   255 22S18M35S   *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000106771_ENST00000374586_TMEM245_mRNA    6821    255 18M57S  *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000233963_ENST00000426792_ATP8A2P3_unprocessed-pseudogene 11220   255 7S18M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000006432_ENST00000554752_MAP3K9_mRNA 8182    255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000175387_ENST00000262160_SMAD2_mRNA  12609   255 5S18M52S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000288025_ENST00000664814_UNKGENE_lncRNA  1693    255 2S18M55S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU
file.sam:405_1  256 ENSG00000255185_ENST00000534700_PDXDC2P_transcribed-unprocessed-pseudogene  16357   255 7S18M50S    *   0   0   AAAAAAGTGTGTGTGTGTGTATTGTAAACATCCTACACTCTCAGATTTCAGTCACATCTTCCTGCTTTGTCCAGA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:18 XS:i:32 XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:18 YT:Z:UU

file.blast:405_1    ENSG00000055609_ENST00000679882_KMT2C_mRNA  100.00  32  0   0   44  75  1105    1136    1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000682283_KMT2C_mRNA  100.00  32  0   0   44  75  955 986 1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000684550_KMT2C_mRNA  100.00  32  0   0   44  75  1315    1346    1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000683616_KMT2C_mRNA  100.00  32  0   0   44  75  1017    1048    1.2e-08 60.2
file.blast:405_1    ENSG00000290523_ENST00000470054_UNKGENE_lncRNA  100.00  32  0   0   44  75  430 461 1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000681082_KMT2C_mRNA  100.00  32  0   0   44  75  1175    1206    1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000262189_KMT2C_mRNA  100.00  32  0   0   44  75  1172    1203    1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000682916_KMT2C_mRNA  100.00  32  0   0   44  75  106 137 1.2e-08 60.2
file.blast:405_1    ENSG00000055609_ENST00000683490_KMT2C_mRNA  100.00  32  0   0   44  75  1172    1203    1.2e-08 60.2
file.blast:405_1    ENSG00000187172_ENST00000496773_BAGE2_transcribed-unprocessed-pseudogene    100.00  32  0   0   44  75  106 137 1.2e-08 60.2
file.blast:405_1    ENSG00000184992_ENST00000341446_BRI3BP_mRNA 100.00  22  0   0   1   22  5012    5033    0.0044  41.7
file.blast:405_1    MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA    100.00  22  0   0   23  44  1   22  0.0044  41.7
file.blast:405_1    ENSG00000197536_ENST00000337752_IRF1-AS1_lncRNA 100.00  20  0   0   1   20  2077    2096    0.058   38.1
file.blast:405_1    ENSG00000165813_ENST00000369287_CCDC186_mRNA    100.00  20  0   0   7   26  7089    7108    0.058   38.1
file.blast:405_1    ENSG00000286449_ENST00000668926_UNKGENE_lncRNA  100.00  20  0   0   1   20  3274    3293    0.058   38.1
file.blast:405_1    ENSG00000165813_ENST00000648613_CCDC186_mRNA    100.00  20  0   0   7   26  7243    7262    0.058   38.1
file.blast:405_1    ENSG00000167554_ENST00000601151_ZNF610_mRNA 100.00  20  0   0   3   22  2497    2516    0.058   38.1
file.blast:405_1    ENSG00000153930_ENST00000682825_ANKFN1_mRNA 100.00  20  0   0   1   20  8550    8569    0.058   38.1
file.blast:405_1    ENSG00000167554_ENST00000403906_ZNF610_mRNA 100.00  20  0   0   3   22  2695    2714    0.058   38.1
file.blast:405_1    ENSG00000253897_ENST00000517788_UNKGENE_transcribed-unprocessed-pseudogene  100.00  19  0   0   7   25  58019   58037   0.21    36.2
file.blast:405_1    ENSG00000231752_ENST00000458200_EMBP1_transcribed-unprocessed-pseudogene    100.00  19  0   0   7   25  35362   35380   0.21    36.2
file.blast:405_1    ENSG00000186591_ENST00000649897_UBE2H_mRNA  100.00  19  0   0   7   25  1531    1549    0.21    36.2
file.blast:405_1    ENSG00000269821_ENST00000597346_KCNQ1OT1_lncRNA 95.45   22  1   0   1   22  72241   72262   0.21    36.2
file.blast:405_1    ENSG00000231752_ENST00000458200_EMBP1_transcribed-unprocessed-pseudogene    100.00  19  0   0   7   25  36333   36351   0.21    36.2
file.blast:405_1    ENSG00000186591_ENST00000355621_UBE2H_mRNA  100.00  19  0   0   7   25  1878    1896    0.21    36.2
file.blast:405_1    ENSG00000276521_ENST00000615334_UNKGENE_unprocessed-pseudogene  95.24   21  1   0   2   22  777 797 0.74    34.4
file.blast:405_1    ENSG00000240438_ENST00000447585_OFD1P5Y_unprocessed-pseudogene  100.00  18  0   0   3   20  12245   12262   0.74    34.4
file.blast:405_1    ENSG00000242153_ENST00000451061_OFD1P6Y_transcribed-unprocessed-pseudogene  100.00  18  0   0   3   20  29535   29552   0.74    34.4
file.blast:405_1    ENSG00000271519_ENST00000603371_UNKGENE_processed-pseudogene    100.00  18  0   0   3   20  4271    4288    0.74    34.4
file.blast:405_1    ENSG00000233321_ENST00000685074_LINC02669_lncRNA    100.00  18  0   0   3   20  1632    1649    0.74    34.4
file.blast:405_1    MIMAT0000420_miRBase_hsa-miR-30b-5p_microRNA    100.00  18  0   0   23  40  1   18  0.74    34.4
file.blast:405_1    ENSG00000106771_ENST00000374586_TMEM245_mRNA    100.00  18  0   0   1   18  6821    6838    0.74    34.4
file.blast:405_1    ENSG00000233963_ENST00000426792_ATP8A2P3_unprocessed-pseudogene 100.00  18  0   0   8   25  11220   11237   0.74    34.4
file.blast:405_1    ENSG00000006432_ENST00000554752_MAP3K9_mRNA 100.00  18  0   0   3   20  8182    8199    0.74    34.4
file.blast:405_1    ENSG00000175387_ENST00000262160_SMAD2_mRNA  100.00  18  0   0   6   23  12609   12626   0.74    34.4
file.blast:405_1    ENSG00000288025_ENST00000664814_UNKGENE_lncRNA  100.00  18  0   0   3   20  1693    1710    0.74    34.4
file.blast:405_1    ENSG00000255185_ENST00000534700_PDXDC2P_transcribed-unprocessed-pseudogene  100.00  18  0   0   8   25  16357   16374   0.74    34.4

gkudla (Owner) commented Dec 13, 2023 via email

dstrib (Author) commented Dec 13, 2023

Thank you for this clarification.

Given that the two alignments have the same e-value, is this due to BRI3BP_mRNA occurring before hsa-miR-30c-5p in the list of alignments?

Additionally, are there any intermediate files between the .blast file and the initial .hyb file in which the selected segments could be identified?
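To illustrate the tie in question: if hits were ranked by e-value with a stable sort, records with equal e-values would keep their order in the file. Whether hyb breaks ties this way is exactly what is being asked here, so the sketch below only shows what a stable sort would do with three of the records above, not what hyb does.

```python
from itertools import groupby

# Three records from the .blast file, in their original file order.
lines = [
    "405_1 ENSG00000055609_ENST00000262189_KMT2C_mRNA 100.00 32 0 0 44 75 1172 1203 1.2e-08 60.2",
    "405_1 ENSG00000184992_ENST00000341446_BRI3BP_mRNA 100.00 22 0 0 1 22 5012 5033 0.0044 41.7",
    "405_1 MIMAT0000244_miRBase_hsa-miR-30c-5p_microRNA 100.00 22 0 0 23 44 1 22 0.0044 41.7",
]

# (target name, e-value) pairs; column 2 is the target, column 11 the e-value.
hits = [(fields[1], float(fields[10])) for fields in (line.split() for line in lines)]

# sorted() is stable, so hits with equal e-values keep their file order.
ranked = sorted(hits, key=lambda hit: hit[1])

# Collect groups of targets that share an e-value.
tied = [
    [name for name, _ in group]
    for _, group in groupby(ranked, key=lambda hit: hit[1])
]
tied = [names for names in tied if len(names) > 1]

for names in tied:
    print("tied:", names)
```

Under this assumption the BRI3BP hit would be ranked ahead of the miR-30c hit purely because it appears first in the file.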

gkudla (Owner) commented Dec 13, 2023 via email
