Feature Request: Consistent access to genome ranges #5

mnshgl0110 · 2022-11-08T17:11:19Z

Currently, the pansyn object treats the reference genome differently than the other genomes. Consequently, different steps are required to access the coordinates of the reference and of the other genomes.

print(df.iloc[0][0])
Pansyn(Range(chm13, NC_060946.1, NaN, 10588391, 12801843), {'mat': Range(mat, CM039032.1, NaN, 3783261, 5889298), 'pat': Range(pat, CM039055.1, NaN, 1, 1723416)})

Is there any specific reason for this? If not, wouldn't it be better to treat all genomes alike and allow a consistent parsing scheme? I guess this would also be useful when we start identifying crossyn between query genomes only.
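
To make the request concrete, here is a minimal sketch of what uniform access could look like. The `Range` and `Pansyn` classes below are simplified stand-ins written for this example, not the library's actual classes; the field names (`org`, `chrom`, `hap`, `start`, `end`, `ref`, `ranges_dict`) are guesses based on the printed output above, and `iter_ranges` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Range:
    # Stand-in for the real Range; field names are assumed from the repr.
    org: str
    chrom: str
    hap: Optional[str]
    start: int
    end: int


@dataclass
class Pansyn:
    # Stand-in for the real Pansyn: one reference Range plus a dict of others.
    ref: Optional[Range]
    ranges_dict: Dict[str, Range] = field(default_factory=dict)


def iter_ranges(p: Pansyn) -> Iterator[Tuple[str, Range]]:
    """Yield (organism, Range) for every genome, reference first if present."""
    if p.ref is not None:
        yield p.ref.org, p.ref
    yield from p.ranges_dict.items()


p = Pansyn(
    Range("chm13", "NC_060946.1", None, 10588391, 12801843),
    {
        "mat": Range("mat", "CM039032.1", None, 3783261, 5889298),
        "pat": Range("pat", "CM039055.1", None, 1, 1723416),
    },
)

# The same loop now covers the reference and the query genomes.
for org, rng in iter_ranges(p):
    print(org, rng.chrom, rng.start, rng.end)
```

With a helper like this, downstream code would not need to special-case the reference when reading coordinates.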


lrauschning · 2022-11-11T16:04:56Z

Hi Manish,
the reason for this is that the reference genome has a special role in the synteny intersection algorithm, as it's the genome the regions are "joined" on.
The way I thought of handling reference-free multisynteny calling would be to leave the reference field as None. That will not work with the intersection algorithm as currently implemented, since it is inherently reference-based, but it would be possible to adapt it.
I've been thinking of writing an associated function that shifts the reference Range into the main dict and leaves the reference field empty, to transition seamlessly between reference-based and non-reference-based identification.
A reverse function taking an organism from the dict and shifting it to the ref field could then be used for adapting to reference-free synteny intersection, perhaps imputing the cigar strings for each entry in the Cigars dict.
Do you think this approach makes sense?
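
A rough sketch of the two shifting functions, to illustrate the idea rather than the eventual implementation. `Range` and `Pansyn` are simplified stand-ins with assumed field names, `ref_to_dict` and `dict_to_ref` are hypothetical names, and the handling of cigar strings is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Range:
    # Stand-in for the real Range; field names are assumptions.
    org: str
    chrom: str
    hap: Optional[str]
    start: int
    end: int


@dataclass
class Pansyn:
    # Stand-in for the real Pansyn; cigar information is omitted here.
    ref: Optional[Range]
    ranges_dict: Dict[str, Range] = field(default_factory=dict)


def ref_to_dict(p: Pansyn) -> Pansyn:
    """Move the reference Range into the dict and leave the ref field as None."""
    if p.ref is None:
        return p  # already reference-free
    merged = {p.ref.org: p.ref, **p.ranges_dict}
    return Pansyn(ref=None, ranges_dict=merged)


def dict_to_ref(p: Pansyn, org: str) -> Pansyn:
    """Promote one organism from the dict to the ref field."""
    if p.ref is not None:
        raise ValueError("ref field is already set; call ref_to_dict first")
    rest = dict(p.ranges_dict)
    new_ref = rest.pop(org)  # raises KeyError if org is not in the dict
    return Pansyn(ref=new_ref, ranges_dict=rest)


p = Pansyn(
    Range("chm13", "NC_060946.1", None, 10588391, 12801843),
    {
        "mat": Range("mat", "CM039032.1", None, 3783261, 5889298),
        "pat": Range("pat", "CM039055.1", None, 1, 1723416),
    },
)

free = ref_to_dict(p)               # all three genomes now live in the dict
rebased = dict_to_ref(free, "mat")  # 'mat' takes over the reference role
```

Both functions return a new object instead of mutating the input, so a reference-based and a reference-free view of the same region can coexist.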
