simulated thermodynamic model is not recovered after transition analysis #96

Open
snguyen49 opened this issue Aug 22, 2023 · 9 comments
Assignees
Labels
good first issue Good for newcomers invalid This doesn't seem right

Comments

@snguyen49

snguyen49 commented Aug 22, 2023

The diagram does not match the number of states and the state probabilities.

Email: [email protected]

@snguyen49 snguyen49 added the bug Something isn't working label Aug 22, 2023
@mca-sh
Collaborator

mca-sh commented Aug 23, 2023

Hello snguyen49 and thank you for your feedback.

Is the diagram you are talking about the Treilli diagram in the Transition analysis module?
Could you please tell me what you mean by it "does not match" the number of states? Do you mean it does not recover the number of states in the data you've simulated?
Also, just to be sure, the state populations are shown by the sizes of the circles.

I would need a bit more information if you don't mind.
Mélodie

@snguyen49
Author

snguyen49 commented Aug 23, 2023

Yes, for some reason the diagram does not create the same number of circles as the number of requested states.
[Image: Screenshot 2023-08-23 150439]
[Image: Screenshot 2023-08-23 150419]

@mca-sh
Collaborator

mca-sh commented Aug 23, 2023

It seems to be a tricky model.
Could you please tell me the FRET values corresponding to each state? There might be some states that are identical in value and kinetics and are therefore not distinguishable in the dwell time histogram.
I guess you used ML-DPH to assess the number of degenerate states. If so, did you set the parameter Dmax (here: https://rna-fretools.github.io/MASH-FRET/transition-analysis/components/panel-kinetic-model.html#state-degeneracy) to a value large enough?

@snguyen49
Author

snguyen49 commented Aug 23, 2023 via email

@mca-sh
Collaborator

mca-sh commented Aug 24, 2023

Ok I understand. The thing is that ML-DPH needs the degenerate states (states with the same FRET value) to have substantially different lifetimes in order to distinguish them in the dwell time histogram.
Also, what I observed is that the dwell time histogram can have two shapes:

  1. an irreversible loop (--> O1 --> O2 --> O3 --> O4 -->) gives a dwell time histogram with a maximum
  2. any other case gives a sum of exponentials

Good luck with your experiments and thank you again for your feedback!
Mélodie

@snguyen49
Author

snguyen49 commented Aug 24, 2023 via email

@mca-sh mca-sh changed the title MASH-FRET bug report simulated thermodynamic model is not recovered after transition analysis Aug 26, 2023
@mca-sh mca-sh added good first issue Good for newcomers question Further information is requested and removed bug Something isn't working labels Aug 26, 2023
@mca-sh
Collaborator

mca-sh commented Aug 26, 2023

Hi Sydney,
To help you, I need to know the FRET values of the states you defined in the thermodynamic model of the Simulation module. To do this, could you read the "FRETj" values that are displayed for all 5 "state(j)" and report them to me?
It is a whole different problem if there are states that have similar FRET values.

@snguyen49
Author

snguyen49 commented Aug 28, 2023 via email

@mca-sh
Collaborator

mca-sh commented Aug 29, 2023

Ok I see. Don't worry we will try anyway.

First of all, please use the latest version of MASH-FRET (https://github.com/RNA-FRETools/MASH-FRET/archive/refs/heads/master.zip), as I just made some changes to the part you are using.

Based on the results you obtained, I will assume that, of the five states you've simulated, you changed only the FRET value of the first state to FRET_1=1 and that the four others were left at FRET_2,3,4,5=0.

In that case, you recovered only two states that correspond to the two observed FRET states, i.e., FRET=0 and 1.
This would mean that the four degenerate states at FRET=0 were not identified.

The algorithm in MASH that finds the number of degenerate states (i.e., state degeneracy) hidden behind one FRET value is called ML-DPH. ML-DPH analyses the shape of your dwell time histogram at FRET=1 and FRET=0:

  • if no degenerate states are present, the dwell time histogram is described by a single exponential decay,
  • if N degenerate states having sufficiently different lifetimes are present, the dwell time histogram is described by an N-phase-type distribution.
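
To make the two cases above concrete, here is a minimal sketch of a generic discrete phase-type dwell-time distribution. It is not MASH-FRET code: the function name `dph_pmf` and the example lifetimes (about 10 and 200 data points) are assumptions chosen for illustration only.

```python
def dph_pmf(alpha, T, n_max):
    """Return P(dwell = n) for n = 1..n_max as alpha . T^(n-1) . exit.

    alpha: probabilities of entering each hidden (degenerate) state
    T:     per-step probabilities of moving between hidden states
           (the diagonal holds the probability of staying put)
    """
    size = len(alpha)
    # Per-step probability of leaving the whole group of hidden states.
    exit_prob = [1.0 - sum(T[i]) for i in range(size)]
    pmf = []
    occupancy = list(alpha)  # probability of sitting in each hidden state
    for _ in range(n_max):
        pmf.append(sum(occupancy[i] * exit_prob[i] for i in range(size)))
        occupancy = [sum(occupancy[i] * T[i][j] for i in range(size))
                     for j in range(size)]
    return pmf


# No degeneracy: one state with a lifetime of about 10 data points.
single = dph_pmf([1.0], [[0.9]], 2000)

# Two hidden states with the same FRET value, lifetimes of about 10 and 200
# data points, entered with equal probability and not interconverting.
double = dph_pmf([0.5, 0.5], [[0.9, 0.0], [0.0, 0.995]], 2000)

# A single exponential-like decay has a constant ratio between consecutive
# bins; the two-phase histogram does not.
ratio_single = (single[1] / single[0], single[50] / single[49])
ratio_double = (double[1] / double[0], double[50] / double[49])
print(ratio_single, ratio_double)
```

The closer the two lifetimes are, the closer the second histogram gets to a single decay, which is why similar lifetimes are hard to tell apart.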

ML-DPH is very sensitive to:

  • the lifetime gap between the degenerate states, which must be sufficiently large;
  • the proportion of exit transitions (in your case: the transition 0-->1 relative to 0 --> any), which must be sufficient.

What I have observed so far is that the two parameters are interconnected, and not all combinations of values have been tested yet. However, I can give a threshold for two specific conditions in a system having two degenerate states:

  • the lifetimes should differ by at least a factor of 3 when 80% of transitions are 0-->1
  • the proportion of transitions 0-->1 should be at least 20% when the lifetimes differ by a factor of 20.

These thresholds most probably get more demanding with additional degenerate states.
Your system has four degenerate states, which means we will have to be extra careful with these parameters.

Let's have a look at the lifetimes of the degenerate states in your system:

  • tau_2 = 1/(k_21+k_23+k_24+k_25) = 1/(0.1+0.13+0.1+0.1) = 1/0.43 = 2.33 data points
  • tau_3 = 1/(k_31+k_32+k_34+k_35) = 1/(0.1+0.12+0.1+0.1) = 1/0.42 = 2.38 data points
  • tau_4 = 1/(k_41+k_42+k_43+k_45) = 1/(0.1+0.12+0.13+0.1) = 1/0.45 = 2.22 data points
  • tau_5 = 1/(k_51+k_52+k_53+k_54) = 1/(0.1+0.12+0.13+0.1) = 1/0.45 = 2.22 data points
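
The lifetime arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the rate constants are the ones quoted in the sums above, the row for state 1 is not given in this thread and is therefore left out, and the variable names are illustrative.

```python
# Off-diagonal rate constants (per data point) for the four FRET=0 states,
# keyed as rates[i][j] = k_ij, taken from the sums quoted above.
rates = {
    2: {1: 0.1, 3: 0.13, 4: 0.1, 5: 0.1},
    3: {1: 0.1, 2: 0.12, 4: 0.1, 5: 0.1},
    4: {1: 0.1, 2: 0.12, 3: 0.13, 5: 0.1},
    5: {1: 0.1, 2: 0.12, 3: 0.13, 4: 0.1},
}

# Lifetime of state i = 1 / (sum of all rates leaving state i).
lifetimes = {i: 1.0 / sum(row.values()) for i, row in rates.items()}

# Gap between the longest and the shortest lifetime of the degenerate states.
lifetime_spread = max(lifetimes.values()) / min(lifetimes.values())

for state, tau in lifetimes.items():
    print(f"tau_{state} = {tau:.2f} data points")
print(f"largest/smallest lifetime = {lifetime_spread:.2f}")
```

The ratio between the longest and shortest lifetime comes out close to 1, far below the factor of 3 mentioned earlier for the two-state case.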

The first thing we notice is that the lifetimes are very short. In a real experiment, they would correspond to a time resolution too low for the system under study. The second thing is that your degenerate states have very similar (or identical) lifetimes, which are far from satisfying the requirements of ML-DPH.

Now let's have a look at the proportion of transitions 0 --> 1, w_i1, for each of the four degenerate states:

  • w_21 = k_21/(k_21+k_23+k_24+k_25) = 0.1/(0.1+0.13+0.1+0.1) = 23.3%
  • w_31 = k_31/(k_31+k_32+k_34+k_35) = 0.1/(0.1+0.12+0.1+0.1) = 23.8%
  • w_41 = k_41/(k_41+k_42+k_43+k_45) = 0.1/(0.1+0.12+0.13+0.1) = 22.2%
  • w_51 = k_51/(k_51+k_52+k_53+k_54) = 0.1/(0.1+0.12+0.13+0.1) = 22.2%
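
The same proportions can be checked in code. Again a sketch only: `exit_fraction` is a hypothetical helper, and the rate constants are those quoted in the list above.

```python
def exit_fraction(row, target):
    """Fraction of transitions leaving a state that go to `target`."""
    return row[target] / sum(row.values())


# rates[i][j] = k_ij (per data point) for the four FRET=0 states.
rates = {
    2: {1: 0.1, 3: 0.13, 4: 0.1, 5: 0.1},
    3: {1: 0.1, 2: 0.12, 4: 0.1, 5: 0.1},
    4: {1: 0.1, 2: 0.12, 3: 0.13, 5: 0.1},
    5: {1: 0.1, 2: 0.12, 3: 0.13, 4: 0.1},
}

# Proportion of 0 --> 1 transitions for each degenerate state.
w_to_1 = {i: exit_fraction(row, 1) for i, row in rates.items()}

for i, w in w_to_1.items():
    print(f"w_{i}1 = {100 * w:.1f}%")
```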

These proportions would be satisfactory for a system with two degenerate states, but most probably not with four.

So let's try to define a 5-state transition rate matrix that would satisfy ML-DPH.
Just to be safe, we will set the proportion of transitions 0 --> 1 to 80% and use a factor of 5 for the lifetime gap.
Using a suitable number of data points for the shortest lifetime, we get:

  • tau_2' = 10 data points
  • tau_3' = 50 data points
  • tau_4' = 250 data points
  • tau_5' = 1250 data points
  • and let's set tau_1' = 50 data points

And finally, let's calculate the rate constants corresponding to these parameters.
If we assume an equal partition among the 0 --> 0 transitions and among the 1 --> 0 transitions, we get the following proportions of transitions:

  • w_21 = w_31 = w_41 = w_51 = 80% = 0.800
  • w_23 = w_24 = w_25 = w_32= w_34 = w_35 = w_42 = w_43 = w_45 = w_52 = w_53 = w_54 = (1-0.8)/3 = 0.067
  • w_12 = w_13 = w_14 = w_15 = 1/4 = 0.250

These can be converted into rate constants as follows:



k_ij' = w_ij'/tau_i'   (row i = starting state, column j = destination state)

      = w_11'/tau_1'  w_12'/tau_1'  w_13'/tau_1'  w_14'/tau_1'  w_15'/tau_1'
        w_21'/tau_2'  w_22'/tau_2'  w_23'/tau_2'  w_24'/tau_2'  w_25'/tau_2'
        w_31'/tau_3'  w_32'/tau_3'  w_33'/tau_3'  w_34'/tau_3'  w_35'/tau_3'
        w_41'/tau_4'  w_42'/tau_4'  w_43'/tau_4'  w_44'/tau_4'  w_45'/tau_4'
        w_51'/tau_5'  w_52'/tau_5'  w_53'/tau_5'  w_54'/tau_5'  w_55'/tau_5'

      = 0.000/50    0.250/50    0.250/50    0.250/50    0.250/50
        0.800/10    0.000/10    0.067/10    0.067/10    0.067/10
        0.800/50    0.067/50    0.000/50    0.067/50    0.067/50
        0.800/250   0.067/250   0.067/250   0.000/250   0.067/250
        0.800/1250  0.067/1250  0.067/1250  0.067/1250  0.000/1250

      = 0.00000  0.00500  0.00500  0.00500  0.00500
        0.08000  0.00000  0.00667  0.00667  0.00667
        0.01600  0.00133  0.00000  0.00133  0.00133
        0.00320  0.00027  0.00027  0.00000  0.00027
        0.00064  0.00005  0.00005  0.00005  0.00000


Provided the calculations are correct, this could be a suitable transition rate matrix to use with FRET_1 = 1 and FRET_2 to _5 = 0. You can give it a try and play around with these parameters.
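
For anyone who wants to play with other lifetimes or proportions, the conversion k_ij' = w_ij'/tau_i' can be scripted. The sketch below uses the lifetimes and the 80% proportion proposed above; the variable names are illustrative, and index 0 stands for state 1.

```python
taus = [50, 10, 50, 250, 1250]   # tau_1' to tau_5', in data points
n = len(taus)
w_to_fret1 = 0.8                 # proportion of 0 --> 1 transitions

# Build the matrix of transition proportions w_ij' (diagonal stays at 0).
w = [[0.0] * n for _ in range(n)]
for j in range(1, n):
    w[0][j] = 1.0 / (n - 1)                  # state 1 splits equally: 1/4
for i in range(1, n):
    w[i][0] = w_to_fret1                     # 0 --> 1
    for j in range(1, n):
        if j != i:
            w[i][j] = (1.0 - w_to_fret1) / (n - 2)   # 0 --> 0, split in 3

# Rate constants in units per data point: k_ij' = w_ij' / tau_i'.
k = [[w[i][j] / taus[i] for j in range(n)] for i in range(n)]

for row in k:
    print("  ".join(f"{value:.5f}" for value in row))
```

Each row of `k` sums to 1/tau_i', which is a quick way to check that a hand-edited matrix still encodes the intended lifetimes.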

Some additional remarks:

  • Rate constants are given in units per data point, as your interface is calibrated (menu Units > Time > in sampling steps (frames)).
  • Be sure the length of your simulated trajectories is large enough to collect a sufficient number of dwell times. A rule of thumb would be to use a length of J times the largest state lifetime, with J being the total number of states in your system.
  • If you are using the simulated FRET state trajectories directly, you can set the ML-DPH parameter bin to 1, as you don't have to correct for the time resolution of discretization algorithms.
  • The connectivity of the states in the diagram is obtained with a hierarchical search: it aims to yield the simplest scheme that is equivalent to the ground truth. This is why it will show simpler state connectivities than the simulated ones.
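
To get a feeling for the trajectory-length rule of thumb above, one can simulate a state sequence outside MASH-FRET and count how many dwells each state collects. The sketch below is not the MASH-FRET simulator: `simulate_states` and `count_dwells` are hypothetical helpers, and the rate constants are used directly as per-step transition probabilities, which is only a first-order approximation valid for small rates.

```python
import random


def simulate_states(k, n_steps, seed=0):
    """Simulate a state sequence, treating k[i][j] as per-step probabilities."""
    rng = random.Random(seed)
    state, traj = 0, []
    for _ in range(n_steps):
        traj.append(state)
        u, cumulative = rng.random(), 0.0
        for j in range(len(k)):
            if j == state:
                continue
            cumulative += k[state][j]
            if u < cumulative:   # jump to state j, otherwise stay
                state = j
                break
    return traj


def count_dwells(traj, n_states):
    """Count how many separate dwells each state has in the sequence."""
    counts, previous = [0] * n_states, None
    for s in traj:
        if s != previous:
            counts[s] += 1
            previous = s
    return counts


# Rate matrix proposed above (units per data point, diagonal set to 0).
k = [
    [0.0, 0.005, 0.005, 0.005, 0.005],
    [0.08, 0.0, 0.00667, 0.00667, 0.00667],
    [0.016, 0.00133, 0.0, 0.00133, 0.00133],
    [0.0032, 0.00027, 0.00027, 0.0, 0.00027],
    [0.00064, 0.00005, 0.00005, 0.00005, 0.0],
]
taus = [50, 10, 50, 250, 1250]

# Rule of thumb: J times the largest lifetime, J = number of states.
n_steps = len(taus) * max(taus)
traj = simulate_states(k, n_steps)
visits = count_dwells(traj, len(taus))
print(n_steps, visits)
```

Changing `seed` or `n_steps` shows how the number of dwells collected for the longest-lived state varies from one trajectory to the next.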

I hope it was not too confusing.
Do not hesitate to ask for clarification; I will try my best.

Mélodie

@mca-sh mca-sh added invalid This doesn't seem right and removed question Further information is requested labels Aug 29, 2023
3 participants