Dealing with Clock Wander #2
Comments
I tried simply finding the slew rate over time and then slowly adjusting to back it out, but I don't know if I'm doing it right. Do you think you could extend your algorithm to be more general, to try to determine how all that goes? Additionally, can you try verifying the send time using your algorithm, or rather, make sure that the send time does not have a great deal of jitter in it? It looks like everything else you have here is great for results.
Ok, I have some updated data here: https://dan.drown.org/clocks/ The data and tools I used are here: https://github.com/ddrown/esp8266rawpackets-proc Instead of a full PID controller, I'm just calculating rate differences and applying those. The remaining offsets are from one of: receiver jitter, transmitter jitter, or fast clock frequency changes. I'm feeding 32 samples at a time, which works out to about a second and a half worth of data at 22 packets per second. The end result was: 50% of the time, all clocks were within +14ns -7ns (ignoring phase differences due to propagation delay). 98% of the time, all clocks were within +208ns -135ns. +/-10ns is about +/-10ft so that might be the accuracy limit.
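As a rough illustration of the rate-difference step described above (not the actual code in esp8266rawpackets-proc), the sketch below averages offsets in groups of 32 samples and takes the slope between consecutive group means as the rate difference. The function names, the use of group means, and the idea of feeding it `(time, offset)` lists are assumptions made for illustration only.

```python
def group_means(times, offsets, group=32):
    """Average time and offset over consecutive groups of `group` samples.

    Trailing samples that do not fill a whole group are ignored.
    """
    means = []
    for start in range(0, len(times) - group + 1, group):
        t = times[start:start + group]
        o = offsets[start:start + group]
        means.append((sum(t) / group, sum(o) / group))
    return means


def rate_differences(means):
    """Rate difference (offset change per unit time) between adjacent groups."""
    return [
        (o1 - o0) / (t1 - t0)
        for (t0, o0), (t1, o1) in zip(means, means[1:])
    ]
```

At 22 packets per second, each group of 32 covers about 1.45 seconds, so a rate estimate of this kind is refreshed roughly every second and a half.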
Does the data seem centered around the expected locations of the target ESPs (and differential receive times, i.e. diagonal nodes have a 10' difference)? Coincidence? Additionally, can you zoom in on your last two graphs? The data looks /really/ good! It looks like, given enough data, it should center around the expected locations.
I just can't get over how good those last few graphs look, and really hope to be able to zoom in on them!
Ok, I added a second series of graphs showing the 250ns..-250ns range. I also added a histogram series: https://dan.drown.org/clocks/ Clock sync has two pieces: phase and frequency. This is just the frequency part; the phase differences aren't handled yet.
EDITED Hmm... Your results are much, much better than mine. I don't know how you got everything to match the skew so well. Considering light travels at ~1ft/ns (which is why I use feet for this sort of stuff), those results look /really/ good. What do you suppose causes the periodic groupings of several like-packets? In all of my analysis, I was seeing random meandering and many, many outliers. You still have outliers, but you also seem to have bunches of groups of data within the 99th percentile and outside the 25th percentile. Any idea what to attribute that bunching to? I really can't wait to see what happens when you start to correlate this, i.e. use each node as a master and correlate the time differences. Actually... that would give you a better time-density, so there would be less drift/shift between time syncs. Right now it's 30-50ms between packets being sent; if you use all the node tx's, it could go down to ~10ms between syncs to arbitrary nodes. I wonder if that would be much better? Charles
The grouping/bunching is probably an artifact of how I'm doing clock sync. I'm not limiting changes from one group of 32 to the next, so a high/low average can throw the whole group off. The next thing I want to do is apply this clock sync to the data from the other transmitters and see if those offsets are the expected values. The change in distance should show as a straight translation up or down on these graphs (but remain as a straight horizontal line).
That would be awesome. Is there any way you can "window" the groups, i.e. each one calculates for the next 32, and throw out any outlier syncs? But yes! Keep going!
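One way the windowing and outlier-rejection idea could look is sketched below: each rate estimate is replaced by the mean of the most recent few estimates, after discarding values that sit far from the window's median. The median-absolute-deviation test, the window length, and the threshold `k` are all assumptions chosen for illustration, not something tested against the real data.

```python
from statistics import median


def reject_outliers(values, k=3.0):
    """Keep values within k median-absolute-deviations of the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [v for v in values if abs(v - med) <= k * mad]


def smoothed_rates(rates, window=4, k=3.0):
    """For each estimate, average the recent window after outlier rejection."""
    out = []
    for i in range(len(rates)):
        recent = rates[max(0, i - window + 1):i + 1]
        kept = reject_outliers(recent, k)
        out.append(sum(kept) / len(kept))
    return out
```

Because at least half of the values in any window lie within one median absolute deviation of the median, `kept` is never empty for `k >= 1`, so the average is always defined.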
You guys rock. https://www.cs.umd.edu/class/spring2010/cmsc818g/slides/2010-03-25-TimeBasedLocation.pdf
Ok, here's another set of graphs: https://dan.drown.org/clocks/index2.html I used the time and frequency data from the first set, which uses .241 as the phase and frequency reference. I applied those corrections to each module's local clock and calculated the offsets of the other transmitters. An interesting pattern shows up in this data: .241 is around 38 microseconds higher (2 times higher) than the other modules. I believe this is due to tx and rx delays. The local timestamps on each module are relative to: The rebroadcast timestamps (these graphs) can be calculated as: So, for the .179 transmitter this looks like: This leads to .179->.241 having 2 * (txdelay + rxdelay), while the other paths cancel out one set of txdelay + rxdelay (on average, as txdelay + rxdelay isn't a static number). So I believe txdelay + rxdelay ~= 38 microseconds.
I can believe that's about the right number. My fear is that tx can't be trusted AT ALL. It sounds like you've confirmed those fears.
I am bookmarking it and will read it more tomorrow.
Copied from YouTube comment:
I took toprecorder/data10.txt data and looked specifically at offset and frequency differences between all the clocks:
Using .241's broadcasts as the "master" clock:
Removing the average frequency differences and the two clock jumps, I get this graph, which shows the clock wander: https://dan.drown.org/clocks/data10.png
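For readers who want to try something similar, here is a minimal sketch of one way to strip clock jumps and the average frequency difference from an offset series so that only the wander remains. It is not the procedure used to produce the graph above; the jump threshold, the function names, and the choice of a single least-squares line are assumptions.

```python
def remove_jumps(offsets, threshold):
    """Subtract step changes larger than `threshold` between adjacent samples."""
    cleaned = [offsets[0]]
    shift = 0.0  # accumulated size of all jumps seen so far
    for prev, cur in zip(offsets, offsets[1:]):
        step = cur - prev
        if abs(step) > threshold:
            shift += step
        cleaned.append(cur - shift)
    return cleaned


def detrend(times, offsets):
    """Remove the least-squares line (average frequency difference)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    slope = cov / var_t
    return [o - (mean_o + slope * (t - mean_t)) for t, o in zip(times, offsets)]
```

Note that `remove_jumps` discards the whole step, including whatever ordinary drift happened in that one interval, so the threshold should be much larger than the normal sample-to-sample change.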
Maybe trying to insulate the esp8266's from any airflow would lower their temperature changes, which should lower their clock wander.
Also, maybe using a PID control loop on each node would work to sync the frequencies. This is what I've done with NTP on the esp8266 along those lines: https://github.com/ddrown/Arduino_ClockPID
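To make the control-loop idea concrete, the following is a small simulation of a proportional-integral loop steering a clock with a fixed frequency error. It is not taken from Arduino_ClockPID; the class name, the gains, the one-second update interval, and the 20 ppm error are all assumptions picked so the example converges.

```python
class PIController:
    """Proportional-integral controller producing a frequency correction."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0

    def update(self, offset):
        # Accumulate the error so a constant frequency error is fully cancelled.
        self.integral += offset
        return self.kp * offset + self.ki * self.integral


def simulate(freq_error_ppm=20.0, steps=200, kp=0.5, ki=0.05):
    """Return the offset history (microseconds), one sample per second."""
    ctrl = PIController(kp, ki)
    offset = 0.0      # local clock minus reference, in microseconds
    correction = 0.0  # applied frequency correction, in ppm
    history = []
    for _ in range(steps):
        # 1 ppm of uncorrected frequency error adds 1 microsecond per second.
        offset += freq_error_ppm - correction
        correction = ctrl.update(offset)
        history.append(offset)
    return history
```

With these gains the offset grows for the first few seconds and then decays toward zero as the integral term learns the frequency error. Real clocks also wander with temperature, so the gains would need tuning against measured data.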
NTP uses round trip time to try to eliminate the phase offset due to one way latency. I'm not sure that would be needed for this application. Knowing the distances between the fixed points should make it possible to cancel out those terms in the equation.
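For comparison, the sketch below shows the standard NTP offset and delay calculation from four timestamps next to a one-way estimate that subtracts a known propagation time instead. The one-way function, its name, and the assumption that distance is the only path delay (ignoring tx and rx processing delays) are illustrative simplifications.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP estimate from a request/response exchange.

    t1: client send, t2: server receive, t3: server send, t4: client receive.
    Returns (offset of server clock relative to client, round-trip delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def one_way_offset(t_tx, t_rx, distance_m):
    """Receiver-minus-transmitter clock offset when the distance is known."""
    propagation = distance_m / SPEED_OF_LIGHT_M_PER_S
    return (t_rx - t_tx) - propagation
```

The NTP form assumes the path delay is the same in both directions; the one-way form drops that assumption but needs the geometry, which is available when the nodes sit at fixed, surveyed points.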
Lastly, the rx and tx timestamp accuracy will add errors as well, but I haven't measured how accurate they are.