fs-storage: Millisecond timestamps and proper rounding #37

Comments
I think this is a brilliant idea. Very straightforward to implement as well.
In my opinion, this seems like a rare case that we can overlook.
I think microseconds or milliseconds should be enough. We should also account for the fact that we only encounter this particular error either on the open PR or in ARK-Builders/arklib#87 (comment), on macOS. I notice that our CI doesn't detect these issues either. We should really include macOS CI tests and run them with each PR.
I agree that including milliseconds will significantly reduce conflicts. It's very unlikely (probably impossible) that a user will edit/update the same file multiple times within the same millisecond, so this solution fits human-interaction-based use cases well. For systems that use the file system in an automated way, like a server (rather than through a file manager such as Nautilus), atomic versioning can be used instead. It's also best to avoid nanoseconds, since different OSes have different levels of support: https://doc.rust-lang.org/std/time/struct.SystemTime.html#platform-specific-behavior
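For illustration, a minimal Rust sketch of working at millisecond granularity; the function name is ours, not part of fs-storage:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Truncate a `SystemTime` to whole milliseconds since the Unix epoch.
/// Unlike nanoseconds, millisecond precision is available on every
/// platform that `SystemTime` supports.
fn to_millis(t: SystemTime) -> u128 {
    t.duration_since(UNIX_EPOCH)
        .expect("timestamp predates the Unix epoch")
        .as_millis()
}

fn main() {
    let a = SystemTime::now();
    let b = SystemTime::now();
    // Two events can only be told apart at this granularity
    // if their millisecond counts differ.
    println!("a = {} ms, b = {} ms", to_millis(a), to_millis(b));
}
```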
(Issue renamed from "fs-storage: preventing missed updates" to "fs-storage: Millisecond timestamps and proper rounding".)
Now it seems the problem is not just related to rounding but more about the semantics around syncing. We have two timestamps here: one in the metadata of the physical file stored on disk (T2), and one recorded when the in-memory key-value mapping is updated (T1). So we have three cases here:
Case 1: The most common flow will be where new entries are added to the mapping, making T1 > T2; then the mapping is written to disk, making T2 == T1.

Case 2: However, there can be cases where the mapping is updated and the underlying file is also updated. This case will need a full sync that uses the monoid implementation to merge the data (see the sketch below). However, this is tricky because this can be triggered by …
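Since the full sync relies on a merge that behaves like a monoid, here is a rough sketch of what that merge could look like; the `Monoid` trait and set-valued entries are illustrative assumptions, not the actual fs-storage types:

```rust
use std::collections::{BTreeMap, BTreeSet};

// Illustrative merge operation: associative, so merging disk and memory
// state in any grouping yields the same result. (The identity element of
// a full monoid is omitted for brevity.)
trait Monoid {
    fn combine(a: &Self, b: &Self) -> Self;
}

// Example instance: set-valued entries merged by union, so concurrent
// additions from both sides survive the merge.
impl Monoid for BTreeSet<String> {
    fn combine(a: &Self, b: &Self) -> Self {
        a.union(b).cloned().collect()
    }
}

// Full sync: start from the on-disk mapping and fold in the in-memory
// one, combining values for keys present on both sides.
fn full_sync<V: Monoid + Clone>(
    disk: &BTreeMap<String, V>,
    memory: &BTreeMap<String, V>,
) -> BTreeMap<String, V> {
    let mut merged = disk.clone();
    for (k, v) in memory {
        merged
            .entry(k.clone())
            .and_modify(|existing| *existing = V::combine(existing, v))
            .or_insert_with(|| v.clone());
    }
    merged
}

fn main() {
    let disk = BTreeMap::from([("cat.jpg".to_string(), BTreeSet::from(["tag-a".to_string()]))]);
    let memory = BTreeMap::from([("cat.jpg".to_string(), BTreeSet::from(["tag-b".to_string()]))]);
    // Both sides updated the same key; neither update is lost.
    assert_eq!(full_sync(&disk, &memory)["cat.jpg"].len(), 2);
}
```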
We have no way of differentiating between case 1 and case 2, so all syncs will need to be full syncs, which will be very inefficient. We want to be able to differentiate case 1 and case 2, so that we can use the more efficient write-to-disk path for case 1 (one possible shape is sketched below).

Possible solutions

Now we have the following cases: …

What are your thoughts?
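For concreteness, a hypothetical sketch of how the T1/T2 comparison could drive the sync decision. Since the list of solutions above is cut off, the extra `last_written` timestamp is our assumption (one way to tell the cases apart), and none of these names are the actual fs-storage API:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

// Which way a sync needs to go. All names here are illustrative only.
enum SyncAction {
    WriteToDisk, // case 1: only the in-memory mapping changed
    FullSync,    // case 2: the file on disk changed too, merge needed
    UpToDate,    // nothing changed since the last write
}

// Assumed discriminator: besides T1 and T2, remember `last_written`,
// the moment we ourselves last wrote the mapping to disk. If the file's
// mtime is newer than that, somebody else touched the file.
fn decide(path: &Path, t1: SystemTime, last_written: SystemTime) -> io::Result<SyncAction> {
    let t2 = fs::metadata(path)?.modified()?; // T2: on-disk mtime
    Ok(if t2 > last_written {
        SyncAction::FullSync
    } else if t1 > t2 {
        SyncAction::WriteToDisk
    } else {
        SyncAction::UpToDate
    })
}

fn main() -> io::Result<()> {
    let path = Path::new("mapping.bin"); // hypothetical storage file
    fs::write(path, b"serialized mapping")?;
    let last_written = SystemTime::now();
    let t1 = SystemTime::now(); // pretend the mapping changed just now
    match decide(path, t1, last_written)? {
        SyncAction::FullSync => println!("disk changed: full merge sync"),
        SyncAction::WriteToDisk => println!("only memory changed: plain write"),
        SyncAction::UpToDate => println!("nothing to do"),
    }
    Ok(())
}
```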
Context: #23 (comment)
We might need more granular timestamps than just seconds. Let's consider such a scenario:

- The scan is performed at the moment `HH:MM:ss:xxx`, where `xxx` is the milliseconds part.
- `cat.jpg` is modified at the moment `HH:MM:ss:yyy`.

If we truncate `xxx` and `yyy`, then on the next scan we'll skip `cat.jpg` and miss the update. We performed the scan at the moment `HH:MM:ss` and the file was modified at `HH:MM:ss`, so this doesn't look like an update.

As a workaround, it seems possible to round `HH:MM:ss:xxx` always to the lowest side when we memoize the timestamp of the latest scan, and to round `HH:MM:ss:yyy` always to the highest side when we retrieve the timestamp of file modification (see the sketch below). This would result in excessive updates for the files which were updated in the same second when the scan happened, but it should be better than losing an update.

With this logic implemented, we control the amount of excessive scans by altering the timestamp granularity. What is the smallest division of time that makes sense for file systems? Nanoseconds seem to be too low, but what about milliseconds?
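A sketch of that opposite-direction rounding, with the granularity as the tunable knob; the constant and function names are ours, not fs-storage's:

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Tunable granularity from the paragraph above: 1000 ms reproduces the
// whole-second scenario; shrink it to trade excessive updates for
// finer timestamps.
const GRANULARITY_MS: u64 = 1000;

// Round DOWN: used when memoizing the timestamp of the latest scan.
fn floor_ts(t: SystemTime) -> u64 {
    let ms = t.duration_since(UNIX_EPOCH).unwrap().as_millis() as u64;
    ms - ms % GRANULARITY_MS
}

// Round UP: used when reading a file's modification time.
fn ceil_ts(t: SystemTime) -> u64 {
    let ms = t.duration_since(UNIX_EPOCH).unwrap().as_millis() as u64;
    ms.div_ceil(GRANULARITY_MS) * GRANULARITY_MS
}

// Ties count as updates: an excessive re-scan is cheaper than a missed one.
fn needs_rescan(last_scan: SystemTime, file_mtime: SystemTime) -> bool {
    ceil_ts(file_mtime) >= floor_ts(last_scan)
}

fn main() {
    // Scan at HH:MM:ss.123, cat.jpg modified at HH:MM:ss.456 — the same
    // second, which plain truncation would miss.
    let scan = UNIX_EPOCH + Duration::from_millis(1_000_123);
    let mtime = UNIX_EPOCH + Duration::from_millis(1_000_456);
    assert!(needs_rescan(scan, mtime));
}
```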