-
-
Notifications
You must be signed in to change notification settings - Fork 706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make a distinction between content and metadata changes #260
Comments
I think it's better to split the 'modified' event into two events. Has to be investigated if this can be implemented on all platforms first. |
Filtering at a lower level should improve performance as the OS will send less events and watchdog will have to filter/ignore fewer events. Maybe there is a way to group events and then pass the groups to subscribe
|
Why do you treat both contents and attribute change the same way? watchdog/src/watchdog/observers/inotify.py Lines 157 to 162 in 5efb90e
Should I submit a pull request and fix this? |
@GreatBahram yes go ahead. Introduce new events for attributes changes. I guess such change must be done on all platforms. |
Thanks, I'll fix it then. |
I like the idea! Though it may be difficult to implement for the Windows and Polling backends: Polling observer: we cannot rely on the mtime to detect file modifications (even though we currently do...) since it can be changed by the user. However, the ctime is incremented whenever the content or metadata changes, making it impossible to distinguish between the two without reverting to something like content hashes. Windows: According to the docs here, events that are written to the buffer by If this cannot be implemented on all backends, I am wondering if it still makes sense to implement for inotify and fsevents. As it stands, there already are many (undocumented) differences between backends and I would rather strive toward uniform behaviour, even if it means that the API is limited by the least versatile backend. But of course this is not my decision to make. @BoboTiG, what do you think? |
I am 100% OK with you @samschott. I was hoping we could have universal new events. If it is not possible, I would rather not make APIs more and more divergent. |
After consideration, WDYT of simply adding I am thinking of that because it is useful to be able to filter on such changes. |
I think it's better to expose these features, What's the final decision here? |
@BoboTiG makes the calls, I am merely an opinionated contributor :) I do agree that it would be a great feature to have and I'm not saying that's impossible on Windows (I have very little experience there). I just have a slight preference towards a uniform API, or at last a clear documentation of the differences between backends. But documentation is another issue by itself... |
@GreatBahram go with |
Hello @BoboTiG, I've read the source code and made some changes to support the attrib event. And I realized with the current implementation; it is possible to have the attrib event on all platforms except for windows which I'm still struggling with... And I was able to test it on Linux (notify) and FreeBSD (kqueue), but I couldn't find any way to run mac os on a virtual machine, and I'm wondering is there any simple way to do this? If you have any ideas, please let me know. There were two fsevents modules, and I searched through Github issues and spotted some issues about making fsevents2 the default observer on mac os. So I only add the changes to the
class FileAttribEvent(FileSystemEvent):
"""
File system event representing file metadata modification on the file system.
"""
event_type = EVENT_TYPE_ATTRIB
class DirAttribEvent(FileSystemEvent):
"""
File system event representing directory metadata modification on the file system.
"""
event_type = EVENT_TYPE_ATTRIB
is_directory = True
class FileSystemHandler:
...
def on_attrib(self, event):
"""Called when a file or directory metadata is modified.
:param event:
Event representing file/directory metadata modification.
:type event:
:class:`FileAttribEvent` or :class:`DirAttribEvent`
"""
class InotifyEmitter:
....
elif event.is_attrib:
cls = DirAttribEvent if event.is_directory else FileAttribEvent
self.queue_event(cls(src_path))
class KqueueEmitter:
....
elif is_attrib_modified(kev):
if descriptor.is_directory:
yield DirAttribEvent(src_path)
else:
yield FileAttribEvent(src_path)
clsss FSEventsEmitter:
....
elif event.is_inode_meta_mod or event.is_xattr_mod:
cls = DirAttribEvent if event.is_directory else FileAttribEvent
self.queue_event(cls(event.path)) And I've also updated the def on_attrib(self, event):
super().on_attrib(event)
what = 'directory' if event.is_directory else 'file'
self.logger.info("Attrib %s: %s", what, event.src_path) I know it might seem like breaking the uniformity across the project, but I think the community will accept this feature because it is not a big deal and make sense to have a separate function for this event. At the same time, there is no burden on windows developers because even if we cannot support this feature, using Please let me know if you think this is ok or should I stick with the And Do you have any idea how I should test the fsevents module while I don't have access to a mac os computer? I couldn't even find a proper virtual machine image for it... Thanks |
I've checked the windows library itself, and also this go library which is a cross-platform, and it seems to me it is not possible to get attrib event inside the windows platform. If you have any points or comment on this, please let me know. Finally, If you agree on the implementation, I want to add some test cases to cover these new notifications. |
Nice work! Let's move forward with your idea :) It will be a breaking patch, so we will just bump the major version number. Couple of notes:
And a big thank you for your time :) |
For Windows, it might not be possible to differentiate attribute and content changes, but it IS possible to filter which events you want. I'm getting "change" events for a directory of images, that I'm pretty sure are actually just the "Last Accessed" date being updated on read. https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw Throwing together a quick C# test shows that with the default "everything" filter, I get the same kind of behaviour as watchdog, but with LastAccess removed from the filter, I get what I really wanted. |
So one could theoretically start two watches, one that only picks up content modified events and another that only picks up metadata changes. It will be a bit more resource heavy but we would maintain a uniform API across backends. |
That's what I wasn't sure about - I assume it's a bit less efficient to do it that way. Although the differentiation I'm looking in my application (basically a |
Providing an exclude option across all backends should be easy, you can either filter events as they come in or directly subscribe to only some types of events on pretty much all backends. However, it would go somewhat against the current API design where all events are always emitted and the event handler then decides how to process different types of events. When I say "metadata" I mean anything that does not affect the location or the binary content of the file. For example, looking at the list of provided notifications in Windows, that would be |
Is that actually the possible events in Windows? It seems to just be Modified, regardless of what changed. (looking at My current (working) monkey-patch for this is:
which gets me to where I want to be, although obviously not very elegantly. |
Indeed, the emitted event type would always be "modified". I was rather proposing a change to the Windows backend to start two watches:
That way, we can ensure that any "modified" events from the first watch are content changes and all "modified" events from the second watch are attribute / metadata changes. |
Sorry - misunderstood which layer you were talking about. Yes, that would work. Although even then I think the option to filter LAST_ACCESS (or others) from the second is useful, in lieu of a way to know what event it was. My use case is a file server - read accesses are completely normal and uninteresting, and not being able to tell them apart from permissions changes is unhelpful. |
That comes back to API design then. We could either:
The watchdog API currently tries to implement the second approach and this issue makes a strong case that the provided granularity is insufficient for some use cases. There can of course still be a discussion about API design, for example providing arguments to the Observer to suppress certain types of events before they are passed on to the Handler. This could marginally improve performance in some cases or reduce the number of required watches in Windows. However, IMO, we should still provide different subclasses of |
Is there a way in Watchdog 2.1.2 to identify xattr only modify events and filter them out ? Currently I am stuck with v0.10.3 which doesn't generate modify event for metadata changes |
for the wandering googler, if your platform has stat, then its trivial to get around this issue by checking if the mtime has changed. such a solution would persist even after #800 has merged. |
Maybe I am doing something wrong... but I have an use case, in which I only care about file content ... I don't want to trigger the handler when file attributes or file security changes.
I know that in other cases, those informations are very important, but since both Inotify and readdirectorychanges provides the options to filter events from the source, maybe it would make sense to add some configuration in watchdog.
In the same time, directory poller, only raise changes for file content and not metadata changes.
Maybe this API would help and make sense
when calling observer.subscribeMetadata() on directory poller, we can either raise an error that this is not supported, or create a special directory snapshot which polls for security data.
What do you think?
The text was updated successfully, but these errors were encountered: