-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(file source): Allow file pattern with curly braces in include directive #5927
Conversation
4a95559
to
178f889
Compare
Things I noticed while implementing / follow-up issues:
|
Nice work @pablosichert! Would you mind opening issues for both of those points? That helps us decide if we should defer or schedule them for work. |
…ective Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
178f889
to
497c976
Compare
Thanks! Done. |
Signed-off-by: Pablo Sichert <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a pretty good and straightforward move to make this happen. I do have a few concerns I left inline below.
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
04e7ae9
to
3954aed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are just the few enhancements that can be done on errors, otherwise this looks good.
Failure of |
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
@ktff good catch. I found out that |
That exact bug/behavior is mentioned in Gilnaa/globwalk#28. Looks like we don't have an easy path out here. |
That is quite problematic since it will break configurations that rely on it, so this can't be merged until we figure something out or rethink our approach here. I think helping to implement wanted functionalities in either |
Signed-off-by: Pablo Sichert <[email protected]>
@@ -27,7 +27,9 @@ pub mod glob; | |||
pub trait PathsProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the design choices for the path provider was that it's never supposed to push it's error to the caller while enumerating the paths.
My thought process was this.
If the paths provider errors out - how would the caller - file server handle it? It is a meaningful error condition to recognize and act upon there? I deemed not, and decided that there's no good purpose to allow errors to be pushed to the file server. If the error is temporary - the valid behaviour would be to just log and return an empty iter. If the error is critial, and we should terminate the file server - then we should pass the error, however that situation just couldn't occur at the time.
There is another way I can think of that file server can react to an error - and that is, upon receiving a certain kind of error - just skip the current update, and keep using the paths obtained from the previous iterations. It may be useful for some kinds of use cases, like when the paths provider suddenly became unavailable, but we would rather use stale data and hope for the path provider to become available again at next call than terminate the file server. That,s however, also wasn't useful at the time.
The situation may have changed, and the change might be justified, but I'd like to validate that this way of handling error is what we're looking for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That for the rationale. Also don't have a clear cut answer to this, but my thoughts were the following:
It is a meaningful error condition to recognize and act upon there?
This information can be encoded in/inferred from the associated Error
type. In the case of the Glob
PathProvider
and GlobError
, the caller can decide that e.g. InvalidIncludePattern
is unrecoverable because it is a programmer error, whereas a WalkError
can be temporary (due to the directory structure it is executed in), and ignored.
I would even argue that the caller has more information available if the error is recoverable or not, rather than deciding it in the paths
implementation. With this change, the caller can still decide to ignore all errors, but the inverse is not true. If the paths
implementation decides to unwrap and panic, the caller has little chance to override this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information can be encoded in/inferred from the associated
Error
type.
Correct. The caller can't be coupled with an exact error type to carry this information though - otherwise it won't be generic. To establish a generic protocol we either have to add a trait to the error type, or use a trait-owned wrapper type to encode this data - i.e. Result<..., PathsError<Self::Error>>
, where enum PathsError<T> { UsePreviousPaths(T), Fatal(T) }
.
Yet another question is how much sense does this protocol extension make. In other words - do we need the caller need to be able to decide to ignore the errors, etc? I mean, currently, we're just doing ?
there. Do we need to further extend this protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're right. We don't specialize the PathProvider
yet in the FileServer::run
function.
i.e. Result<..., PathsErrorSelf::Error>, where enum PathsError { UsePreviousPaths(T), Fatal(T) }.
It makes the trait a bit more verbose, but it does exactly solve the problem we are discussing. I'm leaning for it.
let glob = glob(include_pattern).with_context(|| InvalidIncludePattern { | ||
glob: include_pattern.to_owned(), | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In relation to https://github.com/timberio/vector/pull/5927/files#r559039229, and since we're validating that the include patterns are valid at construction, it might be meaningful to assume they're valid here and elide the possibility of an error via an unwrap
.
If I understand correctly, once we've validated the patterns above, it is not possible that they will actually return errors here?
We might want to introduce a dedicated type wrapper that would contain the new(patterns) -> Result<Self, ...>
and build(&self) -> GlobWalker
to better capture this invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, either unwrapping or parsing into a more approriate type would be the ideal behavior here.
Let's see what we can do here after the changes to glob
or globwalk
.
Signed-off-by: Pablo Sichert <[email protected]>
Signed-off-by: Pablo Sichert <[email protected]>
match result { | ||
Ok(()) => {} | ||
Err(error) => { | ||
error!(message = "Output channel closed.", error = ?error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log message should probably be preserved.
It seems like this is still blocked by lack of support upstream in |
Resolves #5003.
This PR makes it possible to use glob patterns with curly braces in the
file
source, e.g.{foo,bar}.txt
.