-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement basic linting #282
base: main
Are you sure you want to change the base?
Conversation
This implements a basic linter into the compiler that can check for required metadata fields and types. I chose to implement it into the compiler so that we can get these checks for free across any tool/binding that uses the compiler directly. Also, it was easier to get nicer reporting by building it into the compiler. This is an optional feature that you can turn on in the compiler using the "required_metadata" method on the compiler. If you do this the compiler will issue warnings if any metadata is missing or if the type is incorrect. I've provided all the basic types (string, int, bool, float) and also some convenience types like md5, sha1, sha256 and hash to make it a bit more flexible for users. Extending these with more convenience types is an easy exercise too.
Prior to this change if you had an invalid config file (ie: using the wrong type for an option) we would just silently replace it with the default config, which made debugging way more difficult than it should have been. We now pass errors back up to the caller if the error is a because the config file can not be parsed. Errors now look like this: ``` Error: invalid type: found signed int `0`, expected a boolean for key "default.fmt.patterns.align_values" in .yara-x.toml TOML file ``` Also, because we now display errors like the above it doesn't make sense to have each command have a -C option. If the config file in ${HOME}/.yara-x.toml is improperly formatted we would try to parse it and display an error even before we called the subcommand. This meant if you had an invalid config file in the default location and used "yr fmt -C /tmp/foo ..." you would see the error from your default config file despite the -C argument. To deal with this the -C argument was removed from each command and moved into the main one. It now loads the default config if, and only if, you do not specify one with the -C argument.
Also a bunch of other smaller changes that got lumped in here.
Move file io out of config helper function and generate better error messages when the file can't be parsed.
rule_name_regexp now returns a Result. This makes error handling in the case of an invalid (or too large) regex easier.
Now that I have a working python environment I can finally fix up these tests.
…doesn't exist In that case the program was failing instead of using the default configuration. Now this case works correctly, while at the same time it raises an error if `--config` is specified with a non-existing path.
Instead of adding multiple new methods to the `Compiler` object for each type of check, add a single `add_linter` method that takes a type that implements the `Linter` trait. Each check is implemented as a new type that implements this type and put under `yara_x::linters`. With this change the compiler's API is not bloated with multiple methods and we can introduce more linters in the future with more complicated APIs.
I wasn't convinced about the proposed API for a number of reasons:
I've refactored the API to make it more flexible, adding a single method to the compiler ( |
Apologies as this PR is a little bigger than I wanted, partially due to fixing a bug in config file handling and partially because it implements two linter checks (required metadata and rule name regexp).
Config file handling improvements
Prior to this change if you had an invalid config file in your home dir we would just silently replace it with the default config, which is unexpected from a user standpoint (invalid configs should produce an error and refuse to run). To deal with this we now pass errors back up in the main command instead of using
.unwrap_or_default()
.Also, rather than have each sub-command implement config file handling I moved it into main, to remove duplicate code. Also, because we are now propagating errors if you have an invalid config in
${HOME}/.yara-x.toml
I improved error messages so it is clear what file failed to parse:Linting checks
To implement linting checks for required metadata and rule name format I chose to implement them in the compiler directly. I initially had an implementation with a stand-alone Linter type that worked on the AST from the parser. While this worked it turned out that getting it to use the existing Report/ReportBuilder mechanism was exceedingly intrusive as reporting is coupled too closely to the compiler (or at least I couldn't figure out how to make it work easily). Because of this I just went with implementing it directly in the compiler, which I think is also a good idea because we now get warnings anywhere the compiler is used, which can be quite beneficial for enforcing standards.
To use these linting checks I added a section in the config file and exposed the options for it there. See the config guide changes for details, but the TLDR is you can now specify a table (TOML term for a dictionary) where the key is the metadata identifier that is required and the value is the type that is required. Any rule that does not meet these requirements will get warnings from the compiler.
The other check is a regex that must match the rule name. Again, this is specified in the config file and warnings are issued by the compiler for any rule that does not match this regex.
The default value for both of these options is off so by default the compiler does not enforce any particular standards but you can opt in to it for linting if you want. The only place that currently sets these options is the
yr check
command.If you're happy with these changes I'd like to also expose them in the various bindings so users can get lint warnings in the language of their choice.