-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] Guideline statistics #4365
base: master
Are you sure you want to change the base?
Conversation
a5d4382
to
7fb79d6
Compare
The new `Guideline statistics` tab on the Statistics page can list all rules for the selected guidelines. The user can select multiple guidelines (but currently the only one is `sei-cert` and it is the default). The table can show the checker statistics that are related to the specified guideline rule. Rules may connect to more than one checker or may not have any checker. The checker statistics are calculated for runs that are selected (or for all runs if no run selected) in the report filter. It can show guideline name, guideline rule, checker name, checker severity, checker status, number of closed and outstanding reports. The status informs the user about how many runs the given checker was enabled or disabled. Closed and outstanding report counts depend on review and detection status. New config dir was created to store guideline files. Each yaml file represents a guideline an contains its rules. The `Guidelines` class can parse the yamls. We can reach the guideline data via `getGuidelineRules` API endpoint that can return a list of `Rules`.
7fb79d6
to
5a02fd4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. I have some minor comments. otherwise looks good.
- Remove the sei-cert rule titles as it may infringe copyright config/guidelines/sei-cert.yaml
- Separate the guidelines to C and C++ (also 2 files sei-cert-c.yaml, sei-cert-cpp.yaml)
- Call the guideline in the dropdown as SEI CERT C Coding Standard
- Add a new field in the sei-cert.yaml to describe the above human readable title
- include a short doc page with screenshot here https://codechecker-demo.eastus.cloudapp.azure.com/userguide#statistics-pages
- I don't see the tests. shouldn't there be tests for parsing the yaml file and new getGuidelineRules() API function?
align: "center" | ||
}, | ||
{ | ||
text: "Checker Status", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming the column to "Checker Enabled/Disabled"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, please see my remarks, and questions.
1: string ruleId, // The identifier of the rule. | ||
2: string title, // The rule summary. | ||
3: string url, // The link of the rule page. | ||
4: list<map<string, string>> checkers // List of checker names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a list of a map of strings, what are the maps for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case the map represents one checker and contains the checker name and its severity. It is not the same as the struct Checker
but I did not want to create another one checker struct.
@@ -52,13 +53,17 @@ def __init__(self): | |||
if 'CC_TEST_LABELS_DIR' in os.environ: | |||
labels_dir = os.environ['CC_TEST_LABELS_DIR'] | |||
|
|||
guidelines_dir = os.path.join(self._data_files_dir_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, I would rather use Path from the pathlib module.
f'{guidelines_dir} is not a directory.') | ||
|
||
guideline_yaml_files = map( | ||
lambda f: os.path.join(guidelines_dir, f), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment from above applies to this os.path usage.
Return the list of rules of a guideline. | ||
""" | ||
|
||
guideline_rules = self.__all_rules.get(guideline_name, []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using the Defaultdict, then you don't need to provide an empty list for the get()
function. You could also use self.__all_rules[guideline_name]
which is more readable IMO.
import yaml | ||
|
||
|
||
class Guidelines: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a Guidlines
class would benefit from an interface that would return all the available guideline names.
def get_guidelines(self):
return self.__all_rules.keys()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a function that already provides all guidelines and their rules.
def all_guideline_rules(self) -> DefaultDict[str, List[Dict[str, str]]]:
return self.__all_rules
Do you think it is enough or it would be better to have a dedicated guideline name function?
@@ -70,7 +71,11 @@ def __init__(self): | |||
if 'CC_TEST_LABELS_DIR' in os.environ: | |||
labels_dir = os.environ['CC_TEST_LABELS_DIR'] | |||
|
|||
guidelines_dir = os.path.join(self._data_files_dir_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about Path
usage applies here. It is the recommended way to handle paths in python3.
"checkerName": checker_name, | ||
"severity": self._context.checker_labels.severity( | ||
checker_name).lower() | ||
} for checker_name in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is really hard to understand. Can't you precompute a dict of checker_name -> checker labels instead of iterating it every time in the third nested cycle?
@@ -89,7 +89,7 @@ export default { | |||
}, | |||
props: { | |||
tag: { type: String, default: "span" }, | |||
numGood: { type: Number, required: true }, | |||
numGood: { type: Number, default: 0 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
required: true
is not needed anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is not needed and there are some cases when we need to show just the numBad count chip.
@@ -412,6 +413,15 @@ export default { | |||
}), | |||
}, | |||
|
|||
watch: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What issue does this auto refresh solve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is useful when the url changes and we want to update the report filter as well. For example, a user restricts runs that are not natively analyzed by CodeChecker 6.24 or above.
guideline_name = guideline_data.get("guideline") | ||
rules = guideline_data.get("rules") | ||
|
||
all_rules[guideline_name].extend(rules) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some format checking of the YAML file could be useful. The format checker function of checker label file may give some hint:
def __check_json_format(self, data: dict): |
{ | ||
"guideline1": [ | ||
{ | ||
"rule_id": ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a dictionary where rule_id
is a key:
"guideline1": {
"rule_id1": {
"rule_url": ...
"title": ...
},
"rule_id2": { ... }
}
The new
Guideline statistics
tab on the Statistics page can list all rules for the selected guidelines. The user can select multiple guidelines (but currently the only one issei-cert
and it is the default).The table can show the checker statistics that are related to the specified guideline rule. Rules may connect to more than one checker or may not have any checker.
The checker statistics are calculated for runs that are selected (or for all runs if no run selected) in the report filter. It can show guideline name, guideline rule, checker name, checker severity, checker status, number of closed and outstanding reports.
The status informs the user about how many runs the given checker was enabled or disabled. Closed and outstanding report counts depend on review and detection status.
New config dir was created to store guideline files. Each yaml file represents a guideline an contains its rules. The
Guidelines
class can parse the yamls. We can reach the guideline data viagetGuidelineRules
API endpoint that can return a list ofRules
.