Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] Guideline statistics #4365

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cservakt
Copy link
Collaborator

The new Guideline statistics tab on the Statistics page can list all rules for the selected guidelines. The user can select multiple guidelines (but currently the only one is sei-cert and it is the default).

The table can show the checker statistics that are related to the specified guideline rule. Rules may connect to more than one checker or may not have any checker.

The checker statistics are calculated for runs that are selected (or for all runs if no run selected) in the report filter. It can show guideline name, guideline rule, checker name, checker severity, checker status, number of closed and outstanding reports.

The status informs the user about how many runs the given checker was enabled or disabled. Closed and outstanding report counts depend on review and detection status.

New config dir was created to store guideline files. Each yaml file represents a guideline an contains its rules. The Guidelines class can parse the yamls. We can reach the guideline data via getGuidelineRules API endpoint that can return a list of Rules.

@cservakt cservakt added API change 📄 Content of patch changes API! GUI 🎨 config ⚙️ new feature 👍 New feature request label-tool 🔖 Related to tooling that manages the analyzer/checker label configuration labels Oct 14, 2024
@cservakt cservakt added this to the release 6.25.0 milestone Oct 14, 2024
@cservakt cservakt requested a review from dkrupp October 14, 2024 18:37
@cservakt cservakt force-pushed the guideline-statistics branch 2 times, most recently from a5d4382 to 7fb79d6 Compare October 15, 2024 12:36
The new `Guideline statistics` tab on the Statistics page can list all rules for the selected guidelines. The user can select multiple guidelines (but currently the only one is `sei-cert` and it is the default).

The table can show the checker statistics that are related to the specified guideline rule. Rules may connect to more than one checker or may not have any checker.

The checker statistics are calculated for runs that are selected (or for all runs if no run selected) in the report filter. It can show guideline name, guideline rule, checker name, checker severity, checker status, number of closed and outstanding reports.

The status informs the user about how many runs the given checker was enabled or disabled. Closed and outstanding report counts depend on review and detection status.

New config dir was created to store guideline files. Each yaml file represents a guideline an contains its rules. The `Guidelines` class can parse the yamls. We can reach the guideline data via `getGuidelineRules` API endpoint that can return a list of `Rules`.
Copy link
Member

@dkrupp dkrupp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I have some minor comments. otherwise looks good.

  1. Remove the sei-cert rule titles as it may infringe copyright config/guidelines/sei-cert.yaml
  2. Separate the guidelines to C and C++ (also 2 files sei-cert-c.yaml, sei-cert-cpp.yaml)
  3. Call the guideline in the dropdown as SEI CERT C Coding Standard
  4. Add a new field in the sei-cert.yaml to describe the above human readable title
  5. include a short doc page with screenshot here https://codechecker-demo.eastus.cloudapp.azure.com/userguide#statistics-pages
  6. I don't see the tests. shouldn't there be tests for parsing the yaml file and new getGuidelineRules() API function?

align: "center"
},
{
text: "Checker Status",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming the column to "Checker Enabled/Disabled"

Copy link
Collaborator

@vodorok vodorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, please see my remarks, and questions.

1: string ruleId, // The identifier of the rule.
2: string title, // The rule summary.
3: string url, // The link of the rule page.
4: list<map<string, string>> checkers // List of checker names
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a list of a map of strings, what are the maps for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the map represents one checker and contains the checker name and its severity. It is not the same as the struct Checker but I did not want to create another one checker struct.

@@ -52,13 +53,17 @@ def __init__(self):
if 'CC_TEST_LABELS_DIR' in os.environ:
labels_dir = os.environ['CC_TEST_LABELS_DIR']

guidelines_dir = os.path.join(self._data_files_dir_path,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, I would rather use Path from the pathlib module.

f'{guidelines_dir} is not a directory.')

guideline_yaml_files = map(
lambda f: os.path.join(guidelines_dir, f),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment from above applies to this os.path usage.

Return the list of rules of a guideline.
"""

guideline_rules = self.__all_rules.get(guideline_name, [])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are using the Defaultdict, then you don't need to provide an empty list for the get() function. You could also use self.__all_rules[guideline_name] which is more readable IMO.

import yaml


class Guidelines:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a Guidlines class would benefit from an interface that would return all the available guideline names.

def get_guidelines(self):
  return self.__all_rules.keys()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a function that already provides all guidelines and their rules.

 def all_guideline_rules(self) -> DefaultDict[str, List[Dict[str, str]]]:
        return self.__all_rules

Do you think it is enough or it would be better to have a dedicated guideline name function?

@@ -70,7 +71,11 @@ def __init__(self):
if 'CC_TEST_LABELS_DIR' in os.environ:
labels_dir = os.environ['CC_TEST_LABELS_DIR']

guidelines_dir = os.path.join(self._data_files_dir_path,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about Path usage applies here. It is the recommended way to handle paths in python3.

"checkerName": checker_name,
"severity": self._context.checker_labels.severity(
checker_name).lower()
} for checker_name in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is really hard to understand. Can't you precompute a dict of checker_name -> checker labels instead of iterating it every time in the third nested cycle?

@@ -89,7 +89,7 @@ export default {
},
props: {
tag: { type: String, default: "span" },
numGood: { type: Number, required: true },
numGood: { type: Number, default: 0 },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

required: true is not needed anymore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is not needed and there are some cases when we need to show just the numBad count chip.

@@ -412,6 +413,15 @@ export default {
}),
},

watch: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What issue does this auto refresh solve?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is useful when the url changes and we want to update the report filter as well. For example, a user restricts runs that are not natively analyzed by CodeChecker 6.24 or above.

guideline_name = guideline_data.get("guideline")
rules = guideline_data.get("rules")

all_rules[guideline_name].extend(rules)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some format checking of the YAML file could be useful. The format checker function of checker label file may give some hint:

def __check_json_format(self, data: dict):

{
"guideline1": [
{
"rule_id": ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a dictionary where rule_id is a key:

"guideline1": {
  "rule_id1":  {
    "rule_url": ...
    "title": ...
  },
  "rule_id2": { ... }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change 📄 Content of patch changes API! config ⚙️ GUI 🎨 label-tool 🔖 Related to tooling that manages the analyzer/checker label configuration new feature 👍 New feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants