Scan Git Diff #1683

Open
TimPaulaskasDS opened this issue Nov 26, 2024 · 2 comments
Labels
USER STORY New feature or request

Comments

@TimPaulaskasDS

Please add a feature that scans the files changed in a Git commit and reports violations only on the added or modified lines rather than on the entire file. We have 3,300 existing code files, and we want to ensure that all new changes are compliant while giving our development team time to remediate the existing code. As the tool currently works, it scans the entire file regardless of which lines were changed, and if any issue in the file exceeds the severity threshold, it exits with an error code and fails the PR check.

To get the behavior we need today, we have to scan the entire file without a severity threshold, compare the output to the git diff, and check whether any reported line is also a modified line. If so, we throw an error.

ChatGPT suggests something like:

git diff <commit1> <commit2> --unified=0 > diff_output.txt

and

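import os
import re
import sys
from collections import defaultdict

def parse_diff(diff_file):
    """Map each changed file's base name to the set of line numbers added or modified in it."""
    modified_lines = defaultdict(set)
    current_file = None
    with open(diff_file, 'r') as file:
        for line in file:
            if line.startswith('+++ '):
                # "+++ b/path/to/File.cls" names the new version of the file;
                # "+++ /dev/null" means the file was deleted.
                path = line[4:].strip()
                current_file = None if path == '/dev/null' else os.path.basename(path)
            elif line.startswith('@@') and current_file:
                # Hunk header: @@ -old_start,old_count +new_start,new_count @@
                match = re.search(r'\+(\d+)(?:,(\d+))?', line)
                if match:
                    start_line = int(match.group(1))
                    # A missing count means 1; a count of 0 is a pure deletion.
                    count = int(match.group(2)) if match.group(2) else 1
                    modified_lines[current_file].update(range(start_line, start_line + count))
    return modified_lines

def match_analyzer_issues(analyzer_file, modified_lines):
    """Return analyzer output lines whose file and line number fall on a modified line."""
    issues = []
    with open(analyzer_file, 'r') as file:
        for line in file:
            # Assumes each violation is reported as "FileName.cls:line:column".
            match = re.search(r'(\w+\.cls):(\d+):(\d+)', line)
            if match:
                file_name = match.group(1)
                line_number = int(match.group(2))
                # Compare against the changed lines of the same file only.
                if line_number in modified_lines.get(file_name, ()):
                    issues.append(line.strip())
    return issues

# Parse diff and analyzer output
modified_lines = parse_diff('diff_output.txt')
issues = match_analyzer_issues('analyzer_output.txt', modified_lines)

# Output matching issues and fail the check if any were found
if issues:
    print("Matching issues:")
    for issue in issues:
        print(issue)
    sys.exit(1)
else:
    print("No matching issues found.")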
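
To illustrate how the hunk headers produced by `--unified=0` translate into per-file line numbers, here is a minimal, self-contained sketch that works on diff text held in a string. The names `changed_lines`, `HUNK_RE`, and `SAMPLE` are hypothetical and exist only for this example, and the sample diff is made up; none of this is part of the analyzer tool.

```python
import re

# Hunk header: @@ -old_start[,old_count] +new_start[,new_count] @@
HUNK_RE = re.compile(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@')

def changed_lines(diff_text):
    """Return {path: set of added/modified line numbers} for a unified diff."""
    result = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith('+++ '):
            path = line[4:].strip()
            if path == '/dev/null':
                # The file was deleted, so it has no new lines to check.
                current = None
            else:
                # Strip the "b/" prefix that git adds by default.
                current = path[2:] if path.startswith('b/') else path
                result.setdefault(current, set())
        elif current is not None:
            m = HUNK_RE.match(line)
            if m:
                start = int(m.group(1))
                # A missing count means 1; a count of 0 is a pure deletion.
                count = int(m.group(2)) if m.group(2) is not None else 1
                result[current].update(range(start, start + count))
    return result

# Made-up sample in the shape of `git diff --unified=0` output.
SAMPLE = """\
diff --git a/classes/Foo.cls b/classes/Foo.cls
--- a/classes/Foo.cls
+++ b/classes/Foo.cls
@@ -10 +10,2 @@
-old line
+new line 10
+new line 11
@@ -20,3 +20,0 @@
-removed
-removed
-removed
@@ -30,0 +29 @@
+inserted line
"""

if __name__ == "__main__":
    for path, lines in changed_lines(SAMPLE).items():
        print(path, sorted(lines))
```

Keying the result by file path keeps a violation on line 10 of one file from matching a change on line 10 of another. One known limitation of this sketch: an added line whose own content begins with `++ ` would appear in the diff as `+++ ...` and be misread as a file header.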

Also, if the above works, it would be helpful to provide a cookbook of DevOps implementations showing how the tool is used.

TimPaulaskasDS changed the title from "Include in PR" to "Scan Git Diff" on Nov 26, 2024
@jfeingold35
Collaborator

@TimPaulaskasDS, thanks for the feature request. We understand the desire to see only the violations associated with new changes to the code. The team will hold an internal discussion about this use case to identify possible solutions and may report back here afterward, so stay tuned. In the meantime, we appreciate your patience.

stephen-carter-at-sf added the USER STORY (New feature or request) label on Nov 26, 2024

git2gus bot commented Nov 26, 2024

This issue has been linked to a new work item: W-17319050
