Mend SCA Parser update #11395

Open · wants to merge 48 commits into base: dev

Conversation

testaccount90009
Contributor

@testaccount90009 testaccount90009 commented Dec 9, 2024

EDIT - I will publish a separate PR that breaks this into multiple parser files for a single parser, similar to the SonarQube structure. Doing it all in one file is not neat and is driving me crazy :). That PR will cover code findings in API 3.0 / Platform for Mend SAST.

Minor Platform SCA and Legacy SCA edits.

Part one of this PR updates 'Locations' and 'File Paths' to provide a cleaner way to extract the path for legacy Mend SCA format vulnerabilities, as mentioned here - #11340

Part two of this PR removes Locations from the Description in the Mend Platform SCA support I added most recently in 2.40.0.

EDIT - See the edit statement above: this 'Part 3' for SAST will be contributed after I refactor the parser into separate files for each format, while still retaining a single parser option. Doing this all in one file is not ideal.
Part three of this PR is to add Mend Platform SAST / 3.0 (Code Security Findings / Vulnerabilities) formatted Findings.

The idea is to create one parser that can accomplish:
-Original Mend Parser SCA Findings preservation. I do not want to update or edit this much here, since people may already be using it in the wild. Previously I added an edit to the Locations / File Paths, but due to the issue linked above, it may negatively impact teams who have a large list of file paths / locations with their Findings from the Mend Legacy SCA (non-platform) output / legacy 1.0 API formatted output.

-New Platform SCA parser. This was implemented in my most recent PR affecting the Mend parser.py; however, an oversight of adding the file path to the Description has me wanting to refactor the Location out of the Description, so as not to affect deduplication, since Description and Title are used for it.

EDIT - See above Edits, this isn't in this PR at this moment.
-New Platform SAST parser. This will be the 'meat' and main portion of this PR. Code SAST Findings in Mend should be able to be identified and structured as a result of changes in this PR.

Checklist

This checklist is for your information.

  • Make sure to rebase your PR against the very latest dev.
  • Features/Changes should be submitted against the dev.
  • Bugfixes should be submitted against the bugfix branch.
  • Give a meaningful name to your PR, as it may end up being used in the release notes.
  • Your code is flake8 compliant.
  • Your code is python 3.11 compliant.
  • If this is a new feature and not a bug fix, you've included the proper documentation in the docs at https://github.com/DefectDojo/django-DefectDojo/tree/dev/docs as part of this PR.
  • Model changes must include the necessary migrations in the dojo/db_migrations folder.
  • Add applicable tests to the unit tests.
  • Add the proper label to categorize your PR.

Extra information

Please clear everything below when submitting your pull request, it's here purely for your information.

Moderators: Labels currently accepted for PRs:

  • Import Scans (for new scanners/importers)
  • enhancement
  • performance
  • feature
  • bugfix
  • maintenance (a.k.a chores)
  • dependencies
  • New Migration (when the PR introduces a DB migration)
  • settings_changes (when the PR introduces changes or new settings in settings.dist.py)

Contributors: Git Tips

Rebase on dev branch

If the dev branch has changed since you started working on it, please rebase your work after the current dev.

On your working branch mybranch:

git rebase dev mybranch

In case of conflict:

 git mergetool
 git rebase --continue

When everything's fine on your local branch, force push to your myOrigin remote:

git push myOrigin --force-with-lease

To cancel everything:

git rebase --abort

Squashing commits

git rebase -i origin/dev
  • Replace pick by fixup on the commits you want squashed out
  • Replace pick by reword on the first commit if you want to change the commit message
  • Save the file and quit your editor

Force push to your myOrigin remote:

git push myOrigin --force-with-lease

Code findings in API 3.0 / Platform for Mend, SAST respectively.

dryrunsecurity bot commented Dec 9, 2024

DryRun Security Summary

The pull request enhances the MendParser class by improving vulnerability information extraction, handling report length limits, and filtering inactive findings to provide more detailed and relevant security scanning results.


Summary:

The code changes in this pull request focus on improving the functionality and security-related aspects of the MendParser class, which is responsible for parsing the output of the Mend security scanning tool. The changes include:

  1. Enhancing the vulnerability information provided in the parser's output, such as including the vulnerability ID, affected library, and more detailed steps to reproduce.
  2. Improving the handling of the "locations" field in the Mend scan report to ensure that the output does not exceed the maximum length allowed for the steps_to_reproduce field in the Finding model.
  3. Filtering out inactive findings from the list of reported vulnerabilities, ensuring that only relevant and active issues are presented.

These changes are positive from an application security perspective, as they aim to provide security teams with more detailed and better-formatted vulnerability information, which can help in the effective prioritization and remediation of identified security issues.

Files Changed:

  1. unittests/tools/test_mend_parser.py:

    • The unit tests for the MendParser class have been updated to reflect the changes in the parser's output, including more detailed vulnerability information in the test assertions.
    • The tests now use different input files to validate the parser's handling of multiple vulnerabilities.
  2. dojo/tools/mend/parser.py:

    • The _build_common_output function has been updated to handle the "locations" field in the Mend scan report more effectively.
    • The code now truncates the "locations" field if it exceeds the maximum allowed length for the steps_to_reproduce field in the Finding model.
    • The function also filters out inactive findings from the list of reported vulnerabilities.

These changes demonstrate a focus on improving the security-related functionality and output of the MendParser class, which is an important component in the overall vulnerability management process.

Code Analysis

We ran 9 analyzers against 3 files and 0 analyzers had findings. 9 analyzers had no findings.


Remove locations / path from Description of SCA Platform output and instead implement locations in steps_to_reproduce.
Updating value to a placeholder severityRating right now of 2.143.  Still working on this and the cvssv3 assertion values.
@testaccount90009 testaccount90009 changed the title from "Mend Platform SAST + SCA Parser update" to "Mend SCA Parser update" Dec 9, 2024
@testaccount90009
Contributor Author

testaccount90009 commented Dec 9, 2024

To be clear, this is only for SCA: fixing Locations and removing the library path from the Description so it is not used in deduplication.

The mentioned SAST changes will come in a future PR, after I refactor and clean this up so I can break the single parser out into multiple files instead of attempting to code everything in one file.

I will refactor following this structure -
https://github.com/DefectDojo/django-DefectDojo/tree/dev/dojo/tools/sonarqube

instead of one single parser.py file attempting to handle each output case.

@testaccount90009
Contributor Author

I am testing these changes in my dev stack right now.

@testaccount90009
Contributor Author

After testing import on Legacy SCA and Platform SCA:

Legacy - These changes fix the varchar 4000 issue for the file paths field for legacy SCA vulns by adding the paths into steps_to_reproduce instead. The file_path field is no longer used for Legacy SCA, so it can no longer error out on the varchar 4000 limit for that field in the DB/model.

Platform - Added logic for steps_to_reproduce to retrieve component.path, so Transitive dependencies now include the component -> path as the steps_to_reproduce. If the library and vulnerability are Direct, the path is not present in the Mend SCA Platform output JSON for some reason; it only appears for Transitive ones. Maybe Mend will update this at some point, but it's still good to gather the component path the vulnerability lives in, and whenever they get around to adding the component path for Direct dependencies, it will be available in the Finding in DefectDojo with this parser change.
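
A minimal sketch of the Platform path handling described above (the example node value and variable names are illustrative; only the component -> path lookup and the "Locations Found" join mirror the actual parser change):

# "node" stands in for one finding object from the Mend Platform SCA JSON, e.g.:
node = {"component": {"path": "package.json", "dependencyType": "Transitive"}}

component = node.get("component", {})
path = component.get("path")  # currently only present for Transitive dependencies
locations = [path] if path else []
steps_to_reproduce = "**Locations Found**: " + ", ".join(locations) if locations else None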

I am pretty sure this is the final Legacy / Platform SCA parser change I set out to accomplish before my next PR, which will break this out into a structure similar to the sonarqube tool parser. One parser file handling multiple schemas is not the cleanest approach, and after talking with my team, we agreed that before I introduce the Platform SAST parser capabilities, it would be best to restructure the Mend parser files to be similar to SonarQube's, making the logic easier to follow once broken out.

@testaccount90009
Contributor Author

It's worth noting that in the SCA Platform uploads, some of the JSON files are missing component -> path, so the location cannot be pulled into the steps_to_reproduce field. This is a bug with Mend that I am raising with their support team, as their Platform UI does not show the location either, whereas their Legacy one does. That means the data exists; there is just some kind of bug on their backend preventing that location data from reaching the Platform UI and JSON output. Assuming Mend fixes Direct dependencies to have a location (it currently only works for Transitive ones, pulling in the Direct dependency via dependencyFile), steps_to_reproduce will eventually start picking up the locations of Direct dependencies once Mend identifies a fix.

In a nutshell: if Mend scanned something and identified a library, it has to know where that library is located (whether via a direct match or a dependency file inclusion). So by design that location should be included, and since it's in the Legacy UI and JSON, it's clearly possible -- something in their Platform is broken.
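
Purely as an illustration of that point (the dependencyFile fallback key below is hypothetical, not part of this PR): defensive .get() calls let the parser keep working whether or not Mend ever populates a location for Direct dependencies.

node = {"component": {"dependencyType": "Direct"}}  # example Direct finding with no path today

component = node.get("component", {})
# "path" only exists for Transitive findings today; "dependencyFile" is a purely
# hypothetical fallback shown to illustrate graceful degradation, not a confirmed field
location = component.get("path") or component.get("dependencyFile")
steps_to_reproduce = "**Locations Found**: " + location if location else None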

Contributor

@mtesauro mtesauro left a comment


Approved

@testaccount90009
Contributor Author

I have just encountered a bug in this parser for some edge cases and am looking into it now. The error is an Internal Server Error (500); when I check the logs, it's due to impact (Direct / Transitive) sometimes being empty because component -> dependencyType is missing. I am not sure why this data is missing from Mend, but I have a meeting with them this Friday, so I will bring up this bug along with the other one I mentioned about the missing Locations.

I have a fix being tested for this right now. I think it's just adding a , "" default on line 66 after node["component"].get("dependencyType"), since that field is not always populated for each Finding, due to what I just described above.

I'll test on my stack and my imports first, then I'll push the change here.
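
For reference, a minimal illustration of the fix described above (the example node is made up; the actual change lives in dojo/tools/mend/parser.py):

node = {"component": {"name": "example-lib"}}  # dependencyType intentionally missing here
# .get("dependencyType") alone returns None when Mend omits the field, which broke the
# downstream handling of impact; the "" default keeps impact a plain string instead
impact = node["component"].get("dependencyType", "")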

@testaccount90009
Contributor Author

I can confirm the 500 internal server error no longer occurs with these recent changes.

Comment on lines 176 to 177
if locations and len(", ".join(locations)) > 3999:
    locations = [loc[:3999] for loc in locations]
Contributor


Since you're now populating the impact field instead of the file_path field with its max character limit, I wouldn't truncate these paths. This also truncates each individual path such that 2 paths, one with 4000 characters and the other with 10, would still exceed the 4000 character limit when joined together into a single string.

And more of a side comment for future reference: when we truncate anything, we want to add ellipses (...) to indicate to the user that not all data from the input file is present.
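
One way to apply that suggestion (illustrative only; 3999 mirrors the limit used in the snippet above):

locations = ["path/to/pom.xml"] * 500  # example list long enough to exceed the limit
joined = ", ".join(locations)
if len(joined) > 3999:
    # truncate the combined string once, and mark the cut with an ellipsis
    joined = joined[:3996] + "..."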

Contributor Author


@cneill I believe impact only holds the Direct vs Transitive information; I think you are referring to my switch to steps_to_reproduce for joining the locations.

You are right though, I should clean this code up and make it better.

@@ -188,7 +192,8 @@ def _build_common_output(node, lib_name=None):
 dynamic_finding=True,
 cvssv3=cvss3_vector,
 cvssv3_score=float(cvss3_score) if cvss3_score is not None else None,
-impact=impact,
+impact=impact if impact is not None else None,
+steps_to_reproduce="**Locations Found**: " + ", ".join(locations) if locations is not None else None,
Contributor


I would suggest separating these paths with newlines (\n). It can be harder to tell where one path ends and the next begins when they're all strung together with commas, especially if there are spaces/commas in those paths.

Contributor Author


Newline-separated paths break table formatting, so I avoid those where possible. If the goal is to send these to Jira, a newline separator will break any grouping in a table. I encountered this with the ASFF parser modifications I made, so I'd like to continue using a comma instead of a newline.

adding cneill suggestion for truncating locations
@testaccount90009
Contributor Author

Just to add - there is a bug with Mend Platform API vs UI showing discrepancies and inconsistencies in the data. I have raised a support case and it is being worked on by Mend engineering teams.

For example, projects are showing different Findings data between the UI and API. API has "more" data and Findings compared to the UI which seems to be missing some (unless the API is just wrong and getting data from somewhere else on their backend).

I will follow up either here or in Slack when my support case for data inconsistencies is updated.

Contributor

@Maffooch Maffooch left a comment


Holding until there is a resolution from the Mend team, so this does not accidentally get merged

@testaccount90009
Contributor Author

@Maffooch I have met with our rep who has escalated the case to engineering internally within the Mend team. I provided a full recorded screenshare and 2 sample projects showcasing differences between their Mend Platform UI and API and they are aware + working on a fix. Given the holiday season, I expect a bit of a delay unfortunately.

I'll comment back with their resolution, hopefully it will get fixed quickly. I will validate our own projects and the integrity of data between UI and API, then verify here.

@Maffooch
Contributor

@testaccount90009 Thank you for your work on this. 😄

@testaccount90009
Contributor Author

TL;DR - They're aware of the inconsistent data that caused this issue. I added a check to the parser code to only grab findings with findingInfo -> status == ACTIVE. The multiple-findings.json parser test file contains totalItems = 11, but only 5 of those are ACTIVE; I've updated the unit test to reflect that as well.
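
A minimal sketch of that status check (the top-level "findings" key is an assumption about the JSON shape; only the findingInfo -> status == ACTIVE comparison mirrors the parser change):

import json

with open("multiple-findings.json", encoding="utf-8") as f:  # the unit-test sample mentioned above
    data = json.load(f)

findings = data.get("findings", [])  # assumed key for the list of finding objects
# keep only findings Mend still reports as active; remediated entries such as
# LIBRARY_REMOVED are skipped
active_findings = [item for item in findings if item.get("findingInfo", {}).get("status") == "ACTIVE"]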

I will run a test for this fix locally on my dev stack on Monday and confirm if this fixed the issue or not.

Per Mend: Legacy UI and API show matched and consistent data with no concept of a finding being remediated. Mend does an override for a scan inventory (though you can specify append) which removes all of the libraries from the inventory and the vulnerabilities related to those libraries.

However, in the Mend Platform UI this is also true, but NOT for the API. The API can return remediated 'LIBRARY_REMOVED' (findingInfo -> status) Findings and vulnerabilities. Why their UI doesn't show this, no clue; why their API docs don't document this, also no clue. Their API and UI are still inconsistent (the UI should show remediated Findings if they're being kept around for history), so I am going to ask Mend to clarify their 3.0 API documentation, as well as update their UI to show "LIBRARY_REMOVED" Findings.

Quite frankly, I'd prefer if their 3.0 Platform API just removed the findings from the json 'getProjectSecurityDependencyFindings' call, but I guess they feel the need to keep those around in the API (but not present them in the UI)... A head scratcher for sure!

@testaccount90009
Contributor Author

I will test this in my dev stack on Monday, but I believe this iteration of the parser can parse ACTIVE-only Findings. I also updated the unit tests to make sure the title is coming across correctly.
