Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug - Legacy Mend Parser - Locations - type character varying(4000) error #11340

Closed
1 of 3 tasks
testaccount90009 opened this issue Nov 27, 2024 · 7 comments
Closed
1 of 3 tasks
Labels

Comments

@testaccount90009
Copy link
Contributor

testaccount90009 commented Nov 27, 2024

Slack - https://owasp.slack.com/archives/C2P5BA8MN/p1732583494245009

It's possible when getting Mend Legacy SCA JSON Project vulnerabilities from 1.4 API that Locations will contain many values and error out the Locations 4000 char limit based on the current join. I will make a slight change to the Mend parser that will hopefully reduce the likelihood of this happening.

If you check Mend unit test and scan files, you can see the difference between legacy and platform json schemas. Legacy consists of all Vulnerabilities with a Library as an attribute. Platform consists of all Libraries with vulnerabilities as an attribute.

This means that for Platform the Locations will not be the same style as Legacy Locations.

To confirm, I do not encounter the 4000 char limit error for Locations on Platform schema style imports - only the Legacy style.

Steps to reproduce
Steps to reproduce the behavior:

  1. Download legacy* (non-platform) Mend SCA scan containing many Locations
  2. Attempt to import scan file
  3. Error type character varying(4000).

Expected behavior
Upload the report

Deployment method (select with an X)

  • Docker Compose
  • Kubernetes
  • GoDojo

Environment information

  • Latest DD version, Ubuntu

I will raise a PR to make the change that should solve this fix, but I welcome feedback from the community on the proposed change of 'path' to be replaced with 'dependencyFile' and this should impact only the Legacy parser.

However, instead of 'path' referencing the offending .jar file, it will instead reference the 'dependencyFile' which is typically build.gradle, as an example. When taking into consideration 'Vulnerabilities' in SCA and Direct vs Transitive inclusions -- this can be tricky. Technically the build.gradle is where the issue exists, but the vulnerability is in the offending .jar file.

@mtesauro
Copy link
Contributor

mtesauro commented Nov 28, 2024

@testaccount90009 Based on my reading of ☝️ especially the last paragraph, it seems like choosing to use dependencyFile instead of path is likely to produce less accurate data in DefectDojo:

Technically the build.gradle is where the issue exists, but the vulnerability is in the offending .jar file.

Having read the above and the Slack posts, I think it would be better to reduce the path to a size that fits the DB's 4000 varchar limit plus put the full details into something that can accept more data and doesn't tend to be used for deduplication.

For example, you could remove characters off the front of the path until the path is within the DB limit - or you can get fancy and take off a few more characters so you can add ... to the resulting string. Something like the pseudo code:

If the length of path is > 4000

  • Trim all but the last 3997 characters
  • append '...' to the resulting string from above
  • store this in location for the finding

Additionally add the unmodified value of path to Finding's steps_to_reproduce which is a text type in the DB so the 4k varchar isn't an issue. You can do something nice like the below in steps_to_reproduce:

The vulnerability was found at this location:
/some/really/long/path/that/is/crazy/long/and/more/than/4000/characters
The above was located in the following dependency file:
/path/to/build.gradle

This preserves data from the parsed file while:

  • not making the DB field (location in this case) bigger
  • not adding potentially a long string to something like description which is used somewhat frequently in dedup hashcodes

@testaccount90009
Copy link
Contributor Author

@mtesauro hope your holidays went well!

I think those are great suggestions and I will work to implement the Location in the steps_to_reproduce as mentioned. With the legacy SCA output it seems that a vuln can exist in multiple locations and adding to the 'Locations' does not cleanly map the comma separated values into something human readable in the UI.

I'll work on a PR with the changes - at the same time I add SAST for the Mend Platform, now that I'm back from vacation.

Thank you again for those suggestions!

@testaccount90009
Copy link
Contributor Author

@mtesauro I have a PR I am working on that will fix the Locations and pass to steps_to_reproduce instead.

This will likely be accompanied by Mend Platform SAST logic for the parser as well. I have a no finding, one finding, and many findings set of json files that have been scrubbed and I will provide those + unit tests after I figure out my GitHub errors for trying to fork / clone / check out the dev branch of DefectDojo.

In that PR I will link this Issue + the Slack discussion, and tag you if it's ok.

@testaccount90009
Copy link
Contributor Author

#11366 This issue / bug is what I described in my latest comment with encountering errors cloning the repo onto a Windows system.

@mtesauro
Copy link
Contributor

mtesauro commented Dec 4, 2024

@testaccount90009 Yeah, sorry that that impacted you - as I noted in that issue, we've got a PR we'll merge and get in the next release. Sorry to delay you on your PR work.

@testaccount90009
Copy link
Contributor Author

@mtesauro no problem sir, I appreciate you and team! I believe I can continue to address this now. You are rockstars, thank you! I should have a new PR submitted soon that I'll link here to fix the 4000 char limit and do something as you've mentioned.

@testaccount90009
Copy link
Contributor Author

#11395

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants