Occurrence line numbers are off #1670
Interesting. Something is terribly wrong somewhere. There are no properties named "Namespaces", so I would expect no occurrence evidence for these.
A BOM I generated yesterday had similar oddities, although I am failing to reproduce it now. I used the same cdxgen command as above, but it was run on the repository in a state where the project was already built (i.e. What's interesting in this one is that it includes occurrences in
{
"location": "target/generated-sources/org/cyclonedx/proto/v1_6/Advisory.java#124"
}
The Again, I'm failing to reproduce this now, but perhaps sharing the BOM helps.
Ah, that makes sense. OK, so when I nuke the atom and slices files, the linked BOM is reproducible. Should I raise a separate issue for the erroneous occurrence assignments, e.g. on
No problem. Looking into this issue now. Thank you so much for checking!
@nscuro Thank you so much for flagging this issue.
This is correct behaviour! Those enums and internal types are used in annotations and logging functions in the code base. If there is a vulnerability in alpine-common, we need to track all the internal types that are passing untainted via those external libraries. While occurrence evidence is comprehensive (over-tainted), reachables slices are precise (see attached zip).
Noticed that you were running
git clone https://github.com/DependencyTrack/hyades-apiserver.git cdxgen-hyades-apiserver
cd cdxgen-hyades-apiserver
git reset --hard cf2744a829bf97d61fe42c80a019d52e5fb56098
mvn compile
# Run cdxgen in deep mode.
cdxgen -t java --deep -o bom.json $(pwd)
# Run atom in java mode
# During reachables slicing, bom.json would get used to compute the purls
# When we run reachables slicing first, app.atom would include data dependencies.
# Such a rich atom file is useful for both reachables and usages slicing.
<atom dir>/atom.sh -J-Xmx24G reachables -l java -o app.atom -s reachables.slices.json $(pwd)
# Usages slicing will be faster since the app.atom file would get reused.
<atom dir>/atom.sh -J-Xmx24G usages -l java -o app.atom -s usages.slices.json $(pwd)
# Now run evinse
evinse -l java -i bom.json -o bom.evinse.json
Please review the attached zip file and let me know your thoughts.
Thanks @prabhu, appreciate the thorough response.
That makes sense, however I believe it can be confusing when presented this way in occurrences (as seen by my own misunderstanding). Here I'd much rather expect "obvious", coarser usages of the library in question, perhaps even limited to one occurrence per file. As you say, reachables (or callstack info in CycloneDX lingo) are where the detail is / should be. I am a total noob in this area so please take what I say with a big grain of salt.
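To make the "one occurrence per file" idea concrete, here is a minimal post-processing sketch in Python. It is not something cdxgen or evinse does. The function name `coarsen_occurrences` is hypothetical, and it assumes only the `path#line` shape of `location` values seen in this thread. It keeps the occurrence with the lowest line number for each file.

```python
def coarsen_occurrences(occurrences):
    """Keep one occurrence per file: the one with the lowest line number.

    Assumes each occurrence is a dict with a "location" of the form
    "path#line"; locations without a numeric line are treated as line 0.
    """
    first_by_file = {}
    for occurrence in occurrences:
        location = occurrence.get("location", "")
        path, sep, line = location.rpartition("#")
        if sep and line.isdigit():
            line_number = int(line)
        else:
            # No usable line suffix: treat the whole string as the path.
            path, line_number = location, 0
        known = first_by_file.get(path)
        if known is None or line_number < known[0]:
            first_by_file[path] = (line_number, occurrence)
    # Dicts preserve insertion order, so files stay in first-seen order.
    return [occurrence for _, occurrence in first_by_file.values()]
```

Applied to a list that names `HttpClientPool.java` at lines 123 and 86, this would keep only the entry for line 86.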
Ah, classic case of RTFM! This is on me, I should've checked the docs first. Thanks for the hint, I'll give this a try.
Is this problem just about slowness, or will it plainly not work using the
@nscuro, that's good feedback. The usages slices specification is designed with semantics in mind. Translating that rich structure into a single array for occurrence evidence limits its potential, but it still serves some higher-level use cases, such as building a heat map or training ML models that have a limited context window. The java-node situation could definitely be improved. It just needs someone to run a debugger and investigate why the java process is unable to utilize all cores and threads. There is a default timeout of 10 minutes, so reachables slicing often doesn't finish, which is why this 4-step workaround is required.
With PR #1672, I have improved the line numbers. The output is now a bit coarser for Java and Jar types, which should at least lead to less confusion around line numbers.
"occurrences": [
{
"location": "src/main/java/org/dependencytrack/auth/Permissions.java#70"
},
{
"location": "src/main/java/org/dependencytrack/common/ClusterInfo.java#62"
},
{
"location": "src/main/java/org/dependencytrack/common/HttpClientPool.java#123"
},
{
"location": "src/main/java/org/dependencytrack/common/HttpClientPool.java#86"
},
I hope this helps!
Running cdxgen with the research profile on the hyades-apiserver repository yields occurrences with incorrect line numbers.
Result: bom-formatted.json
For the component alpine-common, the first 3 occurrences listed are:
Permissions.java is the definition of an enum field.
ClusterInfo.java only has 85 lines.
HttpClientPool.java only has 137 lines.
Perhaps there is some sort of offset which is miscalculated?