-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changes necessary to validate against SPDX jsonschema #197
Conversation
Thank you ++ This works. ... Great and honored to see you drop by! |
@pombredanne OK to add Python 3 type annotations? OK to format with black? |
@pombredanne I think this could be merged. It shoud be able to read/write json that is valid according to the spdx jsonschema, and it should be able to write multiple checksums. Still open is to read multiple checksums, but that further complicates the code. It would also be nice to add a test that validates the output against the spdx spec using the jsonschema Python package. |
I am not too keen but OK wrt. Python 3 type annotations and yes black is fine, but it would be best in a completely separate PR as this would create a "wall" in the history |
Thanks! I will need a few days before I can review. |
I am not sure why the macOS tests fail... but Circle CI is a bit flaky at times |
macos tests are failing due to using a now-unsupported version of xcode: See #202 for a fix |
@pombredanne let us know if you require any updates to this - it would be useful to get it merged! Thanks |
Signed-off-by: Daniel Holth <[email protected]>
Signed-off-by: Daniel Holth <[email protected]>
@dholth can you rebase this PR to the latest in the main branch? This should fix the CircleCI errors. |
CircleCI should be fixed once #226 is merged and this PR is rebased on top of the changes. |
@dholth Will you have time to rebase this PR on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think adding full package objects to documentDescribes
is a good idea, since it contradicts the spec
package["sha1"] = checksum["checksumValue"] | ||
break | ||
|
||
document["documentDescribes"] = [{ "Package": package} for package in packages ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not comply with the json schema, which just has an array of spdx ids here: https://github.com/spdx/spdx-spec/blob/development/v2.3.1/schemas/spdx-schema.json#L219-L226
Even if it's just a temporary internal data structure, I find this highly confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's very intentional, this PR didn't invent flatten/unflatten as it is a pain to follow links around when the document is in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the point of flattening the structure in this case though. The only instance where the value is accessed is this part where the packages are parsed, and this can be done just as well without modifying the data structure first. In fact, it is already done this way (although the fact that this logic is technically duplicated should be fixed).
If resolving links becomes annoying, I would find it more natural to add a utility method that handles it instead of introducing an additional intermediate data structure. This will come with a performance impact if called many times, but I would be surprised if this was an issue.
Additional remark: The current structure only considers files that are included in packages. To my understanding, files should also be allowed to be independent of any package (this might be related to the python tools still being out of date though, it seems like snippets are not considered at all, for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One additional comment: My understanding so far is that in a json document, any SPDX-Document DESCRIBES SPDX-Package
relationship would be omitted and instead the package id would appear in documentDescribes
(similar for files and snippets). This would imply that during parsing, we have to create new relationships for elements of documentDesccribes
. I don't see that happening anywhere.
Relevant discussion: #242. Let's see if I'm wrong here 😅
@dholth As you already pointed out this PR doesn't handle parsing or other file formats. Do you plan on working on these things more or are you okay with me working on them? |
I would be happy for anyone to pick this up, thanks! |
Thanks for your work @dholth! I openend another PR which continues this one and will close this one. |
These changes were necessary to validate against the SPDX jsonschema https://github.com/spdx/spdx-spec/blob/development/v2.2.2/schemas/spdx-schema.json and to include multiple checksums.
This is a work in progress. No attention has been given to parsing or non-json/yaml formats.
Let me know if this approach is acceptable for the project.
Fixes #184