Adding optional RPM summary to SBOMs #762

Open
wants to merge 1 commit into
base: main
from

Conversation

a-ovchinnikov
Collaborator

@a-ovchinnikov a-ovchinnikov commented Dec 4, 2024

This change adds an option to include RPM summary in a SBOM.

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

@a-ovchinnikov a-ovchinnikov requested review from eskultety and brunoapimentel and removed request for eskultety December 4, 2024 10:57
cachi2/core/models/property_semantics.py Outdated
@@ -7,6 +7,7 @@
PropertyName = Literal[
"cachi2:bundler:package:binary",
"cachi2:found_by",
"cachi2:rpm_summary",
Contributor

Should we use a custom cachi2 property or perhaps the description field for a CycloneDX Component instead?

Collaborator Author

Maybe. I'll need to think about it more. A Property looks the least invasive way of doing this.

Contributor

If we decide to go the component.description path, we might be able to map it to SPDX similarly:
https://spdx.github.io/spdx-spec/v2.3/package-information/#718-package-summary-description-field
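For comparison, the two CycloneDX shapes under discussion could look roughly like the following. This is a minimal sketch using plain Python dicts: the package name, version, and summary text are invented illustration values, `to_spdx_package` is a hypothetical helper, and the SPDX side assumes the JSON `summary` field that corresponds to the linked section 7.18.

```python
# Illustrative only: the component values below are invented for the example.
summary = "The GNU Bourne Again shell"

# Option A: a custom cachi2 property attached to the component.
component_with_property = {
    "type": "library",
    "name": "bash",
    "version": "5.2.26",
    "properties": [{"name": "cachi2:rpm_summary", "value": summary}],
}

# Option B: the built-in CycloneDX component description field.
component_with_description = {
    "type": "library",
    "name": "bash",
    "version": "5.2.26",
    "description": summary,
}


def to_spdx_package(component: dict) -> dict:
    """Hypothetical helper: map a CycloneDX-style dict to a minimal SPDX 2.3-style package."""
    package = {"name": component["name"], "versionInfo": component["version"]}
    # Only option B carries over directly; properties would need separate handling.
    if "description" in component:
        package["summary"] = component["description"]
    return package
```

With option B the text maps onto an SPDX package field directly, while option A needs an extra conversion step on the SPDX side.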

Collaborator Author

FWIW properties can be straightforwardly converted to annotations.
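One way such a conversion might look, as a rough sketch rather than the project's actual converter: `property_to_annotation` is a hypothetical helper, the annotator string is an assumed value, and serializing the property as JSON inside the annotation `comment` is an assumption made for illustration. The output keys follow the SPDX 2.3 JSON annotation fields.

```python
import json
from datetime import datetime, timezone


def property_to_annotation(prop: dict, annotator: str = "Tool: cachi2") -> dict:
    """Hypothetical helper: wrap a CycloneDX property in an SPDX 2.3-style annotation."""
    return {
        "annotator": annotator,  # assumed annotator string
        "annotationDate": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "annotationType": "OTHER",
        # Assumption: the name/value pair is carried as a JSON string in the comment.
        "comment": json.dumps({"name": prop["name"], "value": prop["value"]}),
    }


annotation = property_to_annotation(
    {"name": "cachi2:rpm_summary", "value": "An example summary"}
)
```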

Member

Should we use a custom cachi2 property or perhaps the description field for a CycloneDX Component instead?

Description feels more natural, but then we need to sit down and define what the description is expected to represent for each PM backend's components, so that it remains consistent across all backends. In that case, also for consistency, we should introduce the same functionality to the rest of the backends within a short time span so as not to leave the other PMs behind. Note that a custom property IMO feels like less of a consistency problem than using an existing CycloneDX field suited directly for this purpose.

Member

@eskultety eskultety Jan 9, 2025

@taylormadore would you care enough to create an issue on investigating the possibility of adopting a more general approach covering all backends via the built-in description field? I like the idea in general and I feel we should invest some time into exploring the possibility before ruling it out completely.

@@ -85,6 +86,7 @@ def _query_rpm_fields(file_path: Path) -> dict[str, str]:
"version=%{VERSION}\n"
"release=%{RELEASE}\n"
"arch=%{ARCH}\n"
"summary=%|SUMMARY?{%{SUMMARY}}:{}|\n"
Member

@eskultety eskultety Jan 2, 2025

Sometimes the summary is incredibly useless. Thinking ahead, I think we're more likely to find descriptions rather than summaries provided by other PM backends/services in case we want to be consistent, so considering the pros/cons of using 'DESCRIPTION' here instead could be food for thought.
Edit: going with SUMMARY only is also an acceptable outcome for me, just to be clear.

Collaborator Author

Having a summary is a resolution for a specific use-case where a user needs this field in a SBOM for feeding some downstream tooling. It does seem rather meaningless at times, but that was the request. I don't think it needs to be present in a SBOM at all, at least not as long as SBOMs are treated as evidence of artifact contents, but I also don't think it is a problem as long as it can be turned off.

Member

Having a summary is a resolution for a specific use-case where a user needs this field in a SBOM for feeding some downstream tooling

Exactly, and so I've been wondering (tied somewhat to my earlier comment on the flag's existence) how we imagine things continuing to work in the future. More specifically, if this is a downstream-only request, then we really should keep the upstream code base lean and maintain a downstream fork for exactly this purpose, which used to be the common practice everywhere outside the Openshift/k8s world. IOW we should evaluate every such request and decide not only what implications it has but, most importantly, whether it makes sense for the project in general. I don't think a flag for just about anything is a way to a sustainable and maintainable future, given how many combinations you'll end up with in general, and toggling particular SBOM output segments isn't really behaviour configuration per se.
That said, this PR isn't the right place to start having these discussions and so I won't stand in the way.

Member

@a-ovchinnikov IIUC https://rpm-software-management.github.io/rpm/manual/tags.html#base-package-tags SUMMARY is a mandatory field so you should be able to safely assume it's always present and hence treat it the same way as VERSION or RELEASE.
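If SUMMARY is indeed always present, the query format could drop the conditional expression and read the tag like the others. A minimal sketch under that assumption: `QUERY_FORMAT`, `parse_rpm_fields`, and `query_rpm_fields` are hypothetical names, only the tags visible in the diff hunk are listed, and the real `_query_rpm_fields` may differ in how it invokes `rpm` and parses the output.

```python
import subprocess
from pathlib import Path

# SUMMARY is queried unconditionally, the same way as VERSION or RELEASE.
QUERY_FORMAT = (
    "version=%{VERSION}\n"
    "release=%{RELEASE}\n"
    "arch=%{ARCH}\n"
    "summary=%{SUMMARY}\n"
)


def parse_rpm_fields(output: str) -> dict[str, str]:
    """Parse 'key=value' lines; split on the first '=' only, since a summary may contain '='."""
    fields: dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without a key/value separator
            fields[key] = value
    return fields


def query_rpm_fields(file_path: Path) -> dict[str, str]:
    """Query an RPM file with the rpm CLI (requires rpm to be installed)."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", QUERY_FORMAT, "-p", str(file_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_rpm_fields(result.stdout)
```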

@a-ovchinnikov a-ovchinnikov marked this pull request as ready for review January 2, 2025 18:41
@a-ovchinnikov a-ovchinnikov changed the title WIP: Adding optional RPM summary to SBOMs Adding optional RPM summary to SBOMs Jan 2, 2025
This change adds an option to include RPM summary in a SBOM.

Signed-off-by: Alexey Ovchinnikov <[email protected]>
@brunoapimentel
Contributor

/retest

Member

@eskultety eskultety left a comment

One major comment, otherwise we're pretty much set for merging, one last respin will be needed.

@@ -58,6 +60,8 @@ def from_properties(cls, props: Iterable[Property]) -> "Self":
pip_package_binary = True
elif prop.name == "cachi2:bundler:package:binary":
bundler_package_binary = True
elif prop.name == "cachi2:rpm_summary":
rpm_summary = ""
Member

This line here will cause this feature to never actually format the property to the output. I'd suggest adding an integration test case for this just to be sure we're covered.
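A minimal sketch of the problem and one possible fix, assuming the property set round-trips through a `from_properties`/`to_properties` pair and that an empty summary is skipped on output. The `Property` and `PropertySet` classes are hypothetical stand-ins, not the actual cachi2 classes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Property:
    """Stand-in for a CycloneDX name/value property."""
    name: str
    value: str


@dataclass
class PropertySet:
    """Stand-in for the property-semantics model; only the summary is modelled."""
    rpm_summary: str = ""

    @classmethod
    def from_properties(cls, props: Iterable[Property]) -> "PropertySet":
        rpm_summary = ""
        for prop in props:
            if prop.name == "cachi2:rpm_summary":
                # Keep the property's value; assigning "" here would discard it.
                rpm_summary = prop.value
        return cls(rpm_summary=rpm_summary)

    def to_properties(self) -> list[Property]:
        # Assumption: empty summaries are not emitted.
        if not self.rpm_summary:
            return []
        return [Property("cachi2:rpm_summary", self.rpm_summary)]


original = [Property("cachi2:rpm_summary", "An example summary")]
round_tripped = PropertySet.from_properties(original).to_properties()
```

A round-trip check like the last two lines is the kind of case a test could assert on.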

Member

Tangential nitpick - since this is a custom property only relevant to RPMs and only ever used with RPM components, do we need the additional slight redundancy in the property name - "rpm_summary" instead of just "summary"?

Collaborator Author

Oof, good catch, thank you!

do we need the additional slight redundancy?

I believe yes, it slightly pollutes SBOMs, but at the same time makes it easier to comprehend where it belongs when reading Properties code.

@@ -216,6 +216,7 @@ class RpmPackageInput(_PackageInputBase):
"""Accepted input for a rpm package."""

type: Literal["rpm"]
add_rpm_summary: bool = False
Member

nitpick - it's part of RpmPackageInput already so the 'rpm' part of the option name is redundant, but at the same time the name itself doesn't indicate this is an SBOM only flag - what about sbom_include_summary OR include_summary_in_sbom? What I'm after is some kind of naming pattern that could be reasonably reusable for other such flags we might be required to add, IOW include_XYZ_in_sbom.
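To illustrate the suggested pattern, a rough sketch with a plain dataclass standing in for the real input model. The class body here is hypothetical, and the actual `RpmPackageInput` derives from `_PackageInputBase` and has fields not shown.

```python
from dataclasses import dataclass


@dataclass
class RpmPackageInput:
    """Stand-in for the accepted input for an rpm package."""
    type: str = "rpm"
    # Follows the include_XYZ_in_sbom pattern; off by default, as in the PR.
    include_summary_in_sbom: bool = False


default_input = RpmPackageInput()
opted_in = RpmPackageInput(include_summary_in_sbom=True)
```

The name reads as SBOM-specific without repeating "rpm", and further flags of the same kind could reuse the prefix and suffix.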

4 participants