
yarn: SBOM components #739

Merged

Conversation

slimreaper35
Member

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

LGTM with a couple small questions/nitpicks. Good to go once those are resolved.

cachi2/core/package_managers/yarn_classic/resolver.py (outdated)
cachi2/core/package_managers/yarn_classic/resolver.py (outdated)
cachi2/core/package_managers/yarn_classic/main.py (outdated)
Member

@eskultety eskultety left a comment

Straightforward, but some details need a bit more polishing.

@slimreaper35 slimreaper35 force-pushed the yarn-classic-sbom branch 2 times, most recently from 69282bd to d029607 Compare December 9, 2024 07:41
Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

LGTM with a couple of small non-blocking questions and observations.

@slimreaper35 slimreaper35 force-pushed the yarn-classic-sbom branch 2 times, most recently from c53985e to ffb717c Compare December 12, 2024 09:46
@slimreaper35 slimreaper35 force-pushed the yarn-classic-sbom branch 7 times, most recently from 42d18a7 to 272037e Compare December 16, 2024 14:06
@@ -44,7 +44,7 @@ class _UrlMixin:


 @dataclass
-class _RelpathMixin:
+class _PathMixin:
Contributor

nit: It probably makes sense to squash this into the prior commit where relpath -> path

The class is very often interpreted as `algorithm:hash`. Let's use the
built-in `__str__` magic method to return the same format.

Signed-off-by: Michal Šoltis <[email protected]>
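
For illustration only, a minimal sketch of the `__str__` idea described in the commit message above. The class name `Checksum` and its field names are hypothetical stand-ins and are not taken from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checksum:
    """Hypothetical stand-in for the checksum class touched by the commit."""

    algorithm: str
    hexdigest: str

    def __str__(self) -> str:
        # Render the object in the conventional `algorithm:hash` form.
        return f"{self.algorithm}:{self.hexdigest}"


checksum = Checksum("sha256", "abc123")
print(f"checksum qualifier: {checksum}")
```

With `__str__` in place, callers can interpolate the object directly instead of assembling the `algorithm:hash` string by hand at every call site.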
`RootedPath` cannot be safely used as an attribute inside a class that
inherits from the pydantic `BaseModel`; to keep it as an attribute, we
have to use (for example) dataclasses instead.

In pydantic, the `BaseModel` performs validation and transformation
during object initialization, which differs from Python's dataclass
behavior [1].

This could lead to transformations of the object, so the information
stored in the `root` and `subpath_from_root` attributes of `RootedPath`
would be lost. Example:

class Example1(BaseModel):
    rooted_path: RootedPath

@dataclass
class Example2:
    rooted_path: RootedPath

tmp = RootedPath("/tmp")
rooted_path = tmp.join_within_root("a/b/c")

e1 = Example1(rooted_path=rooted_path)
e2 = Example2(rooted_path=rooted_path)

print(e1.rooted_path.subpath_from_root)  # .
print(e2.rooted_path.subpath_from_root)  # a/b/c

Note: All yarn classic package classes now have the dataclass decorator.

---
[1]: https://docs.pydantic.dev/latest/concepts/dataclasses/

Signed-off-by: Michal Šoltis <[email protected]>
- rename the class to `PathMixin`
- use name `path` instead of `relpath`
- change type of `path` to `RootedPath`

This change prepares yarn classic package classes for purl generation.
File, Link, and Workspace packages need to have a git URL. For that we
need to use the `root` attribute of `RootedPath` as the source
directory whose git origin will be used, plus the `subpath` of the
package, which will be taken from the `subpath_from_root` attribute.

Signed-off-by: Michal Šoltis <[email protected]>
Each "yarn classic" package type has a property that returns the package
URL based on its attributes and the community PURL specification [1].

All package types share the same base -> name, version, and type, which
is set to "npm" (a "yarn" type does not exist).

- `FilePackage`, `WorkspacePackage`, and `LinkPackage` additionally have
  a subpath component (extra subpath within a package, relative to the
  package root) and a version control system URL [2] that comes from the
  origin URL of the source directory, e.g. the processed repo

- `UrlPackage` has one extra qualifier -> its URL as it is defined

- `GitPackage` has one extra qualifier -> package version control system
  URL with a specific syntax [2]

- `RegistryPackage` has two extra qualifiers -> repository_url (default
  repository/registry for "npm" is https://registry.npmjs.org so
  alternative registries such as https://registry.yarnpkg.com should be
  qualified via the qualifier) [3], [4] + the checksum of the package
  converted from Subresource Integrity representation

  Note: The `url` attribute is the exact URL that points to the tarball
  in the respective registry. It's important to preserve the attribute
  as it is, because detecting collisions in the offline mirror requires
  the name of the tarball.

---
[1]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst
[2]: https://github.com/spdx/spdx-spec/blob/cfa1b9d08903/chapters/3-package-information.md#37-package-download-location-
[3]: https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#npm
[4]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#known-qualifiers-keyvalue-pairs

Signed-off-by: Michal Šoltis <[email protected]>
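
As a rough sketch of the `RegistryPackage` qualifiers described in the commit message above: the function names, the way the registry base URL is derived from the tarball URL, and the single-hash assumption for the Subresource Integrity string are all assumptions made for this example, not the PR's actual code. Serializing the full purl (including percent-encoding) is left out on purpose.

```python
import base64
import hashlib
from typing import Dict, Optional
from urllib.parse import urlsplit

DEFAULT_NPM_REGISTRY = "https://registry.npmjs.org"


def sri_to_checksum(integrity: str) -> str:
    """Convert 'sha512-<base64 digest>' to 'sha512:<hex digest>'.

    Assumes the SRI string holds exactly one hash and no options.
    """
    algorithm, _, b64_digest = integrity.partition("-")
    hex_digest = base64.b64decode(b64_digest).hex()
    return f"{algorithm}:{hex_digest}"


def registry_qualifiers(tarball_url: str, integrity: Optional[str]) -> Dict[str, str]:
    """Collect the extra purl qualifiers for a registry package (hypothetical helper)."""
    parts = urlsplit(tarball_url)
    registry = f"{parts.scheme}://{parts.netloc}"

    qualifiers: Dict[str, str] = {}
    # Only non-default registries need to be spelled out in the purl.
    if registry != DEFAULT_NPM_REGISTRY:
        qualifiers["repository_url"] = registry
    if integrity:
        qualifiers["checksum"] = sri_to_checksum(integrity)
    return qualifiers


# Build a sample SRI value from known bytes so the example is reproducible.
digest = hashlib.sha512(b"example tarball bytes").digest()
integrity = "sha512-" + base64.b64encode(digest).decode("ascii")
print(registry_qualifiers("https://registry.yarnpkg.com/foo/-/foo-1.0.0.tgz", integrity))
```

Note that the tarball URL itself is only read here and never rewritten, in line with the note about preserving the `url` attribute.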
These functions could be reused somewhere else. Let's move them to the
existing utils.py module. They can't be imported from the main.py module
due to a circular import error.

Signed-off-by: Michal Šoltis <[email protected]>
The result of the function `resolve_packages` is a chain object that
behaves like an iterator and can therefore be iterated over only once.
The packages must be preserved for `verify_offline_mirror_collision`
and later for creating SBOM components.

Signed-off-by: Michal Šoltis <[email protected]>
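
The single-pass behavior described in the commit message above can be reproduced with a small sketch. `resolve_packages_sketch` is a hypothetical stand-in that only mimics returning an `itertools.chain`; it is not the real function.

```python
from itertools import chain


def resolve_packages_sketch():
    """Hypothetical stand-in that returns a chain object."""
    return chain(["registry-pkg"], ["workspace-pkg"])


# A chain is exhausted after the first full iteration.
packages = resolve_packages_sketch()
first_pass = list(packages)
second_pass = list(packages)
print(first_pass, second_pass)

# Materializing the chain once keeps the packages available to every consumer.
preserved = list(resolve_packages_sketch())
for consumer in ("collision check", "SBOM components"):
    print(consumer, len(preserved))
```

Converting to a list costs memory proportional to the number of packages, which seems acceptable here since all packages are needed twice anyway.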
To make the SBOM even more accurate, we should use the name that comes
from the package's respective package.json file, either from the cached
tarball or from a directory. Users could potentially call these
non-registry dependencies whatever they want, and the name in the
package.json could be completely different.

Similar behavior is already implemented for yarn berry [1].

---

[1]: https://github.com/containerbuildsystem/cachi2/blob/main/cachi2/core/package_managers/yarn/resolver.py#L356

Signed-off-by: Michal Šoltis <[email protected]>
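
A minimal sketch of reading the real package name, as described in the commit message above. The helper names are hypothetical, and the `package/package.json` member path assumes the usual npm tarball layout; the PR's actual lookup may differ.

```python
import json
import tarfile
from pathlib import Path


def name_from_tarball(tarball: Path) -> str:
    """Read the `name` field from package.json inside a cached tarball."""
    with tarfile.open(tarball) as archive:
        # Assumption: the manifest lives under the conventional `package/` prefix.
        manifest = archive.extractfile("package/package.json")
        if manifest is None:
            raise ValueError(f"package.json in {tarball} is not a regular file")
        return json.load(manifest)["name"]


def name_from_directory(directory: Path) -> str:
    """Read the `name` field from package.json in a package directory."""
    manifest = directory / "package.json"
    return json.loads(manifest.read_text(encoding="utf-8"))["name"]
```

Either helper returns the name the package declares for itself, regardless of the alias the user chose for the dependency.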
After a successful pre-fetching of all packages, report all downloaded
packages as components in the final SBOM.

Create the `Component` object from each package based on package
attributes.

Dev packages should have the `cdx:npm:package:development` property,
which is added to the component if the package is marked for
development, i.e. its `dev` attribute is set to True.

Move the rest of the unit test logic to `test_fetch_yarn_source` from
its predecessor in the yarn-berry implementation.

Signed-off-by: Michal Šoltis <[email protected]>
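
A rough sketch of the component creation described in the commit message above. `PackageSketch` and `to_component` are hypothetical names, and the component is modeled as a plain dict rather than the project's `Component` class.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class PackageSketch:
    """Hypothetical, simplified view of a resolved yarn classic package."""

    name: str
    version: str
    purl: str
    dev: bool = False


def to_component(package: PackageSketch) -> Dict[str, Any]:
    """Build a CycloneDX-style component dict from package attributes."""
    component: Dict[str, Any] = {
        "type": "library",
        "name": package.name,
        "version": package.version,
        "purl": package.purl,
    }
    # Only development packages carry the extra property.
    if package.dev:
        component["properties"] = [
            {"name": "cdx:npm:package:development", "value": "true"}
        ]
    return component


jest = PackageSketch("jest", "29.7.0", "pkg:npm/jest@29.7.0", dev=True)
print(to_component(jest))
```

Leaving `properties` out entirely for non-dev packages keeps the resulting components minimal; whether the real implementation does the same is not stated in the PR text.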
@slimreaper35 slimreaper35 added this pull request to the merge queue Dec 17, 2024
Merged via the queue into containerbuildsystem:main with commit e8eca60 Dec 17, 2024
16 checks passed
@slimreaper35 slimreaper35 deleted the yarn-classic-sbom branch December 17, 2024 12:45