-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
yarn: Forbid zero installs workflow #379
yarn: Forbid zero installs workflow #379
Conversation
Note that this PR was rebased on top of #377 and #376 in order for tests to pass and to be testable locally, but naturally conflicts will arise with other PRs on review. Also, only the last 2 commits are relevant, the rest is just the rebase. Also, the fact I used an RFC prefix to spark a discussion on this topic and did so rather than marking this as a draft since it looks like the concept of drafts is used to mark something as NOT READY for review in this project which isn't the case here. |
eccce8a
to
20e0ecd
Compare
since RFC:
|
20e0ecd
to
95c63ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach LGTM
If it were me, and I weren't planning to ever merge this (I assume that's the case, given that the usual PR template wasn't used), I'd open a PR against my fork for the team to comment on, which could then be turned into a PR against 'main' in this repo once it was actually ready. P.S. We usually use 'draft' exactly to mean "this is ready to review, but NOT merge", e.g. to get early comments on design etc. |
I do plan on merging this, sorry, my bad for messing this up when dropping the RFC by not putting the PR template back in, fixed now!
Note that by doing ^that in upstream context you're decreasing the exposure of your proposed changes to a selected group of people who may be tracking your fork (or notify them via private means) instead of immediately getting input from all interested parties in the community, so again, it was my mistake for not putting the PR template back in, but other than that all RFC discussions ought to happen within the project's main repo if it uses PR/MR based workflow or on the official mailing list if the project uses mail-based workflow simply because personal forks in most cases are just a playground to experiment (e.g. with Git<forge> features) and to fine-tune one's changes against the main repo. |
A project can make use of zero installs a couple of different ways: - offline cache vs - exploded node_modules tree The value of 'nodeLinker' [1] can help us in determining if a project uses any of the above and hence whether a project does use zero installs or not. We're mainly interested in the getter and the setter is only introduced for consistency reasons as we likely won't need to set this when overriding user Yarn configuration. [1] https://v3.yarnpkg.com/configuration/yarnrc#nodeLinker Signed-off-by: Erik Skultety <[email protected]>
Although Yarn docs [1] suggests usage of the 'pnp' linker mode combined with offline mirrors in shape of .yarn/cache directory populated with ZIP archives, the page also mentions alternative way of achieving the same thing - expanded 'node_modules' file tree. The thing with node_modules directory is that there are 2 distinct ways of appearing in the repository: 1) using the 'pnpm' nodeLinker install setting 2) using the 'node-modules' nodeLinker install setting Why is the nodeLinker setting relevant at all? Because for local cachi2 usage where the environment, and more specifically the git repository might not be completely in a clean and pristine condition, in other words the user may have some forgotten node_modules directory in the repository which isn't even checked in and we'd still fail a build request. Therefore, the linker setting can tell us what we should be looking at when evaluating whether a given project does or doesn't use zero-install workflow and modify the check accordingly. [1] https://v3.yarnpkg.com/features/zero-installs Signed-off-by: Erik Skultety <[email protected]>
95c63ab
to
6eb8f8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
6eb8f8c
to
aa5f548
Compare
The concept of zero installs, i.e. no install needed (git clone is sufficient), is inherently flawed for a number of reasons: - taking over maintenance (by the means of manual updates) of a project's dependencies by baking their sources in to the given project's repository - creating unnecessary bloat (often in form of binary formats) in the repository - moving the trust in package contents from the official packaging tooling and official public registries to a given project which doesn't really solve the biggest security problem of many public packaging repositories - unvetted contents just to mention a few. In context of Yarn what the above would mean is checking in dependencies' ZIP files into the repository. While that may sound like an acceptable use case since Yarn can verify integrity of the ZIP archives, some dependencies (due to e.g. post-install scripts) may end up being unpacked into a .yarn/unplugged directory, effectively creating an exploded node_modules/ dependency tree hierarchy inside the repository which would be needed for the zero install use case to work. However, we would have to employ a complex methodology (still preventing arbitrary code execution) of reliably verifying such dependencies in order to produce an accurate SBOM. Since we already reject projects containing 'node_modules' directory inside the repository for NPM, we can use it as a precedent here. The whole situation would be different if Yarn provided a mechanism to verify integrity of 'unplugged' contents the same way it does it for ZIP files, but unfortunately it doesn't [1]. As a result of this patch some test variants involving the zero-install use case which no longer applies have been adjusted accordingly and dedicated test cases dealing with zero installs were added. [1] Even if one sets the 'immutablePatterns' [2] YarnRc configuration option to something like '**/.yarn/unplugged/**' Yarn doesn't seem to care about the glob pattern unless the whole .yarn/unplugged/<unzipped_package>/node_modules/<package> subdirectory of a given unplugged package tree would end up being removed in which case Yarn finally notices and throws a immutable cache error: The checksum for **/.yarn/unplugged/**/* has been modified by this install, which is explicitly forbidden [2] https://v3.yarnpkg.com/configuration/yarnrc#immutablePatterns Signed-off-by: Erik Skultety <[email protected]>
aa5f548
to
b7cfc5d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
/retest |
0a91337
The concept of zero installs is inherently flawed in that one has to either store binary ZIP files or an exploded node-modules dependency tree inside their project repository. While the former may sound like an acceptable use case from cachi2's hermetic build POV because Yarn provides us with a way to check their integrity, the latter surely doesn't (in fact we refuse projects having node-modules in the repository).
That said, even with the former case some dependencies may specify post-install scripts which will cause such a dependency to be unpacked and expanded under .yarn/cache/unplugged which in simple terms is nothing else than again an expanded node-modules tree which we have to treat the same way as we do for NPM - reject it - because such zero installs would not work without these unplugged packages (if any dependency needed to be unpacked) and at the same time we cannot verify them reliably.
Note that even if one uses the 'immutablePatterns' [1] YarnRc configuration
option with
**/.yarn/unplugged/**
or even**/.yarn/**
Yarn somehow ignores the pattern unless the whole.yarn/unplugged/<unzipped_package>/node_modules/<package>
subdirectory of a given unplugged package tree was completely removed in which case Yarn finally notices and throws an immutable cache error:The whole glob pattern handling in Yarn happens here:
Given that Yarn doesn't provide us with a mechanism to verify integrity of the unplugged packages we're facing essentially the same problem as with
node_modules
handling in NPM which we explicitly forbid. Needless to say that Yarn also provides a backwards compatible way with the NPM ecosystem by creating thenode_modules
directory if thenodeLinker
setting in the RC config was set tonode-modules
and hence the check detecting zero-installs workflow was updated accordingly to reflect the different configuration values for the setting.[1] https://v3.yarnpkg.com/configuration/yarnrc#immutablePatterns
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)