Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIP for git based dependencies #321

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

dominykas
Copy link

@dominykas dominykas commented Nov 22, 2023

This is an attempt to revive #214, hopefully leading to a revival of helm/helm#11258 down the road.

  • I took the original draft and squashed it into a single commit, while also renaming the file to avoid conflicts
    • I left the attribution to the original authors (incl. @rally25rs, who authored a PR which contained most of the information here), even though after my editing there's little of that original content left
  • I updated the structure to follow the templates
  • I rewrote most parts to clarify language and hopefully add more context
  • I updated the specification for the git based chart URLs to follow the same convention (VCS+protocol) as npm and Python (this was requested in earlier reviews)
  • I updated the dependency repository URL to allow sub-folders (this was also requested in earlier reviews) (still to be done)

There are some things I don't know yet, but I hope these can be resolved during the review process here:

  • whether the implementation recommendation makes sense (I haven't looked at the original code in detail just yet)
  • are there any implications due to version not being semver
  • are there any implications due to version not necessarily matching the version in the Chart.yaml of the dependency

cc @No9 - you asked for it ;) thanks for your help!

yxxhero and others added 4 commits November 22, 2023 15:44
Signed-off-by: yxxhero <[email protected]>

fix typo

Signed-off-by: yxxhero <[email protected]>

Update hip-0015.md

Signed-off-by: yxxhero <[email protected]>

mv hip 0015 to 0016

Signed-off-by: yxxhero <[email protected]>
Signed-off-by: Dominykas Blyžė <[email protected]>
dominykas added a commit to dominykas/helm that referenced this pull request Dec 4, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 4, 2023
@dominykas dominykas mentioned this pull request Dec 4, 2023
20 tasks
Copy link
Member

@gjenkins8 gjenkins8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, git protocol support for chart dependencies seems like a good idea to me. I think it for the most part would be helpful for Chart develop UX. I put mostly questions about the details.

The only non-detail concern I think is how user's would need to manage dependencies for testing vs non-testing use-cases.

hips/hip-00NN.md Outdated
```
dependencies:
- name: "<dependency name>"
repository: "<protocol>://[<user>[:<password>]@]<hostname>[:<port>][:][/]<path>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should allow (read: explicitly disallow) username / password in Chart.yaml dependency URL? Charts are generally intended for distribution, and credentials in things intended for distribution isn't great. Helm might want to proactively protect users here.

If we decide to allow. I think a lint rule would be desired.

Generally, the git-credential process (as invoked by git clone) I think would be the preferred method for git to gain access to credentials I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit I just copy/pasted this from npm documentation.

I agree that it should be removed: 888817f

That said, I think it would just work, because whatever works for git clone should work in repository. Do you think it should be explicitly validated and forbidden, or is leaving it undocumented enough?

The implementation code which documents the usage does recommend using a credentials helper: https://github.com/dominykas/helm/blob/e4b4b5e1811af0e18f28a7e4efe3f4b01ebebf59/cmd/helm/dependency.go#L96-L101 (very early wip).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit I just copy/pasted this from npm documentation.

😆


I think we should decide to do one of:

  1. explicitly disallow (via validation in Helm; I think the validation would be trivial to implemt: u := url.Parse() + u.User().Password() == "")
  2. not document, and have a lint rule to warn the user they are potentially including sensitive credentials

HIP IMHO should document/detail which decision we decide

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note to explicitly forbid u/p in the implementation: 7b4771b - it is indeed easier to relax constraints down the road, based on user feedback.

hips/hip-00NN.md Outdated
version: "<commit-ish>"
```
where:
- `<protocol>` is one of `git`, `git+ssh`, `git+http`, `git+https`, or `git+file`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could generally specify git[+<subprotocol>]:// should be handled by git? Rather than explicitly listing the above set of protocols?

I don't think an explicit list is necessary bad. It's not like new transports will be added often. But similarly curious if that could/should be avoided.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded to take git[+<subprotocol>] into account, but still listing the protocols that git supports as an example. I figure the rest should really be tackled via documentation? Added a note under "How to teach this": 0f6cd2f

When Helm is installing a dependency from git, it should:

- create a temporary directory
- clone the repo at the specified branch/tag into the temp dir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume, this will executable mean executing git as a subprocess? Which does also mean git chart dependencies will presume/require the existence of a working git installation.

I think worth calling this out explicitly (beyond the mention in the "security implications" section). While it might be kinda obvious, it is also an important detail of course.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be included in the CLI docs, and should be mentioned in the website docs - is it worth mentioning this explicitly in the HIP itself?

Copy link
Member

@gjenkins8 gjenkins8 Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think worth calling out in the HIP. That Helm will require a working git installation to invoke (via subprocess) in order for a Helm to utilize a charts git dependencies without error.

And that last part is also kinda important I guess. A non-functional/missing git installation would cause Helm to error (plenty of folk use contaieners, with minimal dependencies ie. no git installed. And often without git credentials). The HIP IMHO should detail that Helm would need to error in this case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the note: 4fc29f1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thanks


## Rejected ideas

- An [earlier draft for solving the issue](https://github.com/helm/community/pull/214) suggested using URLs like `git://https://...`. There were comments about that approach in the reference implementations, with the suggestion to use the conventions which are already established in other ecosystems.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, the established convention (as mentioned appears to be e.g git+https://...


When Helm is installing a dependency from git, it should:

- create a temporary directory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the process that other similar systems follow? (go modules, npm, pipenv, etc. And the mentioned plugins?) In that, we should probably copy prior-art here. Rather than re-discovering already solved problems?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know for certain that npm uses a temporary folder for git clone during the npm install process (but it also has other special behaviors around git based deps, that are not applicable to Helm).

I checked the ~/go/pkg folder and it does not seem to have the .git folders there, so at least that implies that some sort of an intermediate method must have been used to perform the cloning?

A quick scan of pipenv code also implies a temp folder: https://github.com/pypa/pipenv/blob/da751e1f5568c72e013af9d977ad54c26e924d0e/pipenv/utils/dependencies.py#L749 (although they also have a special behavior for git+file which might be interesting to learn from).


If the question is about the overall approach - I think at least npm follows a very similar approach, from what I've seen - the end result after cloning does go through their npm pack (I think) process, because the final output in the node_modules folder looks the same way as if it had been npm published to the registry.


I'm not familiar enough with the process and the implementation on the Helm side just yet, but it might make sense to do a helm package instead of treating it as if it's a file:// dependency - not sure about any additional steps it performs. It seems the original implementation PR does it's own tarball build, not sure why just yet - still reviewing it.

When Helm is installing a dependency from git, it should:

- create a temporary directory
- clone the repo at the specified branch/tag into the temp dir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should also do git clone --branch <version> --depth=1? (or whatever the best options for a minimal as possible clone are). But see note about copying prior-art :).

Additionally, submodules come to mind. If the target git repository uses submodules, should these be supported? (how?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the submodules be left out of the initial implementation? If anything - that would make the initial delivery faster.

Good point on shallow cloning. Added a note: 0f6cd2f

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the submodules be left out of the initial implementation?

IMHO sure (as long as it is documented). I guess 99%+ of use cases would be satisfied without submodules, so implementation/support could happily be deferred.


There are existing ways to achieve installation of charts without using a registry, however they are not very user-friendly and require additional tooling.

- A Helm chart repository is effectively an `index.yaml` with links to downloads. Maintaining such a repository does create a burden of scripts and automation (e.g. Github Actions). This is not always feasible for smaller projects. It also does not really offer an obvious and readily available way of testing pre-releases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: OCI support does alleviate some of these issues with needing to maintain a chart repository. But doesn't really help for e.g the "testing" use case below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, and Github itself supports OCI (although I haven't tried publishing charts into it...) - it would be great if there was a shared workflow or a standardized Github action to publish Helm charts for maintainers to use...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or a standardized Github action to publish Helm charts for maintainers to use

Good point. I will double check nothing exists, and add an issue on Helm if not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/appany/helm-oci-chart-releaser/tree/main -- there does appear to be at least this (community based) action


## Specification

The `Chart.yaml` should support the following format for `dependencies`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One curiosity I think we need to think about. Is for the "testing" use case, a user might want to refer to a git repository as a dependency. But they probably want to refer to a registry/OCI repository for non-testing.

The workflow of manually updating dependencies post-testing is possible of course. But it can be awkward. UX could be improved later too of course.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've definitely experienced this problem when using npm. In the end, we wrote tooling which does the dependency overrides in CI (via params and conventions), so that we don't have to commit things like this into git, as well as validators to fail release builds in case they find git based dependencies.

Would it make sense to print a warning during helm dep build / helm dep up? I don't think it's done for the file:// based ones, although the use cases are probably different.

helm package could also pay attention to that and require a flag, although I'm not sure if the current implementation even validates what's under chart/* before doing the work? I just deleted a file inside the tgz under charts and it didn't flinch.

The "How to teach this" section does mention that the docs should explicitly recommend registries over git - probably a place for some notes there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to print a warning during helm dep build / helm dep up?

I think this is more of a lint scenario. But curious to what others thoughts might be...

To me, it is not explicit that a git dependency can not be used for production. But that you probably should not (especially not a mutable reference).

OTOH, we could decide to be cautious I think. e.g. helm push/package could refuse to publish a chart with a git dependency. And then folk can argue if that should be relaxed. Once we allow something, the horse has bolted, and we can't rescind that behavior (even if we later think it is a terrible idea).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a linting rule should print a warning - adding that to the HIP: c16a922

I'm unconvinced about failing to package/push. I can see how a warning could be useful, but having to use a flag to publish might break people's CI scenarios? Is there an established way to solicit more feedback on this?

Copy link
Member

@gjenkins8 gjenkins8 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unconvinced about failing to package/push. I can see how a warning could be useful, but having to use a flag to publish might break people's CI scenarios?

My sole thought is that it is easy to be cautious initially, and error on publish/push with git dependencies. Then upon feedback/later review open that publishing/pushing with git dependencies. If Helm initially does the reverse: and allows publishing with git dependencies, then removing that functionality later isn't easily possible (it is a "one-way door").

Is there an established way to solicit more feedback on this?

Lets see what other reviewers of this HIP think. The above for me is an option, not a requirement.


- create a temporary directory
- clone the repo at the specified branch/tag into the temp dir
- treat the cloned git repo similar to a `file:///path/to/temp/dir` style requirement; use `chart.LoadDir` to load that directory (which in turn applied the logic for filtering the files through `.helmignore`) and archives it to `charts/`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should double check the packaging behavior will work correctly here. Helm in places assumes (iirc) that charts with the same (semver) version, represent the same artifact. Probably charts fetched from a git branch will have changes, but retain the same semver version. And this assumption would be invalid.

Copy link
Author

@dominykas dominykas Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any clues for specifics? Happy to go look, but the code base has the string "version" in a lot of places and a quick scan did not bring results 😅

I can see it maybe might make a difference for deployment/rollback, but that's probably not applicable when used as a dependency?

It would also have to include the registry for comparison, would it not? There's also the checksum in the lock file to validate sameness? And probably the same issue would apply to file:// dependencies already? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me investigate, and update with some links.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(still need to look here)

While we may or may not explicitly forbid this format, it is probably a good idea to not officially document it, so as to avoid the proliferation bad practices and accidental leakage of secrets.

Signed-off-by: Dominykas Blyžė <[email protected]>
@dominykas dominykas force-pushed the git-dependencies branch 2 times, most recently from 5152d09 to a7d6196 Compare December 13, 2023 11:19
As the protocols are handled by git, it's probably better to define them as something that's supported by `git clone`.

Signed-off-by: Dominykas Blyžė <[email protected]>
@dominykas
Copy link
Author

dominykas commented Dec 15, 2023

I'm also now thinking about sub-dependencies... Should there be any behaviors prescribed if a dependency is installed from a registry, but it contains a sub-dependency which was originally installed from git?

On npm git dependencies are a major security risk, but dependencies are vendored in Helm, so it's maybe not a concern?

There's also the case of git-based dependencies having sub-dependencies (some of which may also be from git?) - meaning that we probably need to do a recursive helm dep up inside the cloned folders? Which could also result in some infinite loops, if someone puts in a circular dependency that way? Does this need to be solved in the HIP? I'm starting to get worried about the scope a little bit - I wonder if it's possible to move forward with an MVP implementation behind a flag and get some broader feedback?

dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 15, 2023
dominykas added a commit to dominykas/helm that referenced this pull request Dec 20, 2023
This is to minimize the risk of credential leaks, see discussion: helm/community#321 (comment)

Signed-off-by: Dominykas Blyžė <[email protected]>
hips/hip-00NN.md Outdated
```
dependencies:
- name: "<dependency name>"
repository: "git[+<subprotocol>]://<hostname>[:<port>][:][/]<path>"
Copy link
Author

@dominykas dominykas Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized, that the <path> here is really just the repository path - it is not the sub-path for the folder inside the repository (something that's been requested in some PR reviews as a feature).

There doesn't seem to be a common pattern established:

I kind of like pip's approach - url fragments are a good way to go, although npm would not be able to implement that (should they chose to add support), because they use the url fragment to point to the git branch/tag, i.e. there's going to be discrepancies across ecosystems, so there's no obvious choice here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposing to go ahead with pip approach: b61033b

Draft implementation is really simple: dominykas/helm@0c5734e

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposing to go ahead with pip approach: b61033b

seems good to be at a glance


Charts that start using the new format will effectively be changing their minimum required Helm version, i.e. they would be introducing breaking changes and should bump their major version.

## Security implications
Copy link
Author

@dominykas dominykas Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git allows symlinks that can point to anywhere. Helm resolves symlinks and packages up the file contents with a printed warning. Is that an additional security risk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a risk to me. Ideally Helm already might not package symlinks (I didn't check)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helm does support resolving symlinks during helm package: helm/helm#8540 - seems that will be done by design, and it should print a warning.

So does helm template ..

However the symlinks are ignored if they exist inside the tarball itself (i.e. if you craft the tarball manually, and include a symlink, then it will not be followed) - I hope that's by design too 😁

I've switched the code in the implementation to effectively use the same code path as helm package (primarily to make sure that if you're installing a chart from git, it would be have as close as possible to it have been helm packaged and published into a registry). This means that symlinks get followed and resolved during helm dep up, but you do get a warning:

walk.go:74: found symbolic link in path: /var/folders/j3/8m3y8q4176118dng0hvdqfh00000gp/T/helm-git-3323594338/templates/secret.yaml resolves to /Users/Dominykas_Blyze/devel/experiments/helm-test/templates/secret.yaml. Contents of linked file included and used

Given that helm package's warning was deemed to be enough - I figure this is also enough? I will add an explicit note under "security implications" to explain this - and possibly that should be left at that?

dominykas added a commit to dominykas/helm that referenced this pull request Feb 6, 2024
dominykas added a commit to dominykas/helm that referenced this pull request Feb 6, 2024
dominykas added a commit to dominykas/helm that referenced this pull request Feb 6, 2024
This is to minimize the risk of credential leaks, see discussion: helm/community#321 (comment)

Signed-off-by: Dominykas Blyžė <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants