Standardize git_repository_url for better parsing #622

Open
aliceinwire opened this issue Jan 11, 2025 · 9 comments

Comments

@aliceinwire
Member

kci-dev currently matches the local repository's git configuration against the kcidb git_repository_url field.
The problem is that kci-dev has no way to know whether the URL was saved with git://, http://, or https://.
In some rare cases the URL could also contain a guest username and password.
I propose standardizing git_repository_url in kcidb on the currently most used protocol, https://, without any authentication. This could easily be done by the result committer, by cleaning the URL before it is sent, or even directly by kcidb, with something similar to what kci-dev is doing in kernelci/kci-dev#76 (comment)

ref PR: kernelci/kci-dev#76
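
To make the proposal concrete, here is a minimal sketch of the kind of cleanup described above: rewrite the scheme to https:// and drop any embedded credentials. The function name `normalize_repo_url` is hypothetical and this is not the code from kernelci/kci-dev#76; it only uses Python's standard `urllib.parse`. It assumes an explicit port, if any, should be kept as-is, which may not be right for a git:// URL with a non-default port.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_repo_url(url: str) -> str:
    """Return `url` rewritten to https:// with any credentials removed.

    Hypothetical helper for illustration; not part of kci-dev or kcidb.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("git", "http", "https"):
        raise ValueError(f"unsupported scheme in {url!r}")
    # `hostname` excludes any "user:password@" prefix (and is lowercased).
    host = parts.hostname or ""
    # Assumption: keep an explicit port unchanged if one was given.
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Query string and fragment are dropped; only scheme, host and path remain.
    return urlunsplit(("https", host, parts.path, "", ""))
```

URLs in scp-like form (for example `git@host:path`) have no recognized scheme and are rejected by this sketch rather than guessed at.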

@spbnick
Collaborator

spbnick commented Jan 13, 2025

Thank you for the report, @aliceinwire!

We standardized the URL quite a while ago: https://github.com/kernelci/kcidb-io/blob/308e85f914b687a5544c41bec7c86a752dec8949/kcidb_io/schema/v04_05.py#L210-L224

Basically, the description there is trying to say: "send us the shortest possible HTTPS URL" (that is, e.g. if credentials are not needed, drop them, use the shortest path, etc.). And "If that's not available, send us the shortest Git URL". That should cover everything (unless we really need unencrypted HTTP). The problem is of course to get everyone to comply correctly.
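
The rule as paraphrased above can be sketched as a small selection function. This is only an illustration of "shortest HTTPS URL, otherwise shortest Git URL"; the name `preferred_repo_url` and the idea of passing in a list of equivalent candidate URLs are assumptions, not something taken from the kcidb-io schema.

```python
from urllib.parse import urlsplit


def preferred_repo_url(candidates):
    """Pick the shortest https:// URL, else the shortest git:// URL.

    `candidates` is a hypothetical list of URLs that all point at the
    same repository.
    """
    for scheme in ("https", "git"):
        matching = [u for u in candidates if urlsplit(u).scheme == scheme]
        if matching:
            return min(matching, key=len)
    raise ValueError("no https:// or git:// URL among candidates")
```

Plain http:// candidates are ignored here, in line with the remark that unencrypted HTTP is not expected to be needed.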

@aliceinwire
Member Author

aliceinwire commented Jan 14, 2025

Does kcidb get any benefit from knowing the repository protocol?
Having some URLs with git:// and some with https:// can be misleading, as most repositories offer both. Why not have kcidb internally sanitize each URL by defaulting to https://?

@spbnick
Collaborator

spbnick commented Jan 14, 2025

I would love to have all of them HTTPS, but we had to add an exception in case it is not available. One of the maintainer repos is only available over git://. KCIDB doesn't have any benefits from knowing the repo protocol, except having a URL which could actually be used.

> Having some URLs with git:// and some with https:// can be misleading, as most repositories offer both.

That's why we have the "Use git://, only if https:// is unavailable" rule. The rules are aimed at producing preferred and unique repo URLs. There's still a discrepancy in the use of a trailing slash, but we can deal with it later.
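
Until the trailing-slash discrepancy is dealt with, a client comparing URLs could ignore it. A minimal sketch, assuming two URLs should count as the same repository when they differ only in trailing slashes (the helper name `same_repo_url` is hypothetical):

```python
def same_repo_url(a: str, b: str) -> bool:
    """Compare two repository URLs, ignoring any trailing slashes."""
    return a.rstrip("/") == b.rstrip("/")
```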

@spbnick
Collaborator

spbnick commented Jan 29, 2025

@aliceinwire, is this a satisfactory approach for you, can we close this?

@aliceinwire
Member Author

There is still no way for kci-dev to know whether the URL is stored as git:// or https://.

@spbnick
Collaborator

spbnick commented Jan 29, 2025

Hmm, I'm not sure what that exactly means and why kci-dev can't get the URL from KCIDB, but I would say if you simply default to https:// then all except one repo will work, AFAIK. And even then, for the past six months we only had HTTPS URLs (as they're the preference): https://kcidb.kernelci.org/d/home/home?from=now-6M&to=now&timezone=browser&var-datasource=edquppk2ghfcwc&var-origin=$__all&viewPanel=panel-4

@aliceinwire
Member Author

aliceinwire commented Jan 30, 2025

kci-dev gets the URL from the git config file and, by default, normalizes whatever link it finds to "https://", with no way of knowing how the link is saved in kcidb.

What is the one exception? I couldn't find it from your link.
Maybe we can just enforce "https://" from now on?

@spbnick
Collaborator

spbnick commented Jan 30, 2025

I think support for this was actually requested by KernelCI, because they had this repository in the configuration: git://git.armlinux.org.uk/~rmk/linux-arm.git, which was last tested on June 15, last year. This repo has only that URL exposed. It might be possible to change that by talking to the owner, Russell King. @broonie might know more.

I think you can safely hardcode this to https:// in kci-dev, at least until Russell finds out about it :D
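
As a variant of hardcoding https://, a client could try the https:// form first and fall back to git:// only when that is what KCIDB has on record. The sketch below is an assumption about how that might look, not kci-dev's actual code: `known_urls` stands for a hypothetical set of git_repository_url values already retrieved from KCIDB, and both function names are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_urls(remote_url):
    """Yield credential-free URLs to try: https:// first, then git://."""
    parts = urlsplit(remote_url)
    host = parts.hostname or ""
    for scheme in ("https", "git"):
        yield urlunsplit((scheme, host, parts.path, "", ""))


def find_kcidb_url(remote_url, known_urls):
    """Return the first candidate present in `known_urls`, or None.

    `known_urls` is a hypothetical set of URLs as stored in KCIDB.
    """
    for url in candidate_urls(remote_url):
        if url in known_urls:
            return url
    return None
```

With this approach the single git://-only repository mentioned above would still match, at the cost of one extra lookup.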

Regarding changing the schema to require https://, we can do that, and I can help you, provided we figure it out with Russell. We would need a major version bump, though, as that's a breaking change. Considering that, we might want to bundle that change with a queue of other breaking changes I was planning. Perhaps we can start a list of those.
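
For discussion, a stricter rule could be expressed as a pattern on the field. The fragment below is a hypothetical JSON Schema snippet written as a Python dict; it is NOT the actual kcidb-io definition, and the exact pattern is an assumption. It accepts only https:// URLs whose host part contains no `@`, which rules out embedded credentials.

```python
import re

# Hypothetical schema fragment for illustration; not the kcidb-io schema.
GIT_REPOSITORY_URL_SCHEMA = {
    "type": "string",
    # https:// + host without "/", "@" or whitespace + "/" + path
    "pattern": r"^https://[^/@\s]+/\S*$",
}


def is_valid_repo_url(value) -> bool:
    """Check `value` against the hypothetical pattern above."""
    if not isinstance(value, str):
        return False
    return re.search(GIT_REPOSITORY_URL_SCHEMA["pattern"], value) is not None
```

Existing git:// submissions would fail this check, which is why such a change would be breaking.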

@broonie
Member

broonie commented Jan 30, 2025

I don't know offhand about rmk's repo or the hosting. I know most people running servers are happier with https:// than git:// these days, since it's a bit friendlier to host, but it does require some setup.
