GitHub tag matching #103
I cannot make the unittest work. In
How and why does this test work? As a user, I would not feel good about getting
Found it. The order in which the unittest sets up its mocks does not match the mock naming.
* fixed mocking order and now mocks old and new implementation
…the result of 'poetry install'
5c9acdb to 4730fa3
Some functionality has been moved to protected methods in order to split the task into smaller, more focused parts. There is only one new public method:

The core of this PR is the lines in https://github.com/sw360/capycli/blob/martin/fix-github-tag-matching/capycli/bom/findsources.py#L278-L290:

While the current approach first fetches all tags that belong to a specific project and then passes the full list to

The most important additions are lines 290 and following. If

The logic to create the candidates is (to some extent) the inverse of

The algo then looks for each candidate in the current result page. If that local lookup does not yield a match, the algo queries the GitHub API and specifically asks whether a tag with the candidate's name exists. If we can find a match through either of these two lookups, we use that match and stop the search.

With my BOMs, I notice a tremendous speedup. On average, the guessing part finds a positive match immediately on the first results page. If it doesn't, the API query is successful. Using my BOMs, the algo never fetches the second page of tags from GitHub.
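To make the two-step lookup easier to discuss, here is a minimal sketch of how I read it. It is not the code from findsources.py: `find_tag`, `page_tags` and `tag_exists_remote` are hypothetical names, and the GitHub API query is replaced by an injected callable so the snippet runs without network access. It also assumes the remote check happens per candidate, right after the local miss.

```python
from typing import Callable, Iterable, Optional


def find_tag(
    candidates: Iterable[str],
    page_tags: Iterable[str],
    tag_exists_remote: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that is found locally or confirmed remotely.

    candidates        -- guessed tag names, most likely guess first
    page_tags         -- tag names on the result page already fetched
    tag_exists_remote -- hypothetical stand-in for "ask the GitHub API
                         whether a tag with exactly this name exists"
    """
    local = set(page_tags)
    for candidate in candidates:
        # Cheap lookup first: is the guess on the page we already have?
        if candidate in local:
            return candidate
        # Otherwise spend one API call on this specific name.
        if tag_exists_remote(candidate):
            return candidate
    # No guess matched; the caller can fall back to scanning all tags.
    return None
```

Returning `None` instead of raising keeps the fallback to the full tag scan in the caller's hands.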
* I dubbed the original implementation verstion_to_github_tag, but on success it would actually return a source url, not a GitHub tag. => rename in allusion to get_matching_tag() which it aims to replace
* moved tag guessing heuristic to its own method _gen_tags()
* introduced TagCache to avoid throwing the same bad guesses at the GitHub API over and over again. It is used transparently in _gen_tags(). This means it is perfectly viable for _gen_tags() to return an empty list.
* also, addressed the mypy shenanigans
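For reviewers who want a mental model of the cache idea, a rough sketch follows. It only illustrates "remember bad guesses and filter them out"; the class name `TagCacheSketch` and its methods `add_failed` and `filter` are assumptions for this example and do not mirror the actual TagCache in the PR.

```python
from typing import Dict, List, Set


class TagCacheSketch:
    """Remembers, per project, which guessed tag names are known to be wrong."""

    def __init__(self) -> None:
        # project identifier -> set of tag names that did not exist
        self._failed: Dict[str, Set[str]] = {}

    def add_failed(self, project: str, tag: str) -> None:
        """Record a guess that the remote lookup rejected."""
        self._failed.setdefault(project, set()).add(tag)

    def filter(self, project: str, candidates: List[str]) -> List[str]:
        """Drop candidates already known to be wrong, keeping their order."""
        failed = self._failed.get(project, set())
        return [c for c in candidates if c not in failed]
```

Once every guess for a project has failed, `filter` returns an empty list, which matches the remark that an empty result from the guessing step is a valid outcome.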
9116a54 to 5dd3cbb
This PR addresses #99 and introduces code intended to replace the current combination of get_github_info() followed by get_matching_tag() which exists only in capycli.bom.findsources.
This approach first tries to match a tag using the original get_matching_tag(). If all the guessing does not yield any results, the algo implicitly falls back to analyzing each tag with get_matching_tag().