RFC: analysis result caching #5186

Closed
fviernau opened this issue Mar 24, 2022 · 6 comments
Labels
analyzer About the analyzer tool

Comments

@fviernau
Member

fviernau commented Mar 24, 2022

The result of the analysis can change for the following reasons:

  1. A first level dependency has been added / removed / changed
  2. The version constraints resolve differently
    • versions not fixed, new release of a (transitive) dependency
    • tooling update, change in the heuristic to resolve versions
  3. The ordered list of artifact repositories has changed

So, dependency trees may change between two analyzer runs for the exact same source tree.
To speed up the average analysis duration (for CI/CD), the analysis result could be cached.
It therefore seems like 1. and 3. could be used as the cache key. Roughly speaking:
if the first level dependencies and repositories didn't change, then use the result from the cache,
provided the entry's age doesn't exceed a configured max age.
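
The proposal above could be sketched roughly as follows. This is only an illustration in Python, not code from the analyzer: the names `cache_key` and `AnalysisCache`, the in-memory dictionary, and the representation of dependencies and repositories as plain strings are all assumptions made for the example. Direct dependencies are hashed order-insensitively, while the repository list keeps its order because reason 3 is about the ordered list.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


def cache_key(direct_dependencies: List[str], repositories: List[str]) -> str:
    """Derive a key from reasons 1 and 3: direct dependencies and ordered repositories."""
    payload = json.dumps(
        {
            # The order of declared dependencies is assumed not to matter, so sort them.
            "dependencies": sorted(direct_dependencies),
            # The order of repositories matters, so keep it as given.
            "repositories": list(repositories),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class AnalysisCache:
    """Hypothetical in-memory cache that rejects entries older than max_age_seconds."""

    max_age_seconds: float
    _entries: Dict[str, Tuple[float, dict]] = field(default_factory=dict)

    def put(self, key: str, result: dict, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._entries[key] = (stored_at, result)

    def get(self, key: str, now: Optional[float] = None) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        current = time.time() if now is None else now
        # An entry that is too old is treated as a miss, which bounds the effect of reason 2.
        if current - stored_at > self.max_age_seconds:
            return None
        return result
```

Note that the max age only limits how long a stale result can be served; it does not detect a changed version resolution (reason 2) within that window.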

@sschuberth
Member

If first level dependencies and repositories didn't change, then use the result from the cache

How does that guard against your 2a) case?

Also, for analyzers that use CLI tools, a different version of that tool might have an effect on the version resolution.

@sschuberth sschuberth added analyzer About the analyzer tool new feature labels Mar 24, 2022
@fviernau
Member Author

How does that guard against your 2a) case?

It doesn't. My thoughts were: when reviewing compliance, you need to define how old your analyzer result can be at most.
I guessed that one would say something like: "if (direct) dependencies didn't change, then the analysis can be X amount of time old". I proposed to then translate X into the max cache age, and points #1 and #2 into the cache key. This idea could be too specific, and a more generic approach could be needed.

Also, for analyzers that use CLI tools, a different version of that tool might have an effect on the version resolution.

Right, would it fit into 2.b?

@sschuberth
Member

Also, for analyzers that use CLI tools, a different version of that tool might have an effect on the version resolution.

Right, would it fit into 2.b?

Yes.

@sschuberth
Member

I generally like the idea of analyzer result caching, but I wonder whether we should limit ourselves to simple cases first, e.g. cases where a lockfile is present, and simply use the hash of the lockfile as the key for cache lookup.
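
A minimal sketch of that simpler variant, in Python for illustration only: `lockfile_cache_key` is a hypothetical helper, and the choice of SHA-256 over the raw file bytes is an assumption, not something decided in this thread. When no lockfile is present, the function returns `None`, meaning the result would not be cached at all.

```python
import hashlib
from pathlib import Path
from typing import Optional


def lockfile_cache_key(lockfile: Path) -> Optional[str]:
    """Return the SHA-256 hex digest of the lockfile, or None if there is no lockfile."""
    if not lockfile.is_file():
        # Without a lockfile the resolution is not pinned, so skip caching.
        return None
    digest = hashlib.sha256()
    with lockfile.open("rb") as stream:
        # Read in chunks so that large lockfiles are not loaded into memory at once.
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because a lockfile pins transitive versions, any change in the resolved versions changes the file and therefore the key, which is what makes this case simpler than the one without a lockfile.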

@sschuberth
Member

Maybe another option could be to look in the direction of #8361.

@sschuberth
Member

Closed as part of backlog grooming. Feel free to comment if you would like to contribute to this.

@sschuberth closed this as not planned Jul 1, 2024