feat: Hashing seeding algorithm #3148
A few early general comments
Examples/Io/Csv/include/ActsExamples/Io/Csv/CsvBucketWriter.hpp
Jeremy pinged me privately on it, but I thought it would make more sense to answer here, publicly. What you forgot about updating / extending is this file: https://github.com/acts-project/acts/blob/main/cmake/ActsConfig.cmake.in You are now introducing a new external, which Since in this PR's setup Annoy would always be needed, you should add find_dependency(Annoy) without any Actually, the logic of only calling But that's a separate issue. For this PR, just add |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3148      +/-   ##
==========================================
- Coverage   47.66%   47.43%   -0.23%
==========================================
  Files         509      511       +2
  Lines       29425    30053     +628
  Branches    14131    14561     +430
==========================================
+ Hits        14026    14257     +231
- Misses       5285     5326      +41
- Partials    10114    10470     +356
Quality Gate passed
Adds hashing to the seeding algorithm.
Instead of running the seeding on all space points at once, this approach creates small groups of space points, called buckets, and runs the standard seeding on each bucket independently.
Because the buckets overlap, the same seed may be reconstructed several times (once in each bucket that contains it). A set container of seeds is used to handle this duplication naturally.
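The bucket-then-deduplicate idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `SpacePointId`, `Seed`, `Bucket`, `toySeeding` and `seedPerBucket` are hypothetical names, and the toy seed finder simply emits every triplet of space points instead of applying real seeding cuts.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-ins (not ACTS types): a space point is an index,
// a seed is a triplet of space point indices, a bucket is a list of indices.
using SpacePointId = std::size_t;
using Seed = std::array<SpacePointId, 3>;
using Bucket = std::vector<SpacePointId>;

// Toy replacement for the standard seeding: every triplet of space points
// in the input becomes a seed. Each triplet is sorted so that the same
// seed found in two different buckets compares equal.
std::vector<Seed> toySeeding(const Bucket& spacePoints) {
  std::vector<Seed> seeds;
  const std::size_t n = spacePoints.size();
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = i + 1; j < n; ++j) {
      for (std::size_t k = j + 1; k < n; ++k) {
        Seed seed{spacePoints[i], spacePoints[j], spacePoints[k]};
        std::sort(seed.begin(), seed.end());
        seeds.push_back(seed);
      }
    }
  }
  return seeds;
}

// Run the seeding on each bucket independently and collect the results in
// a std::set, which silently drops seeds already found in another bucket.
std::set<Seed> seedPerBucket(const std::vector<Bucket>& buckets) {
  std::set<Seed> uniqueSeeds;
  for (const Bucket& bucket : buckets) {
    for (const Seed& seed : toySeeding(bucket)) {
      uniqueSeeds.insert(seed);
    }
  }
  return uniqueSeeds;
}
```

With two overlapping buckets such as {0, 1, 2, 3} and {1, 2, 3, 4}, the triplet (1, 2, 3) is found in both buckets but stored only once. With a single bucket holding all space points, the set contains exactly the seeds the toy seeding returns for the full event.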
The SeedingAlgorithmHashing example provided can be seen as a generalization of the standard SeedingAlgorithm. A single bucket containing all space points of the event is expected to give exactly the same seeds as the SeedingAlgorithm on that event, since SeedingAlgorithmHashing behaves like a wrapper around the standard algorithm.
This approach mitigates the filtering of good seeds caused by the upper limit on the number of seeds sharing the same middle space point (the maxSeedsPerSpM parameter).
More details are in the poster presented at CTD 2023.
A third-party library called Annoy is used to create the buckets. Some modifications have been made to its code with respect to the official repository linked previously.