Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split ParticleID collection into smaller collections #131

Merged
merged 3 commits into from
Oct 23, 2024

Conversation

tmadlener
Copy link
Contributor

@tmadlener tmadlener commented Oct 4, 2024

BEGINRELEASENOTES

  • Split the global ParticleIDs collection into several smaller collections to facilitate downstream usage after the reversal of the ParticleID - ReconstructedParticle relation direction in EDM4hep#268.
    • Tracks will get a ParticleIDCollection with the suffix _PID
    • Jets will get a ParticleIDCollection with the suffix _HF_tags for any heavy flavor tags, and _tau_tags for any tau tags.

ENDRELEASENOTES

Having one large ParticleID collection made sense historically before EDM4hep reversed the direction of the ReconstructedParticle -> ParticleID relation. Since now ParticleIDs point to the ReconstructedParticles it's easier to handle downstream (and especially in FCCAnalyses) if there are several smaller and more specific ParticleID collections for each of the input collections that are converted.

@bistapf since we are at it should we also think about splitting the tags for the jets into separate collections?

@tmadlener
Copy link
Contributor Author

@kjvbrt @BrieucF as you will probably have inferred from HEP-FCC/FCCAnalyses#411 this will have an effect there as well. I think it should make things a bit easier on the function implementation side, because there is less index hopping involved now. On the other hand users will now have to deal with potentially multiple ParticleID collections.

Also just as a reminder: Most of the functions that deal with ParticleID in FCCAnalyses will have to be changed in any case since key4hep/EDM4hep#268 changed the direction of the relation.

@tmadlener tmadlener force-pushed the split-pid-collections branch from 183a804 to 1bacea7 Compare October 15, 2024 12:52
@bistapf
Copy link
Contributor

bistapf commented Oct 15, 2024

As far as I'm aware, the only code for reading the jet tags from Delphes is in the FCC-hh analyzer that I have adapted in HEP-FCC/FCCAnalyses#411. If this is not true and I have missed something, please let me know in the PR and I will adapt the code.

For the track PIDs I'm not sure if they are used, but I'm happy to adapt code as needed there as well (on FCCAnalyses side).

@kjvbrt
Copy link
Contributor

kjvbrt commented Oct 15, 2024

I like the idea, this will also help with how many PIDs will need to be searched, right?
Would the splitting be advantageous also in the case the analysis will use datasource and the PIDHandler?

@tmadlener
Copy link
Contributor Author

Yes, it should make working with PIDs in general (also outside of FCCAnalyses) easier because you don't have to do some specific filtering according to e.g. the type or by checking which reconstructed particle is attached to it.

Would the splitting be advantageous also in the case the analysis will use datasource and the PIDHandler?

The PIDHandler can deal with one large PID collection, or several smaller ones. It doesn't really care where the ParticleID objects come from. However, the internal maps will obviously be smaller if it is constructed from smaller PID collections, which would at least make its memory footprint smaller.

tmadlener and others added 3 commits October 15, 2024 19:03
Having one large ParticleID collection made sense historically before
EDM4hep reversed the direction of the ReconstructedParticle ->
ParticleID relation. Since now ParticleIDs point to the
ReconstructedParticles it's easier to handle downstream (and especially
in FCCAnalyses) if there are several smaller and more specific
ParticleID collections for each of the input collections that are
converted.
@tmadlener tmadlener force-pushed the split-pid-collections branch from 1bacea7 to f45a960 Compare October 15, 2024 17:03
@bistapf
Copy link
Contributor

bistapf commented Oct 23, 2024

We discussed this in the FCCAnalyses chat today with @kjvbrt and @jeyserma and it seems there is no (officially supported) FCCAnalyses code relying on this.
Also since the big inclusiveParticleIDCollection is replaced by the smaller, split ones with different names, it should be immediately obvious that there is an issue to be fixed, in case there is any code relying on this collection after all.
So this should be good to go.

@tmadlener tmadlener merged commit 9e3ea08 into key4hep:main Oct 23, 2024
7 checks passed
@tmadlener tmadlener deleted the split-pid-collections branch October 23, 2024 15:17
@kjvbrt
Copy link
Contributor

kjvbrt commented Oct 23, 2024

There is function ReconstructedParticle::getJet_btag and then the ones from @bistapf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants