Clarify initial states and PDFs used #135

Open
cschwan opened this issue Apr 1, 2022 · 14 comments

@cschwan
Contributor

cschwan commented Apr 1, 2022

PineAPPL's metadata contains the keys initial_state_1 and initial_state_2, which are the PDG Monte Carlo IDs of the PDFs that the grid must be convolved with.

However, I now realise that the name is confusing, because the PDFs do not necessarily agree with the actual hadronic initial states. For instance, when we use processes which collide lead nuclei with protons, but we want to fit only the proton, yadism will convert the lead nuclei using isospin symmetry to an 'effective proton in lead', so that initial_state_1 = 2212 and initial_state_2 = 2212.

I therefore suggest that we replace initial_state_1 and initial_state_2 with pdf1 and pdf2, and add the following additional keys:

  • in1: the actual initial state 1 of the collision as a PDG MC ID
  • in2: same for the initial state 2

If in1 and pdf1 and/or in2 and pdf2 are different, this means that the situation above applies and that inside the grid we make assumptions about the 'nuclear model'; for this we should document the mass number (A) and number of protons (Z), for instance in the following way:

  • nuclear_model_1: A=2,Z=1

for deuterons.

Furthermore, if we have leptons or other non-hadronic particles in the initial state, the corresponding pdf1/pdf2 should be present but empty.
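
To make the proposal concrete, here is a minimal sketch of what the metadata could look like for a proton-lead grid fitted as an 'effective proton'. The key names follow the proposal above and are not an existing PineAPPL interface; the helper `needs_nuclear_model`, the choice of lead as the second initial state, and the `A=...,Z=...` string are illustrative assumptions.

```python
# Hypothetical metadata for a proton-lead grid; these keys are only proposed
# in this issue and are not part of PineAPPL's current metadata.
PROTON = 2212
LEAD_208 = 1000822080  # PDG nuclear code 100ZZZAAAI with Z=82, A=208

metadata = {
    "pdf1": str(PROTON),     # PDF the grid is convolved with (first slot)
    "pdf2": str(PROTON),     # 'effective proton in lead' (second slot)
    "in1": str(PROTON),      # actual initial state 1
    "in2": str(LEAD_208),    # actual initial state 2 (assumed to be the lead beam)
    "nuclear_model_2": "A=208,Z=82",
}


def needs_nuclear_model(meta, index):
    """Return True if the PDF differs from the actual initial state."""
    return meta[f"pdf{index}"] != meta[f"in{index}"]


for index in (1, 2):
    print(index, needs_nuclear_model(metadata, index))
```

With this layout a consumer only has to compare `pdfN` with `inN` to know whether a `nuclear_model_N` entry is expected.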

@cschwan cschwan added the enhancement New feature or request label Apr 1, 2022
@cschwan cschwan self-assigned this Apr 1, 2022
@cschwan
Contributor Author

cschwan commented Apr 1, 2022

@alecandido
Member

That's a perfect solution for me.

The only additional proposal is that, since PineAPPL grid metadata are always strings, I would make the nuclear_model_x value valid JSON, for parsing simplicity. E.g., for the deuteron:

nuclear_model_1: {"A": 2, "Z": 1}
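
As a rough sketch of why JSON helps, the value can be read back with the standard library alone. The function name `parse_nuclear_model` and the sanity checks are assumptions made for illustration; the example uses lead-208.

```python
import json


def parse_nuclear_model(value):
    """Parse a JSON-encoded nuclear model string into (A, Z).

    Hypothetical helper: the key names "A" and "Z" follow the suggestion in
    this thread, the validation rules are an assumption.
    """
    model = json.loads(value)
    a, z = model["A"], model["Z"]
    if not (isinstance(a, int) and isinstance(z, int)):
        raise ValueError("A and Z must be integers")
    if a < 1 or z < 0 or z > a:
        raise ValueError(f"unphysical nucleus: A={a}, Z={z}")
    return a, z


# Metadata values are strings, so the JSON document is stored as text.
a, z = parse_nuclear_model('{"A": 208, "Z": 82}')
print(a, z)
```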

@Radonirinaunimi
Member

Thanks a lot @cschwan for this. This proposal is also perfect for me (it would make my life infinitely easier).

> The only additional proposal is that, since PineAPPL grid metadata are always strings, I would make the nuclear_model_x value valid JSON, for parsing simplicity. E.g., for the deuteron:
>
> nuclear_model_1: {"A": 2, "Z": 1}

I also very much like this way of representing the metadata, which would make validphys very happy too.

@cschwan
Contributor Author

cschwan commented Apr 1, 2022

@alecandido @Radonirinaunimi : yes, let's do that!

@scarlehoff
Member

For the time being (to get the ball rolling) I will write a script to burn the metadata in a posteriori. Instead of doing it as in the PR NNPDF/nnpdf#1632 (where the information is put in the fit runcard), the relevant theory will be modified to contain the metadata as discussed in this issue.

That way when this is implemented in pineappl (issue #118?) the number of changes in vp will be minimal (maybe the way in which the information is retrieved is changed, but nothing beyond that).

@cschwan
Contributor Author

cschwan commented Mar 2, 2023

Another problem that we should keep in mind is that

  • real protons and
  • 'protons as the average nucleus' in nuclei

both unfortunately have the same PDG number, and that leads to potential problems in Grid::optimize; it assumes that all protons are equal, and that isn't the case here, clearly. @Radonirinaunimi this might be a problem that you've already stumbled over.

@Radonirinaunimi
Member

> Another problem that we should keep in mind is that
>
> * real protons and
> * 'protons as the average nucleus' in nuclei
>
> both unfortunately have the same PDG number, and that leads to potential problems in Grid::optimize; it assumes that all protons are equal, and that isn't the case here, clearly. @Radonirinaunimi this might be a problem that you've already stumbled over.

@cschwan Is this a problem at the level of how the partonic bits are stored, or at the level of the convolution? I am not sure whether the following applies here, but the way I've dealt with the two scenarios so far is to always generate grids for real/free protons and account for the isospin asymmetry later.

@cschwan
Contributor Author

cschwan commented Mar 2, 2023

Let's say you have a proton-lead collision, and generate your grid using initial_state_1 and initial_state_2 set to 2212. In that case you should generate a grid where, for instance, u u~ is treated differently from u~ u because both quarks come from different hadrons. This becomes a problem when you optimize the grid because PineAPPL sees that the initial-state PDG IDs are both 2212, and therefore symmetrizes by merging u~ u into u u~. However, this is wrong, because the two 'protons' aren't the same. This could for instance mean that for DY all quarks come from the first hadron, and all anti-quarks from the second hadron. If the two hadrons aren't actually the same you'll get wrong numbers.

Practically you can check this by doing your analyses with your default grid and one where you make sure that it's not optimized.
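
The effect can be illustrated with a toy model. This is only a sketch: the channel weights and PDF values are made up, the x-dependence is ignored, and `symmetrize` is a stand-in for the merging described above, not the actual code in Grid::optimize.

```python
import math

# Toy grid with two channels; keys are (parton from hadron 1, parton from
# hadron 2) with 2 = u and -2 = u~. The weights are invented numbers.
channels = {(2, -2): 3.0, (-2, 2): 1.0}


def symmetrize(chans):
    """Merge (b, a) into (a, b); only valid if both hadrons are identical."""
    merged = {}
    for (a, b), weight in chans.items():
        key = (a, b) if a >= b else (b, a)
        merged[key] = merged.get(key, 0.0) + weight
    return merged


def convolve(chans, f1, f2):
    """Sum weight * f1(a) * f2(b) over all channels (x-dependence dropped)."""
    return sum(w * f1[a] * f2[b] for (a, b), w in chans.items())


proton = {2: 0.6, -2: 0.1}  # made-up free-proton values
bound = {2: 0.4, -2: 0.2}   # made-up 'effective proton in lead' values

identical = (convolve(channels, proton, proton),
             convolve(symmetrize(channels), proton, proton))
different = (convolve(channels, proton, bound),
             convolve(symmetrize(channels), proton, bound))

print("identical hadrons agree:", math.isclose(*identical))
print("different hadrons agree:", math.isclose(*different))
```

With identical PDFs on both sides the merged grid reproduces the original result; as soon as the second 'proton' differs, the two results disagree.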

@Radonirinaunimi
Member

That's right! But there is actually a way around this, which is the procedure that nNNPDF adopted in its previous releases. That is, the grids are always generated using $ep$ or $pp$, and to get to $eA$ or $pA$ one convolves the grids with:

$$ f^A(x) = Z f^{p/A}(x) + (A - Z) f^{n/A}(x)$$

with $f^{p/A}(x)$ and $f^{n/A}(x)$ denoting the bound-proton and bound-neutron PDFs respectively. Doing so of course assumes that all the nuclear datasets are corrected for isoscalarity even if $A \neq 2Z$.
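
In code, the combination above could be sketched as follows. The function `nuclear_pdf` and the toy shapes for the bound-proton and bound-neutron PDFs are assumptions for illustration only; a real setup would take them from a PDF set.

```python
def nuclear_pdf(f_p, f_n, A, Z):
    """Build f^A(x) = Z f^{p/A}(x) + (A - Z) f^{n/A}(x) from two callables."""
    def f_A(x):
        return Z * f_p(x) + (A - Z) * f_n(x)
    return f_A


# Made-up shapes for one flavour, not taken from any PDF set.
def bound_proton(x):
    return 2.0 * (1.0 - x)


def bound_neutron(x):
    return 1.0 - x


# Lead-208: A = 208, Z = 82, so 126 neutrons.
f_lead = nuclear_pdf(bound_proton, bound_neutron, A=208, Z=82)
print(f_lead(0.5))
```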

@cschwan
Contributor Author

cschwan commented Mar 3, 2023

@Radonirinaunimi if it's a problem, it would only be one for $ p A $ generated using $ p p $. In that case you probably shouldn't optimize.

@alecandido
Member

> @Radonirinaunimi if it's a problem, it would only be one for $ p A $ generated using $ p p $. In that case you probably shouldn't optimize.

However, this would be a problem with the Pineline, since at some point optimize() is called (in Pineko I believe, while in Pinefarm it should be up to the selected external implementation).

@cschwan
Contributor Author

cschwan commented Mar 3, 2023

I agree. I think we should start investigating the size of the problem.

@felixhekhorn
Contributor

Just to echo the discussion from #265 and to summarize the situation: we need to replace initial_state_1 with a more sophisticated structure, which states:

  • whether the hadron is space-like or time-like, i.e. in the initial state or final state
  • whether the hadron is linearly polarized
  • if and how nuclear corrections are taken into account
  • and of course, we still need the PID

@cschwan
Contributor Author

cschwan commented Jun 2, 2024

This has been mostly implemented in #287. For general nuclei we could add a new type in Convolution that specifies A and N.
