Multigpu hcal #498
… is relinquishable
@vkhristenko why do we move the |
@fwyzard |
@fwyzard pulseOffsets is an algorithm configuration parameter. We use it to decide whether to fit with 1, 3 or 8 templates. My understanding is that the GPU fit is not configurable, so in principle there is no need for all these gymnastics.
This is actually not the right way to think about it. This array holds the distances from the sample of interest, and that can be configurable! For instance, if the sample of interest goes from 4 to 3, for whatever reason, all you need to change is the contents of this array, not its length.
Regarding the length: true, it is a static parameter in there, but that can be changed in principle (removing Eigen even further and using more mapping kind of stuff).
Furthermore, this does not solve a problem: I will need the same for ECAL. Whenever you want an array of parameters to be allocated/handled generically for the GPU, I found that going through an ES producer is the easiest way to make this generic, although you have to write a bunch of code. I'm happy to change that if there are suggestions for what I should use instead.
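To illustrate the point about the offsets being relative to the sample of interest (SOI), here is a minimal C++ sketch. The helper name `absoluteSamples` and the offset values used with it are assumptions made up for illustration, not taken from the HCAL code; it also assumes that the absolute sample index is simply the SOI plus the offset.

```cpp
#include <vector>

// Hypothetical helper, not CMSSW code: converts offsets given relative to the
// sample of interest (SOI) into absolute sample indices within the frame.
std::vector<int> absoluteSamples(std::vector<int> const& pulseOffsets, int soi) {
  std::vector<int> samples;
  samples.reserve(pulseOffsets.size());
  for (int offset : pulseOffsets) {
    samples.push_back(soi + offset);  // assumed convention: index = SOI + offset
  }
  return samples;
}
```

With an 8-sample frame, the offsets -4 to 3 at SOI 4 and the offsets -3 to 4 at SOI 3 both map to samples 0 through 7: the array keeps the same length and only its contents change.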
If it's not configurable but constant, I guess a simple approach is to use a |
The problem with the per-event one is how to make sure you have already transferred things
for that device... I guess I could keep that info somehow.
Previously I was advised to use the same mechanism as for conditions...
Constant or not, I can put those arrays into constant memory, and then people
will ask “it would be nice to have this configurable”... typically it
goes this way.
VK
…On Mon, 6 Jul 2020 at 17:02, Andrea Bocci ***@***.***> wrote:
> If it's not configurable but constant, I guess a simple approach is to use
> a __constant__ C array on the GPU.
> If it's configurable, the simplest approach could be to copy it to the GPU
> on event (using the caching allocator, the stream from the context, etc.)
> @makortel <https://github.com/makortel>, do you have other suggestions
> for this ?
|
On Mon, 6 Jul 2020, 18:33 Viktor Khristenko, ***@***.***> wrote:
> The problem with event one is how make sure u already transferred things
> for that device... I guess I could keep that info somehow
What I meant was to transfer them for every event.
> Previously I was advised to use the same mechanism as for conditions...
OK, let's keep it like this.
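The bookkeeping question raised above (how to know that the data has already been transferred for a given device) can be sketched on the host side as a per-device cache that runs the transfer only on first use. This is a simplified illustration only: `PerDeviceCache` is a hypothetical name, it is not the CMSSW helper behind `dataForCurrentDeviceAsync`, and it ignores the asynchronous completion of the copy that real GPU code has to track.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a per-device cache; NOT the CMSSW implementation.
template <typename Product>
class PerDeviceCache {
public:
  explicit PerDeviceCache(std::size_t nDevices) : entries_(nDevices) {}

  // Returns the product for `device`, calling `transfer` only on the first
  // request for that device. Throws std::out_of_range for an invalid device.
  template <typename Transfer>
  Product const& get(std::size_t device, Transfer&& transfer) {
    Entry& entry = entries_.at(device);
    std::lock_guard<std::mutex> lock(entry.mutex);
    if (!entry.filled) {
      transfer(entry.product);  // real code would allocate and copy to the device here
      entry.filled = true;
    }
    return entry.product;
  }

private:
  struct Entry {
    std::mutex mutex;     // protects concurrent first use of the same device
    bool filled = false;  // true once the transfer has been issued
    Product product{};
  };
  std::vector<Entry> entries_;
};
```

Calling `get` twice for device 0 runs the transfer once; a first call for device 1 runs it again for that device. This is the reason treating configurable "constants" like conditions avoids a copy on every event.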
|
The SOI is stored in the Frame and actually has a dedicated condition for it. The pulseOffsets array is a parameter of the multifit templates, not of the HCAL frame, so it's not a detector condition.
|
(I still need to look the code but) in general configurable "constants" are currently most effectively treated as conditions data to avoid transferring them on each event. In principle transferring them from the EDProducer is possible as well, e.g. at |
Overall looks good to me.
      edm::ValidityInterval&) override;

private:
  edm::ParameterSet const& pset_;
Storing the std::vector<int> would be better.
@@ -239,6 +215,10 @@ void HBHERecHitProducerGPU::acquire(edm::Event const& event,
  setup.get<HcalSiPMCharacteristicsRcd>().get(sipmCharacteristicsHandle);
  auto const& sipmCharacteristicsProduct = sipmCharacteristicsHandle->getProduct(ctx.stream());

  edm::ESHandle<HcalMahiPulseOffsetsGPU> pulseOffsetsHandle;
  setup.get<HcalMahiPulseOffsetsGPURecord>().get(pulseOffsetsHandle);
  auto const& pulseOffsetsProduct = pulseOffsetsHandle->getProduct(ctx.stream());
Just to note that eventually these need to be migrated to ESGetToken (but probably a separate PR is better)
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideHowToGetDataFromES#In_ED_module
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideHowToGetDataFromES#Getting_data_from_EventSetup_wit
I guess we need to do it once we move to CMSSW_11_2_X ?
> I guess we need to do it once we move to CMSSW_11_2_X ?
The "new code must use ESGetToken" is rather a policy than a technical requirement (so "need" is before making a PR to CMSSW master). Technically the migration can be done at any time, the ESGetToken API has been there already for over a year.
OK - I was wondering if it is technically enforced in 11.2.x.
If it isn't "before making the PR for master" (or rather "while the PR for master is being reviewed") seems like a good moment :-)
It's not enforced (yet) for EDModules because of ~5500 existing calls that need to be migrated first :)
HcalMahiPulseOffsetsGPU::HcalMahiPulseOffsetsGPU(edm::ParameterSet const& ps)
{
  auto const& values = ps.getParameter<std::vector<int>>("pulseOffsets");
  values_.resize(values.size());
  std::copy(values.begin(), values.end(), values_.begin());
}
Suggested change:
HcalMahiPulseOffsetsGPU::HcalMahiPulseOffsetsGPU(edm::ParameterSet const& ps) :
  values_(ps.getParameter<std::vector<int>>("pulseOffsets"))
{
}
But I do agree with @makortel, it would be better to pass the std::vector<int> directly rather than the edm::ParameterSet.
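As a rough sketch of what passing the std::vector<int> directly could look like, the class below takes the values in its constructor and never sees the edm::ParameterSet. `PulseOffsetsHost` is a hypothetical, host-only stand-in written for this illustration; it is not the actual HcalMahiPulseOffsetsGPU class and leaves out everything related to the GPU copy.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for HcalMahiPulseOffsetsGPU (host side only).
class PulseOffsetsHost {
public:
  // Takes the values by value and moves them into place, so the class does not
  // depend on edm::ParameterSet and cannot end up holding a dangling reference to one.
  explicit PulseOffsetsHost(std::vector<int> values) : values_(std::move(values)) {}

  std::vector<int> const& values() const { return values_; }

private:
  std::vector<int> values_;
};
```

The caller would then read the parameter once, for example with `ps.getParameter<std::vector<int>>("pulseOffsets")` as in the diff above, and hand the resulting vector to the constructor.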
auto const& product =
    product_.dataForCurrentDeviceAsync(cudaStream, [this](HcalMahiPulseOffsetsGPU::Product& product, cudaStream_t cudaStream) {
      // malloc
      cudaCheck(cudaMalloc((void**)&product.values, this->values_.size() * sizeof(int)));
@makortel is there any reason we shouldn't use the caching allocator here as well ?
The current caching allocator API does not suit the ESProduct case well, because the lifetime of the memory is not related to the processing queued to the argument cudaStream_t.
I do have an idea (old prototype if I still manage to find it) on how to improve on that on top of #412, although given #487 I'd do it a bit differently.
Maybe I should interpret the silence on the RFCs as "no strong objections" and just go ahead with further prototyping.
I'll re-read it to remind myself about it once 11.2.0-pre2 Patatrack is out...
superseded by #502
PR description:
This is to allow HCAL to run on a node with multiple GPUs.
All the modules have been updated for that, and now basically no protection for the CUDA service is needed.
The only open point in this PR is that the newly added condition's Record should eventually be moved to CondFormats/DataRecord. For now it sits inside the HCAL MAHI package.
PR validation:
As usual, using the provided executable for HCAL.