SYCL Track Finding, main branch (2024.11.14.) #773

Closed
4 changes: 2 additions & 2 deletions core/include/traccc/edm/measurement.hpp
Original file line number Diff line number Diff line change
@@ -88,14 +88,14 @@ inline bool operator==(const measurement& lhs, const measurement& rhs) {
/// Comparator based on detray barcode value
struct measurement_sort_comp {
TRACCC_HOST_DEVICE
- bool operator()(const measurement& lhs, const measurement& rhs) {
+ bool operator()(const measurement& lhs, const measurement& rhs) const {
return lhs.surface_link < rhs.surface_link;
}
};

struct measurement_equal_comp {
TRACCC_HOST_DEVICE
- bool operator()(const measurement& lhs, const measurement& rhs) {
+ bool operator()(const measurement& lhs, const measurement& rhs) const {
return lhs.surface_link == rhs.surface_link;
}
};
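
A short sketch of why the added `const` matters, using a hypothetical `hit` type in place of `traccc::measurement`. A comparator held as a const object, or called through a const reference, can only be invoked when its `operator()` is const-qualified. The names below are illustrative and not part of traccc.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for traccc::measurement; only the sort key is kept.
struct hit {
    unsigned int surface_link;
};

// Comparator with a const-qualified call operator, as in the change above.
struct hit_sort_comp {
    bool operator()(const hit& lhs, const hit& rhs) const {
        return lhs.surface_link < rhs.surface_link;
    }
};

// The same comparator without const, for contrast.
struct hit_sort_comp_nonconst {
    bool operator()(const hit& lhs, const hit& rhs) {
        return lhs.surface_link < rhs.surface_link;
    }
};

// A const comparator object is callable only if operator() is const.
static_assert(
    std::is_invocable_v<const hit_sort_comp&, const hit&, const hit&>);
static_assert(!std::is_invocable_v<const hit_sort_comp_nonconst&, const hit&,
                                   const hit&>);

// Sorts hits by surface link using a const comparator object.
inline std::vector<hit> sorted_by_surface(std::vector<hit> hits) {
    const hit_sort_comp comp{};
    std::sort(hits.begin(), hits.end(), comp);
    return hits;
}
```

Calling `sorted_by_surface` on an unsorted vector returns the hits ordered by `surface_link`.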
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -25,7 +25,7 @@ struct apply_interaction_payload {
/**
* @brief Total number of input parameters (including non-live ones)
*/
- const int n_params;
+ const unsigned int n_params;

/**
* @brief View object to the vector of bound track parameters
@@ -36,7 +36,7 @@ struct apply_interaction_payload {
* @brief View object to the vector of boolean-like integers describing
* whether each parameter is live. Has the same size as \ref params_view
*/
- vecmem::data::vector_view<const unsigned int> params_liveness_view;
Member

Is there a reason for this change? CUDA doesn't natively support 8-bit loads so I'm a bit worried about the performance implications of this. Also is there a reason to use char and not unsigned char?

Member Author

We use these values as bools. Using 32 bits where we only need 1 seems very silly.

Of course bool doesn't work. 😦 Our convention in the offline code is to use char when we need "boolean information" but bool can't be used.

Also, take this into account: https://github.com/acts-project/traccc/blob/main/device/cuda/src/finding/finding_algorithm.cu#L154

What do you think is actually getting set for this buffer with that operation? 😏 Because it's not 0x1 values in the unsigned int variables...

In any case, I can't see why we shouldn't go for this. Even if NVIDIA always copies at least 16 bits, right now we move 32 bits in all cases. Even though we only need 1. If some of the loads are next to each other, this could still win us a little bit.

Member

> What do you think is actually getting set for this buffer with that operation? 😏 Because it's not 0x1 values in the unsigned int variables...

It's setting a non-zero value; I don't see the problem?
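
To make the exchange above concrete: a byte-wise fill writes the same value into every byte, so 32-bit flags end up as 0x01010101 while 8-bit flags end up as exactly 1. The sketch below assumes the buffer is filled with the byte value 1, which is inferred from the comments rather than shown in this diff. It runs on the host with `std::memset` and uses hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumption for this sketch: unsigned int is 4 bytes wide.
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

// Fill 32-bit flags byte by byte, as a memset does.
// Every byte becomes 0x01, so each element reads back as 0x01010101.
inline std::vector<unsigned int> memset_uint_flags(std::size_t n) {
    std::vector<unsigned int> flags(n, 0u);
    if (!flags.empty()) {
        std::memset(flags.data(), 1, flags.size() * sizeof(unsigned int));
    }
    return flags;
}

// Fill 8-bit flags the same way; each element reads back as exactly 1.
inline std::vector<char> memset_char_flags(std::size_t n) {
    std::vector<char> flags(n, 0);
    if (!flags.empty()) {
        std::memset(flags.data(), 1, flags.size() * sizeof(char));
    }
    return flags;
}
```

Both fills pass a `!= 0` liveness check, which is why the existing logic worked. Only a comparison against exactly 1 would tell them apart.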

Member

@stephenswat Nov 15, 2024

> In any case, I can't see why we shouldn't go for this. Even if NVIDIA always copies at least 16 bits, right now we move 32 bits in all cases. Even though we only need 1. If some of the loads are next to each other, this could still win us a little bit.

Think of what happens if four adjacent threads want to write their chars to global memory at the same time.

I seriously do not see why we need to change this to save what boils down to 240 kilobytes...
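
As a back-of-the-envelope check on the figure above (my arithmetic, not a number stated in the PR): shrinking each flag from a 4-byte `unsigned int` to a 1-byte `char` saves 3 bytes per flag. Under that reading, and taking 1 kB as 1000 bytes, 240 kB would correspond to roughly 80,000 flags. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption for this sketch: unsigned int is 4 bytes wide.
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

// Bytes saved by storing n_flags liveness flags as char instead of
// unsigned int: (4 - 1) bytes per flag under the assumption above.
constexpr std::size_t bytes_saved(std::size_t n_flags) {
    return n_flags * (sizeof(unsigned int) - sizeof(char));
}
```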

Member Author

Logically the code worked. But you can't argue that we're not spending more time memsetting these values, and using more global memory on them, than we need to. 🤔

+ vecmem::data::vector_view<const char> params_liveness_view;
};

/// Function applying the Pre material interaction to tracks spawned by bound
@@ -47,7 +47,7 @@ struct apply_interaction_payload {
/// @param[inout] payload The function call payload
template <typename detector_t>
TRACCC_DEVICE inline void apply_interaction(
- std::size_t globalIndex, const finding_config& cfg,
+ unsigned int globalIndex, const finding_config& cfg,
const apply_interaction_payload<detector_t>& payload);
} // namespace traccc::device
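
A sketch of the bounds check that the `n_params` and `globalIndex` type changes simplify, using hypothetical names. With the index and the element count both `unsigned int`, the comparison involves no signed/unsigned conversion. The second helper spells out the conversion that a signed count would otherwise undergo.

```cpp
#include <cassert>

// Hypothetical, reduced payload: only the element count is kept.
struct payload_sketch {
    const unsigned int n_params;
};

// Returns true when the thread index addresses a valid element.
// Both operands are unsigned int, so no conversion is involved.
inline bool in_range(unsigned int globalIndex, const payload_sketch& payload) {
    return globalIndex < payload.n_params;
}

// For contrast: a signed count is converted to unsigned before the
// comparison (made explicit here), so a negative count behaves like a
// very large one.
inline bool in_range_signed_count(unsigned int globalIndex, int n_params) {
    return globalIndex < static_cast<unsigned int>(n_params);
}
```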

4 changes: 2 additions & 2 deletions device/common/include/traccc/finding/device/build_tracks.hpp
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -65,7 +65,7 @@ struct build_tracks_payload {
/// @param[in] cfg Track finding config object
/// @param[inout] payload The function call payload
template <typename config_t>
- TRACCC_DEVICE inline void build_tracks(std::size_t globalIndex,
+ TRACCC_DEVICE inline void build_tracks(unsigned int globalIndex,
const config_t cfg,
const build_tracks_payload& payload);

10 changes: 5 additions & 5 deletions device/common/include/traccc/finding/device/find_tracks.hpp
Original file line number Diff line number Diff line change
@@ -43,7 +43,7 @@ struct find_tracks_payload {
* @brief View object to the vector of boolean-like integers describing the
* liveness of each parameter
*/
- vecmem::data::vector_view<const unsigned int> in_params_liveness_view;
+ vecmem::data::vector_view<const char> in_params_liveness_view;

/**
* @brief The total number of input parameters
@@ -84,7 +84,7 @@ struct find_tracks_payload {
/**
* @brief View object to the output track parameter liveness vector
*/
- vecmem::data::vector_view<unsigned int> out_params_liveness_view;
+ vecmem::data::vector_view<char> out_params_liveness_view;

/**
* @brief View object to the output candidate links
@@ -129,10 +129,10 @@ struct find_tracks_shared_payload {
/// @param[in] cfg Track finding config object
/// @param[inout] payload The global memory payload
/// @param[inout] shared_payload The shared memory payload
- template <concepts::thread_id1 thread_id_t, concepts::barrier barrier_t,
- typename detector_t, typename config_t>
+ template <typename detector_t, concepts::thread_id1 thread_id_t,
+ concepts::barrier barrier_t>
TRACCC_DEVICE inline void find_tracks(
- thread_id_t& thread_id, barrier_t& barrier, const config_t cfg,
+ thread_id_t& thread_id, barrier_t& barrier, const finding_config& cfg,
const find_tracks_payload<detector_t>& payload,
const find_tracks_shared_payload& shared_payload);
} // namespace traccc::device
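
The reordering above puts `detector_t` first. One common reason for such a layout, which is an assumption here rather than something stated in the PR, is that a leading template parameter can be named explicitly at the call site while the trailing ones are still deduced from the function arguments. A sketch with hypothetical stand-in types:

```cpp
#include <cassert>

// Hypothetical stand-ins for the thread-id and barrier helper types.
struct fake_thread_id {
    unsigned int local_id;
};
struct fake_barrier {};

// Hypothetical detector tags.
struct detector_a {
    static constexpr int id = 1;
};
struct detector_b {
    static constexpr int id = 2;
};

// detector_t comes first, so the caller can name it explicitly;
// thread_id_t and barrier_t are deduced from the function arguments.
template <typename detector_t, typename thread_id_t, typename barrier_t>
int which_detector(thread_id_t& /*thread_id*/, barrier_t& /*barrier*/) {
    return detector_t::id;
}
```

With `detector_t` in third position, as before the change, a caller wanting to name it explicitly would also have to spell out the two parameters preceding it.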
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -18,7 +18,7 @@ namespace traccc::device {

template <typename detector_t>
TRACCC_DEVICE inline void apply_interaction(
- std::size_t globalIndex, const finding_config& cfg,
+ unsigned int globalIndex, const finding_config& cfg,
const apply_interaction_payload<detector_t>& payload) {

// Type definitions
@@ -30,7 +30,7 @@ TRACCC_DEVICE inline void apply_interaction(

// in param
bound_track_parameters_collection_types::device params(payload.params_view);
- vecmem::device_vector<const unsigned int> params_liveness(
+ vecmem::device_vector<const char> params_liveness(
payload.params_liveness_view);

if (globalIndex >= payload.n_params) {
@@ -39,7 +39,7 @@ TRACCC_DEVICE inline void apply_interaction(

auto& bound_param = params.at(globalIndex);

- if (params_liveness.at(globalIndex) != 0u) {
+ if (params_liveness.at(globalIndex) != 0) {
// Get surface corresponding to bound params
const detray::tracking_surface sf{det, bound_param.surface_link()};
const typename detector_t::geometry_context ctx{};
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -17,7 +17,7 @@
namespace traccc::device {

template <typename config_t>
- TRACCC_DEVICE inline void build_tracks(std::size_t globalIndex,
+ TRACCC_DEVICE inline void build_tracks(unsigned int globalIndex,
const config_t cfg,
const build_tracks_payload& payload) {

@@ -73,12 +73,9 @@ TRACCC_DEVICE inline void build_tracks(std::size_t globalIndex,
// Resize the candidates with the exact size
cands_per_track.resize(n_cands);

- unsigned int i = 0;
-
// Reversely iterate to fill the track candidates
for (auto it = cands_per_track.rbegin(); it != cands_per_track.rend();
it++) {
- i++;

while (L.meas_idx > n_meas &&
L.previous.first !=
25 changes: 14 additions & 11 deletions device/common/include/traccc/finding/device/impl/find_tracks.ipp
Original file line number Diff line number Diff line change
@@ -25,10 +25,10 @@

namespace traccc::device {

- template <concepts::thread_id1 thread_id_t, concepts::barrier barrier_t,
- typename detector_t, typename config_t>
+ template <typename detector_t, concepts::thread_id1 thread_id_t,
+ concepts::barrier barrier_t>
TRACCC_DEVICE inline void find_tracks(
- thread_id_t& thread_id, barrier_t& barrier, const config_t cfg,
+ thread_id_t& thread_id, barrier_t& barrier, const finding_config& cfg,
const find_tracks_payload<detector_t>& payload,
const find_tracks_shared_payload& shared_payload) {

@@ -53,13 +53,13 @@ TRACCC_DEVICE inline void find_tracks(
payload.measurements_view);
bound_track_parameters_collection_types::const_device in_params(
payload.in_params_view);
- vecmem::device_vector<const unsigned int> in_params_liveness(
+ vecmem::device_vector<const char> in_params_liveness(
payload.in_params_liveness_view);
vecmem::device_vector<const candidate_link> prev_links(
payload.prev_links_view);
bound_track_parameters_collection_types::device out_params(
payload.out_params_view);
- vecmem::device_vector<unsigned int> out_params_liveness(
+ vecmem::device_vector<char> out_params_liveness(
payload.out_params_liveness_view);
vecmem::device_vector<candidate_link> links(payload.links_view);
vecmem::device_atomic_ref<unsigned int,
@@ -94,7 +94,7 @@ TRACCC_DEVICE inline void find_tracks(
unsigned int num_meas = 0;

if (in_param_id < payload.n_in_params &&
- in_params_liveness.at(in_param_id) > 0u) {
+ in_params_liveness.at(in_param_id) != 0) {
/*
* Get the barcode of this thread's parameters, then find the first
* measurement that matches it.
@@ -116,7 +116,10 @@ TRACCC_DEVICE inline void find_tracks(
* this thread.
*/
else {
- const auto bcd_id = std::distance(barcodes.begin(), lo);
+ const vecmem::device_vector<const unsigned int>::size_type bcd_id =
+ static_cast<
+ vecmem::device_vector<const unsigned int>::size_type>(
+ std::distance(barcodes.begin(), lo));

init_meas = lo == barcodes.begin() ? 0u : upper_bounds[bcd_id - 1];
num_meas = upper_bounds[bcd_id] - init_meas;
@@ -185,7 +188,7 @@ TRACCC_DEVICE inline void find_tracks(
const unsigned int owner_global_thread_id =
owner_local_thread_id +
thread_id.getBlockDimX() * thread_id.getBlockIdX();
- assert(in_params_liveness.at(owner_global_thread_id) != 0u);
+ assert(in_params_liveness.at(owner_global_thread_id) != 0);
const bound_track_parameters& in_par =
in_params.at(owner_global_thread_id);
const unsigned int meas_idx =
@@ -236,7 +239,7 @@ TRACCC_DEVICE inline void find_tracks(
.fetch_add(1u);

out_params.at(l_pos) = trk_state.filtered();
- out_params_liveness.at(l_pos) = 1u;
+ out_params_liveness.at(l_pos) = 1;
}
}
}
@@ -268,7 +271,7 @@ TRACCC_DEVICE inline void find_tracks(
* match any measurements.
*/
if (in_param_id < payload.n_in_params &&
- in_params_liveness.at(in_param_id) > 0u &&
+ in_params_liveness.at(in_param_id) != 0 &&
shared_payload.shared_num_candidates[thread_id.getLocalThreadIdX()] ==
0u) {
// Add measurement candidates to link
@@ -292,7 +295,7 @@ TRACCC_DEVICE inline void find_tracks(
}

out_params.at(l_pos) = in_params.at(in_param_id);
- out_params_liveness.at(l_pos) = 1u;
+ out_params_liveness.at(l_pos) = 1;
}
}
}
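
The `std::distance` change in this file replaces an `auto` index, which has the signed `difference_type`, with an explicitly unsigned `size_type`. Below is a host-side sketch of the same lookup pattern, using `std::vector` and hypothetical names instead of the vecmem device containers. The handling of a missing barcode is an assumption, since that branch is not visible in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Returns {first measurement index, number of measurements} for a barcode,
// or {0, 0} when the barcode is not present. `barcodes` must be sorted and
// `upper_bounds[i]` holds the end index of the measurements of barcodes[i].
inline std::pair<unsigned int, unsigned int> measurement_range(
    const std::vector<unsigned int>& barcodes,
    const std::vector<unsigned int>& upper_bounds, unsigned int barcode) {

    const auto lo = std::lower_bound(barcodes.begin(), barcodes.end(), barcode);
    if (lo == barcodes.end() || *lo != barcode) {
        return {0u, 0u};
    }

    // std::distance returns a signed difference_type; convert it once,
    // explicitly, to the unsigned size_type used for indexing.
    using size_type = std::vector<unsigned int>::size_type;
    const size_type bcd_id =
        static_cast<size_type>(std::distance(barcodes.begin(), lo));

    const unsigned int init_meas =
        (lo == barcodes.begin()) ? 0u : upper_bounds[bcd_id - 1];
    const unsigned int num_meas = upper_bounds[bcd_id] - init_meas;
    return {init_meas, num_meas};
}
```

The explicit cast keeps the later `bcd_id - 1` and `upper_bounds[bcd_id]` expressions in unsigned arithmetic, which avoids sign-conversion warnings on the index.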
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -15,7 +15,7 @@
namespace traccc::device {

TRACCC_DEVICE inline void make_barcode_sequence(
- std::size_t globalIndex, const make_barcode_sequence_payload& payload) {
+ unsigned int globalIndex, const make_barcode_sequence_payload& payload) {

measurement_collection_types::const_device uniques(payload.uniques_view);
vecmem::device_vector barcodes(payload.barcodes_view);
Original file line number Diff line number Diff line change
@@ -20,9 +20,9 @@

namespace traccc::device {

- template <typename propagator_t, typename bfield_t, typename config_t>
+ template <typename propagator_t, typename bfield_t>
TRACCC_DEVICE inline void propagate_to_next_surface(
- std::size_t globalIndex, const config_t cfg,
+ unsigned int globalIndex, const finding_config& cfg,
const propagate_to_next_surface_payload<propagator_t, bfield_t>& payload) {

if (globalIndex >= payload.n_in_params) {
@@ -49,8 +49,7 @@ TRACCC_DEVICE inline void propagate_to_next_surface(
n_tracks_per_seed.at(orig_param_id));

const unsigned int s_pos = num_tracks_per_seed.fetch_add(1);
- vecmem::device_vector<unsigned int> params_liveness(
- payload.params_liveness_view);
+ vecmem::device_vector<char> params_liveness(payload.params_liveness_view);

if (s_pos >= cfg.max_num_branches_per_seed) {
params_liveness[param_id] = 0u;
@@ -62,7 +61,7 @@ TRACCC_DEVICE inline void propagate_to_next_surface(
payload.tips_view);

if (links.at(param_id).n_skipped > cfg.max_num_skipping_per_cand) {
- params_liveness[param_id] = 0u;
+ params_liveness[param_id] = 0;
tips.push_back({payload.step, param_id});
return;
}
@@ -73,7 +72,7 @@ TRACCC_DEVICE inline void propagate_to_next_surface(
// Parameters
bound_track_parameters_collection_types::device params(payload.params_view);

- if (params_liveness.at(param_id) == 0u) {
+ if (params_liveness.at(param_id) == 0) {
return;
}

@@ -121,12 +120,12 @@ TRACCC_DEVICE inline void propagate_to_next_surface(

if (payload.step == cfg.max_track_candidates_per_track - 1) {
tips.push_back({payload.step, param_id});
- params_liveness[param_id] = 0u;
+ params_liveness[param_id] = 0;
} else {
- params_liveness[param_id] = 1u;
+ params_liveness[param_id] = 1;
}
} else {
- params_liveness[param_id] = 0u;
+ params_liveness[param_id] = 0;

if (payload.step >= cfg.min_track_candidates_per_track - 1) {
tips.push_back({payload.step, param_id});
Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@

namespace traccc::device {

- TRACCC_DEVICE inline void prune_tracks(std::size_t globalIndex,
+ TRACCC_DEVICE inline void prune_tracks(unsigned int globalIndex,
const prune_tracks_payload& payload) {

track_candidate_container_types::const_device track_candidates(
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -30,7 +30,7 @@ struct make_barcode_sequence_payload {
/// @param[in] globalIndex The index of the current thread
/// @param[inout] payload The function call payload
TRACCC_DEVICE inline void make_barcode_sequence(
- std::size_t globalIndex, const make_barcode_sequence_payload& payload);
+ unsigned int globalIndex, const make_barcode_sequence_payload& payload);
} // namespace traccc::device

#include "./impl/make_barcode_sequence.ipp"
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/** TRACCC library, part of the ACTS project (R&D line)
*
- * (c) 2023 CERN for the benefit of the ACTS project
+ * (c) 2023-2024 CERN for the benefit of the ACTS project
*
* Mozilla Public License Version 2.0
*/
@@ -13,6 +13,7 @@
#include "traccc/edm/measurement.hpp"
#include "traccc/edm/track_parameters.hpp"
#include "traccc/finding/candidate_link.hpp"
+ #include "traccc/finding/finding_config.hpp"
#include "traccc/utils/particle.hpp"

namespace traccc::device {
@@ -36,7 +37,7 @@ struct propagate_to_next_surface_payload {
/**
* @brief View object to the vector of track parameter liveness values
*/
- vecmem::data::vector_view<unsigned int> params_liveness_view;
+ vecmem::data::vector_view<char> params_liveness_view;

/**
* @brief View object to the access order of parameters so they are sorted
@@ -81,9 +82,9 @@ struct propagate_to_next_surface_payload {
/// @param[in] globalIndex The index of the current thread
/// @param[in] cfg Track finding config object
/// @param[inout] payload The function call payload
- template <typename propagator_t, typename bfield_t, typename config_t>
+ template <typename propagator_t, typename bfield_t>
TRACCC_DEVICE inline void propagate_to_next_surface(
- std::size_t globalIndex, const config_t cfg,
+ unsigned int globalIndex, const finding_config& cfg,
const propagate_to_next_surface_payload<propagator_t, bfield_t>& payload);
} // namespace traccc::device

Original file line number Diff line number Diff line change
@@ -35,7 +35,7 @@ struct prune_tracks_payload {
///
/// @param[in] globalIndex The index of the current thread
/// @param[inout] payload The function call payload
- TRACCC_DEVICE inline void prune_tracks(std::size_t globalIndex,
+ TRACCC_DEVICE inline void prune_tracks(unsigned int globalIndex,
const prune_tracks_payload& payload);
} // namespace traccc::device
