-
Notifications
You must be signed in to change notification settings - Fork 911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor libcudf indexalator to typed normalator #14043
Refactor libcudf indexalator to typed normalator #14043
Conversation
@@ -32,193 +31,6 @@ | |||
namespace cudf { | |||
namespace detail { | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code moved/refactored to normalizing_iterator.cuh
@@ -0,0 +1,362 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code moved/refactored from indexalator.cuh
struct input_normalator : base_normalator<input_normalator<Integer>, Integer> { | ||
friend struct base_normalator<input_normalator<Integer>, Integer>; // for CRTP | ||
|
||
using reference = Integer const; // this keeps STL and thrust happy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the reference
member type be defined on the base_normalator
instead?
I'm not sure if I understand this comment. What keeps STL/Thrust happy, and what is the alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just that it has to be here (as like a tag) even though we don't actually use it anywhere in the class.
It is not needed (and may not work) for the output_normalizer
so I think it is more correct to put it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for helping me think through my previous review comments. This looks good to me.
* @brief Indirection operator returns this iterator instance in order | ||
* to capture the `operator=(size_type)` calls. | ||
*/ | ||
__device__ inline output_normalator const& operator*() const { return *this; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fairly certain I haven't understood this part. By returning *this
, then for any output_normalator
p
,
p
== *p
== **p
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have thought we'd have to support semantics like:
output_normalator p = ...;
*p = int16_t{5};
*(p+1) = int32_t{10};
p[2] = int64_t{15};
I haven't understood how current implementation for operator*()
and operator[]()
work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot return a reference to the underlying array element which is what you would normally see.
From what I remember, this is a way to capture that request *itr = value
and redirect it to the operator=(Integer)
function below. It's a subtle thing for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output iterator expects a specific type but can store it as the underlying type.
The operator[]()
works normally in that it handles p[2] = value
.
I've tried to explain the operator*()
and operator=
relationship in a few separate comments now.
These two handle the *p = value
and *(p+1) = value
from your example.
Though only one type Integer
would usually be accepted on the right-hand side (discounting automatic casting).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, maybe I see it now?
*p = 5;
is evaluated as:
output_normalator::operator *()
, which returnsp
.p = 5
callsoutput_normalator::operator =(Integer)
.
Do I have this right?
Were I to nitpick, I would have preferred it if *p
returned an integer_reference
object, and integer_reference::operator=()
did the assignment. As it stands, doesn't the user run the risk of assigning integers to the normalator directly, instead of to *normalator
?
That said, yours has the benefit of brevity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That reference idea sounds brilliant to me! I think that’s much clearer in intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is more to it than what you have described.
*p = 5; is evaluated as:
output_normalator::operator*()
, which returns reference top
.&p = 5
callsoutput_normalator::operator =(Integer)
.
I'm not seeing what the integer_reference
object would be defined to be.
The operator must resolve to a member function on the output_normalizer
object so we can intercept the assignment and place it correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this discussion is getting beyond the intention of the PR.
This PR did not introduce any of iterator mechanisms discussed here but is only moving (and renaming) the code around so it can be reused for future iterators.
I'd be happy to open an issue or separate PR to explore alternate implementations for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's do this in a later PR. I'd propose the following. If you agree, opening an issue for this would be good.
output_normalator::reference
is a member type defined as a classinteger_reference
integer_reference
requirements:- holds some state to know what it's referencing (perhaps the output normalator "begin" iterator and the index which it references?)
- defines
integer_reference::operator=(Integer)
which normalizes and writes to the referenced location - possibly defines other operators that modify the value in-place (as-if by reference)
output_normalator::operator*
is changed to return aninteger_reference
-- this aligns with various STL container iterator designs, where the dereference operator returns a value of the member typereference
output_normalator::operator=(Integer)
is no longer defined -- because assignment of a value to a (non-dereferenced) iterator is an unclear design that we can avoid by usinginteger_reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do this in a later PR
Yes, sir. That works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Barring a minor suggestion regarding how to deal with output_normalator
assignments, this looks good to ship. Thanks, David!
Sorry if I muddied the waters here with my last comment. Let's not hold this PR up for the |
/merge |
Description
Creates generic normalizing-iterator for integer types for use by the
indexalator
and the future offsets normalizing iterator.Mostly code has been moved around or renamed so the normalizing-iterator part can take type template parameter to identify which integer type to normalize to. For the
indexalator
, this type iscudf::size_type
and for the offsets iterator this type would beint64
.Checklist