-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add from_arrow_device
function to cudf interop using nanoarrow
#15458
Add from_arrow_device
function to cudf interop using nanoarrow
#15458
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nanoarrow notes. Looking great!
NANOARROW_THROW_NOT_OK( | ||
ArrowSchemaViewInit(&child_schema_view, schema->schema->children[0], nullptr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably do want the error message here, with bonus points if it has the column name/field ref path. Most users won't see the errors (unless a producer produces an invalid schema), but they do show up in CI and save much debugging effort. I will make a PR into nanoarrow to make this less annoying to do.
cpp/src/interop/from_arrow_device.cu
Outdated
auto const has_nulls = skip_mask ? false : input->null_count > 0; | ||
bitmask_type const* null_mask = | ||
has_nulls ? reinterpret_cast<bitmask_type const*>(input->buffers[0]) : nullptr; | ||
auto data_buffer = input->buffers[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you don't need buffer sizes here...if there does come a point where you do need them (or want to do some basic validation of your input), ArrowArrayViewSetArrayMinimal()
can get (some of) them for you (without any device-host copying).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, I haven't quite figured out when / why I should use ArrowArrayView
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for when you need the buffer sizes (e.g., validation, copying) and is the basis for higher-level getters like ArrowArrayViewGetInt()
. I'm not sure you need any of that here 🙂
CUDF_FUNC_RANGE(); | ||
|
||
ArrowSchemaView view; | ||
NANOARROW_THROW_NOT_OK(ArrowSchemaViewInit(&view, schema, nullptr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you might be calling this more than once. I don't think this is particularly expensive, but if there is an easy way to avoid it it might be nice. (This is also a nanoarrow problem: we don't provide an easy internal way to avoid this. It might be that the ArrowArrayView
should have an ArrowSchemaView
as one of its members since this one comes up a lot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, currently as far as I can tell, ArrowSchemaView
doesn't initialize the views for its children so I need to call it for each child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One for each child is definitely OK...I think sometimes it happens here twice for the same ArrowSchema
too (but perhaps I'm reading the code wrong).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it happens twice for the same schema as I've made it a point to pass around ArrowSchemaView
and not ArrowSchema
to the functions, so if you spot an area that is calling it more than once on the same schema, let me know.
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few questions/comments about the related to_arrow
function, but the PR got merged before I could submit them. So they don't get immediately lost in time, here were my comments about it:
- Do you plan to support converting from CUDF view types, e.g.
cudf::column_view
andcudf::table_view
to non-owning Arrow C Data Arrays? - Does the resulting
ArrowArray
take ownership of the passedcudf::column
? The only cognitive cue is the presence of an r-value reference. It might be helpful to clarify more explicitly in the docs that->release()
causes the contents ofcudf::column
to be destroyed. - Do you have plans to support specifying the target
device_type
anddevice_id
as well as allowing for deep copies when needed (e.g. host<->device changes)? This would be useful to allow an array to be passed to another non-CUDA framework. - How about 1-row Arrow C device arrays to represent scalars?
- Am I not mistaken that the Arrow C decimal formats for
decimal32
anddecimal64
ared:pp,ss,032
andd:pp,ss,064
respectively? It would be helpful to support zero-copy by default here. - non-zero exit statuses returned by
cudaEventCreate
,cudaEventRecord
, andcudaEventDestroy
are ignored. - I did not see tests for random nulls or the full space of numeric types (e.g.
(u|)int(8|16|32|64)
,float32
,float64
). - The
to_arrow
tests could try more levels of nesting of structs and lists.
input->device_type == ARROW_DEVICE_CUDA_HOST || | ||
input->device_type == ARROW_DEVICE_CUDA_MANAGED, | ||
"ArrowDeviceArray memory must be accessible to CUDA"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems very likely that the device_type
will be ARROW_DEVICE_HOST
. The ability to copy to ARROW_DEVICE_CUDA
from ARROW_DEVICE_HOST
would be very useful for passing between CPU, Cuda frameworks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently if the data is on the CPU, you would use from_arrow
instead of from_arrow_device
to do the conversion and copying between CPU and the GPU. A future PR will be to update the existing to_arrow
/from_arrow
methods to leverage nanoarrow and the C Data interface to perform CPU/GPU copies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the suggestion might be to use something like cudaHostGetDevicePointer()
to avoid the CPU altogether for a chunk of CPU memory that is already page-locked/registered with the driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPU memory that is already page-locked / registered with the driver would be ARROW_DEVICE_CUDA_MANAGED
or ARROW_DEVICE_CUDA_HOST
which should work fine in these functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion was allowing the user to specify the target device type and id to allow GPU->CPU
and CPU->GPU
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intend on enabling the GPU->CPU
and CPU->GPU
cases, just not in this PR. That will be a follow-up to this one which would be to replace the existing from_arrow
and to_arrow
APIs that use the C++ libarrow with CPU based memory.
|
||
TEST_F(FromArrowDeviceTest, DateTimeTable) | ||
{ | ||
auto data = std::vector<int64_t>{1, 2, 3, 4, 5, 6}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests should produce data larger than 8 elements with non-zero slice starts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this with the SliceTests that you can find at line 457 in here. It uses columns with 10k elements and various slice starts/ends including non-zero slice starts.
auto got_cudf_table = cudf::from_arrow_device(schema.get(), &input); | ||
CUDF_TEST_EXPECT_TABLES_EQUAL(expected_cudf_table, *got_cudf_table); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no tests for bool
and (u|)int(8|16|32|64)
, nor random null masks, nor slicing. You could consider writing a helper function that generates random data and another helper function that performs comparisons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the FromArrowDeviceTestSlice
provides tests calling get_nanoarrow_tables
which produces bool columns with random null masks and slicing.
} | ||
|
||
} // namespace | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the equivalent for columns, i.e. returning a unique_column_view_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you can't overload/specialize based solely on the return type, I chose to go with the table_view, if you need a single column then it is returned as a table_view with a single column, by which you can pull the column_view
from it. I can't think of a use case offhand where it would be useful / better to make this templated or otherwise and create a unique_column_view_t
.
Is there a case you can think of that isn't easily serviced by just doing unique_table_view->column(0)
?
CUDF_TEST_EXPECT_TABLES_EQUAL(expected_table_view, *got_cudf_table_view); | ||
} | ||
|
||
TEST_F(FromArrowDeviceTest, NestedList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include more levels of nesting, more slicing, and toggle random nulls on/off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also handled with the Slicing tests via the get_nanoarrow_tables
which has multiple levels of nesting and random nulls with the slicing
Started here: #15465
The ownership transfer is mentioned here: cudf/cpp/include/cudf/interop.hpp Line 249 in 9ae32fe
Do you think we need more than that? |
There are plans to replace the existing
I can add this, sure.
The Arrow format currently does not support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my first pass through. I got through everything but the new tests. The new functionality looks very good, fewer traps than in the other direction I think.
cpp/src/interop/from_arrow_device.cu
Outdated
owned_columns_t owned_mem; | ||
|
||
auto type = arrow_to_cudf_type(schema); | ||
if (type != data_type(type_id::STRUCT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the reason that you have this block is so that a caller could provide either a top-level struct column representing multiple columns (i.e. a table) with each child representing a column in the table, or they could provide a single non-struct column in which case we interpret that as a single column to stick into a table and return. Is that what you're trying to accomplish here? Please let me know if I'm misunderstanding.
Assuming that I've understood correctly, I don't think that we should support this use case. This breaks some key invariants, such as the idempotency of to_arrow_device(from_arrow_device(...))
, and it removes the otherwise fairly strict isomorphism between an ArrowDeviceArray and a cudf table_view (up to violations that come from to_arrow_device
transforming 32 and 64 bit decimals to 128 bit). Moreover, it makes it practically confusing that a single struct column cannot be interconverted whereas a single column of any other type can.
I recognize the dilemma here though in that for to_arrow_device
we could have column and table overloads whereas here we cannot return based solely on the return type. Although none of the alternatives are ideal here, I would generally prefer one of the obvious ones: 1) add a parameter, 2) add a template parameter, or 3) add a separate function with a different name (e.g. from_arrow_device_column
and from_arrow_device_table
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of the options, I guess the best option I can think of would be to add a from_arrow_device_column
function and have the from_arrow_device
function that returns a table_view throw an error if the input is not a struct column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be fine. IMO that is preferable to the current behavior. @davidwendt WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with adding the from_arrow_device_column
and throwing an exception if passed to the wrong API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool let's do that then.
@@ -183,6 +204,24 @@ void get_nanoarrow_dict_array(ArrowArray* arr, | |||
get_nanoarrow_array<IND_TYPE>(arr, ind, validity); | |||
} | |||
|
|||
template <typename KEY_TYPE, typename IND_TYPE> | |||
void populate_dict_from_col(ArrowArray* arr, cudf::dictionary_column_view dview) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be consistent about whether these helpers are defined in this header or in to_arrow_device_test.cpp
. populate_list_from_col
is declared here but defined there. Do you think they're reusable (also possibly a question for @davidwendt on his PR)? If they are, maybe we need a nanoarrow_utils.cpp
file with definitions, which I also think David has created in his PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how much effort we should put into supporting dictionary.
I would be ok with throwing an exception for now and then supporting it later when someone asks for it.
This goes for to_arrow_device
as well.
I always prefer when an essential detail is included in the first paragraph. |
@vyasr @davidwendt any updates on my replies / questions? Anything left to address on my end? |
I should free up to take another look at this PR by tomorrow. |
/ok to test |
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this when @davidwendt is. I think the main open question is how to handle the single-column use-case. I'll defer to him on what API he'd like to see there between the three options I laid out in #15458 (comment). I'm fine moving forward with a separate function if that's the decision, but a (template) parameter is also OK with me.
/ok to test |
Merged with base branch to fix build error that started this morning and is fixed now. |
/ok to test |
@vyasr @davidwendt I resolved the conflicts, so we should hopefully be good here to merge unless there's other changes you'd like me to make |
/ok to test |
This looks good to me.
|
% compute-sanitizer --tool memcheck cpp/build/gtests/INTEROP_TEST --gtest_filter="FromArrowDevice*" --rmm_mode=cuda
========= COMPUTE-SANITIZER
Note: Google Test filter = FromArrowDevice*
[==========] Running 21 tests from 7 test suites.
[----------] Global test environment set-up.
[----------] 10 tests from FromArrowDeviceTest
[ RUN ] FromArrowDeviceTest.FailConditions
[ OK ] FromArrowDeviceTest.FailConditions (0 ms)
[ RUN ] FromArrowDeviceTest.EmptyTable
[ OK ] FromArrowDeviceTest.EmptyTable (95 ms)
[ RUN ] FromArrowDeviceTest.DateTimeTable
[ OK ] FromArrowDeviceTest.DateTimeTable (28 ms)
[ RUN ] FromArrowDeviceTest.NestedList
[ OK ] FromArrowDeviceTest.NestedList (38 ms)
[ RUN ] FromArrowDeviceTest.StructColumn
[ OK ] FromArrowDeviceTest.StructColumn (75 ms)
[ RUN ] FromArrowDeviceTest.DictionaryIndicesType
[ OK ] FromArrowDeviceTest.DictionaryIndicesType (127 ms)
[ RUN ] FromArrowDeviceTest.FixedPoint128Table
[ OK ] FromArrowDeviceTest.FixedPoint128Table (22 ms)
[ RUN ] FromArrowDeviceTest.FixedPoint128TableLarge
[ OK ] FromArrowDeviceTest.FixedPoint128TableLarge (23 ms)
[ RUN ] FromArrowDeviceTest.FixedPoint128TableNulls
[ OK ] FromArrowDeviceTest.FixedPoint128TableNulls (23 ms)
[ RUN ] FromArrowDeviceTest.FixedPoint128TableNullsLarge
[ OK ] FromArrowDeviceTest.FixedPoint128TableNullsLarge (24 ms)
[----------] 10 tests from FromArrowDeviceTest (458 ms total)
[----------] 1 test from FromArrowDeviceTestDurationsTest/0, where TypeParam = cuda::std::__4::chrono::duration<int,cuda::std::__4::ratio<86400l,1l> >
[ RUN ] FromArrowDeviceTestDurationsTest/0.DurationTable
[ OK ] FromArrowDeviceTestDurationsTest/0.DurationTable (0 ms)
[----------] 1 test from FromArrowDeviceTestDurationsTest/0 (0 ms total)
[----------] 1 test from FromArrowDeviceTestDurationsTest/1, where TypeParam = cuda::std::__4::chrono::duration<long,cuda::std::__4::ratio<1l,1l> >
[ RUN ] FromArrowDeviceTestDurationsTest/1.DurationTable
[ OK ] FromArrowDeviceTestDurationsTest/1.DurationTable (3 ms)
[----------] 1 test from FromArrowDeviceTestDurationsTest/1 (3 ms total)
[----------] 1 test from FromArrowDeviceTestDurationsTest/2, where TypeParam = cuda::std::__4::chrono::duration<long,cuda::std::__4::ratio<1l,1000l> >
[ RUN ] FromArrowDeviceTestDurationsTest/2.DurationTable
[ OK ] FromArrowDeviceTestDurationsTest/2.DurationTable (3 ms)
[----------] 1 test from FromArrowDeviceTestDurationsTest/2 (3 ms total)
[----------] 1 test from FromArrowDeviceTestDurationsTest/3, where TypeParam = cuda::std::__4::chrono::duration<long,cuda::std::__4::ratio<1l,1000000l> >
[ RUN ] FromArrowDeviceTestDurationsTest/3.DurationTable
[ OK ] FromArrowDeviceTestDurationsTest/3.DurationTable (3 ms)
[----------] 1 test from FromArrowDeviceTestDurationsTest/3 (3 ms total)
[----------] 1 test from FromArrowDeviceTestDurationsTest/4, where TypeParam = cuda::std::__4::chrono::duration<long,cuda::std::__4::ratio<1l,1000000000l> >
[ RUN ] FromArrowDeviceTestDurationsTest/4.DurationTable
[ OK ] FromArrowDeviceTestDurationsTest/4.DurationTable (3 ms)
[----------] 1 test from FromArrowDeviceTestDurationsTest/4 (3 ms total)
[----------] 6 tests from FromArrowDeviceTest/FromArrowDeviceTestSlice
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/0
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/0 (211 ms)
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/1
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/1 (222 ms)
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/2
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/2 (238 ms)
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/3
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/3 (118 ms)
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/4
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/4 (204 ms)
[ RUN ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/5
[ OK ] FromArrowDeviceTest/FromArrowDeviceTestSlice.SliceTest/5 (116 ms)
[----------] 6 tests from FromArrowDeviceTest/FromArrowDeviceTestSlice (1111 ms total)
[----------] Global test environment tear-down
[==========] 21 tests from 7 test suites ran. (1583 ms total)
[ PASSED ] 21 tests.
========= ERROR SUMMARY: 0 errors Looks good! |
/merge |
Thanks @zeroshade! You and @davidwendt and I ought to coordinate a bit on next steps here. Since #15465 is also merged, the obvious next steps I see would be 1) a corresponding |
Agreed! I was planning on the host APIs being my next steps personally. The @davidwendt, since you did the work for #15465, would you want to do the corresponding |
Fixes debug build errors introduced by #15458 These warnings show up in a debug build where warnings become errors. Some of the errors: ``` /cudf/cpp/tests/interop/from_arrow_device_test.cpp:103:27: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetTypeStruct(ArrowSchema*, int64_t)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 103 | ArrowSchemaSetTypeStruct(input_schema.get(), 1); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:105:29: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetTypeDateTime(ArrowSchema*, ArrowType, ArrowTimeUnit, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 105 | ArrowSchemaSetTypeDateTime( /cudf/cpp/tests/interop/from_arrow_device_test.cpp:107:21: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetName(ArrowSchema*, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 107 | ArrowSchemaSetName(input_schema->children[0], "a"); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:110:27: error: ignoring return value of 'ArrowErrorCode cudfArrowArrayInitFromSchema(ArrowArray*, const ArrowSchema*, ArrowError*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 110 | ArrowArrayInitFromSchema(input_array.get(), input_schema.get(), nullptr); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:115:26: error: ignoring return value of 'ArrowErrorCode ArrowBufferSetAllocator(ArrowBuffer*, ArrowBufferAllocator)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 115 | ArrowBufferSetAllocator(ArrowArrayBuffer(input_array->children[0], 1), noop_alloc); | ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /cudf/cpp/tests/interop/from_arrow_device_test.cpp:118:27: error: ignoring return value of 'ArrowErrorCode cudfArrowArrayFinishBuilding(ArrowArray*, ArrowValidationLevel, ArrowError*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 118 | ArrowArrayFinishBuilding(input_array.get(), NANOARROW_VALIDATION_LEVEL_MINIMAL, nullptr); /cudf/cpp/tests/interop/from_arrow_device_test.cpp: In member function 'virtual void FromArrowDeviceTest_NestedList_Test::TestBody()': /cudf/cpp/tests/interop/from_arrow_device_test.cpp:202:27: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetTypeStruct(ArrowSchema*, int64_t)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 202 | ArrowSchemaSetTypeStruct(input_schema.get(), 1); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:204:26: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaInitFromType(ArrowSchema*, ArrowType)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 204 | ArrowSchemaInitFromType(input_schema->children[0], NANOARROW_TYPE_LIST); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:205:21: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetName(ArrowSchema*, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 205 | ArrowSchemaSetName(input_schema->children[0], "a"); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:208:26: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaInitFromType(ArrowSchema*, ArrowType)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 208 | ArrowSchemaInitFromType(input_schema->children[0]->children[0], NANOARROW_TYPE_LIST); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:209:21: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetName(ArrowSchema*, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 209 | ArrowSchemaSetName(input_schema->children[0]->children[0], "element"); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:212:26: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaInitFromType(ArrowSchema*, ArrowType)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 212 | ArrowSchemaInitFromType(input_schema->children[0]->children[0]->children[0], /cudf/cpp/tests/interop/from_arrow_device_test.cpp:214:21: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetName(ArrowSchema*, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 214 | ArrowSchemaSetName(input_schema->children[0]->children[0]->children[0], "element"); /cudf/cpp/tests/interop/from_arrow_device_test.cpp:226:27: error: ignoring return value of 'ArrowErrorCode cudfArrowArrayFinishBuilding(ArrowArray*, ArrowValidationLevel, ArrowError*)' declared with attribute 'warn_unused_result' [-Werror=unused-result] 226 | ArrowArrayFinishBuilding(input_array.get(), NANOARROW_VALIDATION_LEVEL_NONE, nullptr); /cudf/cpp/tests/interop/from_arrow_device_test.cpp: In member function 'virtual void FromArrowDeviceTest_StructColumn_Test::TestBody()': ``` Closes #15597 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Vukasin Milovanovic (https://github.com/vuule) URL: #15596
Yes, I should be able to start on this next week. |
Let me know if/when either of you need something from me here. |
Following up from #15458 and continuing the work to address #14926 adding host memory version of `from_arrow_device` which will perform the copies from host memory to create cudf objects. Authors: - Matt Topol (https://github.com/zeroshade) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Vyas Ramasubramani (https://github.com/vyasr) - David Wendt (https://github.com/davidwendt) URL: #15645
Description
Adding a corresponding
from_arrow_device
function following up from #15047. This continues the work towards addressing #14926.Checklist
I still need to write the tests but wanted to get initial eyes looking at the design and structure.
CC @vyasr @kkraus14 @davidwendt