-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support interfaces for Union logical type in DuckDB::LogicalType #893
Conversation
WalkthroughThis pull request introduces the Changes
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
assert_equal(2, union_column.logical_type.member_count) | ||
end | ||
|
||
def test_union_member_name_at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if it might be helpful to provide an iterator-like method for union members. This could make it more convenient for users to iterate through member names. If you think this would be a good addition, I'm planning to address it in a follow-up PR.
Here's an example of how it could look:
diff --git a/lib/duckdb/logical_type.rb b/lib/duckdb/logical_type.rb
index fb38b67..bc77b36 100644
--- a/lib/duckdb/logical_type.rb
+++ b/lib/duckdb/logical_type.rb
@@ -18,5 +18,31 @@ def type
type_id = _type
DuckDB::Converter::IntToSym.type_to_sym(type_id)
end
+
+ # Iterates over each union member name associated with a Union column.
+ #
+ # When a block is provided, this method yields each union member name in order.
+ # It also returns the total number of members yielded.
+ #
+ # union_logical_type.each_member_name do |name|
+ # puts "Union member: #{name}"
+ # end
+ #
+ # If no block is given, an Enumerator is returned, which can be used to
+ # retrieve all member names.
+ #
+ # names = union_logical_type.each_member_name.to_a
+ # # => ["member1", "member2"]
+ #
+ # @return [Enumerator, Integer]
+ # Returns an Enumerator if no block is provided.
+ # Returns an Integer (the member count) when a block is provided.
+ def each_member_name
+ return to_enum(__method__) {member_count} unless block_given?
+
+ member_count.times do |member_idx|
+ yield member_name_at(member_idx)
+ end
+ end
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you create another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
test/duckdb_test/logical_type_test.rb (1)
72-72
: Consider adding test cases for both union member types.The current test only inserts an INTEGER value. Consider adding test cases that insert VARCHAR values as well to ensure both union member types work correctly.
- 1::INTEGER, + 1::INTEGER, -- Test case for 'num' member + 'test'::VARCHAR, -- Test case for 'str' memberext/duckdb/logical_type.c (3)
164-175
: Fix documentation typo in call-seq example.The call-seq example incorrectly shows
value_type
instead ofmember_count
.- * member_col.logical_type.value_type -> Integer + * member_col.logical_type.member_count -> Integer
184-199
: Improve error message for member name retrieval failure.The current error message could be more descriptive to help with debugging.
- rb_raise(eDuckDBError, "fail to get name of %llu member", (unsigned long long)idx); + rb_raise(eDuckDBError, "Failed to retrieve name for union member at index %llu", (unsigned long long)idx);
209-224
: Maintain consistent error message style.The error message style should be consistent with the suggested improvement for member_name_at.
- "Failed to get the union member type at index %llu", + "Failed to retrieve type for union member at index %llu",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.md
(1 hunks)ext/duckdb/logical_type.c
(3 hunks)test/duckdb_test/logical_type_test.rb
(4 hunks)
🔇 Additional comments (6)
test/duckdb_test/logical_type_test.rb (3)
39-39
: LGTM!The union column is correctly defined with appropriate member types and follows consistent naming conventions.
105-105
: LGTM!The union type is correctly added to the expected types array, maintaining consistency with the table schema.
178-200
: Consider adding an iterator-like method for union members.The tests effectively verify the basic functionality of union type methods. However, as suggested in past review comments, adding an iterator-like method would improve the API's usability.
Example implementation:
def test_union_member_iterator union_column = @columns.find { |column| column.type == :union } union_logical_type = union_column.logical_type # Test the iterator without a block enum = union_logical_type.each_member_name assert_instance_of(Enumerator, enum) assert_equal(["num", "str"], enum.to_a) # Test the iterator with a block names = [] count = union_logical_type.each_member_name { |name| names << name } assert_equal(2, count) assert_equal(["num", "str"], names) endext/duckdb/logical_type.c (2)
15-17
: LGTM!The function declarations follow the established pattern and maintain consistent naming conventions.
252-254
: LGTM!The method registration follows the established pattern and correctly exposes the new functionality.
CHANGELOG.md (1)
6-8
: LGTM!The changelog entries correctly document the new functionality and follow the established format. The "under construction" note appropriately indicates that more methods may be added in the future.
7450bf8
to
f7e6043
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
test/duckdb_test/logical_type_test.rb (1)
71-72
: Consider adding test cases for both union member types.Currently, the test only inserts an INTEGER value. Consider adding test cases that cover both INTEGER and VARCHAR members to ensure comprehensive testing of the union type functionality.
- 1::INTEGER, + 1::INTEGER, -- Test case for INTEGER member + 'test'::VARCHAR, -- Test case for VARCHAR memberext/duckdb/logical_type.c (1)
164-170
: Improve documentation for member_count method.The documentation should clarify that this method is specifically for union types and mention its behavior for non-union types.
/* * call-seq: * member_col.logical_type.member_count -> Integer * - * Returns the member count of union type, otherwise 0. + * Returns the number of members in a union type. For non-union types, + * returns 0. This method is specifically designed for working with + * union types and helps determine how many members are available + * for access using member_name_at and member_type_at methods. * */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.md
(1 hunks)ext/duckdb/logical_type.c
(3 hunks)test/duckdb_test/logical_type_test.rb
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms (19)
- GitHub Check: test (head, 1.1.1)
- GitHub Check: test (head, 1.1.3)
- GitHub Check: test (head, 1.2.0)
- GitHub Check: test (mswin, 1.1.1)
- GitHub Check: test (mingw, 1.1.1)
- GitHub Check: test (mingw, 1.1.3)
- GitHub Check: test (mingw, 1.2.0)
- GitHub Check: test (3.4.1, 1.1.1)
- GitHub Check: test (3.4.1, 1.1.3)
- GitHub Check: test (3.4.1, 1.2.0)
- GitHub Check: test (3.3.6, 1.1.1)
- GitHub Check: test (3.3.6, 1.1.3)
- GitHub Check: test (3.3.6, 1.2.0)
- GitHub Check: test (3.2.6, 1.1.1)
- GitHub Check: test (3.2.6, 1.1.3)
- GitHub Check: test (3.2.6, 1.2.0)
- GitHub Check: test (3.1.6, 1.1.1)
- GitHub Check: test (3.1.6, 1.1.3)
- GitHub Check: test (3.1.6, 1.2.0)
🔇 Additional comments (1)
test/duckdb_test/logical_type_test.rb (1)
178-200
: Consider adding an iterator for union members.As suggested in the previous review, an iterator-like method would make it more convenient for users to iterate through member names. The current implementation requires manual iteration using indices.
The implementation could be added to
lib/duckdb/logical_type.rb
as previously proposed:def each_member_name return to_enum(__method__) {member_count} unless block_given? member_count.times do |member_idx| yield member_name_at(member_idx) end end
static VALUE duckdb_logical_type_member_name_at(VALUE self, VALUE midx) { | ||
rubyDuckDBLogicalType *ctx; | ||
VALUE mname; | ||
const char *member_name; | ||
idx_t idx = NUM2ULL(midx); | ||
|
||
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | ||
|
||
member_name = duckdb_union_type_member_name(ctx->logical_type, idx); | ||
if (member_name == NULL) { | ||
rb_raise(eDuckDBError, "fail to get name of %llu member", (unsigned long long)idx); | ||
} | ||
mname = rb_str_new_cstr(member_name); | ||
duckdb_free((void *)member_name); | ||
return mname; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add index validation to prevent out-of-bounds access.
The function should validate that the index is within bounds before accessing the member name.
static VALUE duckdb_logical_type_member_name_at(VALUE self, VALUE midx) {
rubyDuckDBLogicalType *ctx;
VALUE mname;
const char *member_name;
idx_t idx = NUM2ULL(midx);
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx);
+ idx_t count = duckdb_union_type_member_count(ctx->logical_type);
+ if (idx >= count) {
+ rb_raise(eDuckDBError, "index %llu out of bounds (member count: %llu)",
+ (unsigned long long)idx, (unsigned long long)count);
+ }
+
member_name = duckdb_union_type_member_name(ctx->logical_type, idx);
if (member_name == NULL) {
rb_raise(eDuckDBError, "fail to get name of %llu member", (unsigned long long)idx);
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
static VALUE duckdb_logical_type_member_name_at(VALUE self, VALUE midx) { | |
rubyDuckDBLogicalType *ctx; | |
VALUE mname; | |
const char *member_name; | |
idx_t idx = NUM2ULL(midx); | |
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | |
member_name = duckdb_union_type_member_name(ctx->logical_type, idx); | |
if (member_name == NULL) { | |
rb_raise(eDuckDBError, "fail to get name of %llu member", (unsigned long long)idx); | |
} | |
mname = rb_str_new_cstr(member_name); | |
duckdb_free((void *)member_name); | |
return mname; | |
} | |
static VALUE duckdb_logical_type_member_name_at(VALUE self, VALUE midx) { | |
rubyDuckDBLogicalType *ctx; | |
VALUE mname; | |
const char *member_name; | |
idx_t idx = NUM2ULL(midx); | |
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | |
idx_t count = duckdb_union_type_member_count(ctx->logical_type); | |
if (idx >= count) { | |
rb_raise(eDuckDBError, "index %llu out of bounds (member count: %llu)", | |
(unsigned long long)idx, (unsigned long long)count); | |
} | |
member_name = duckdb_union_type_member_name(ctx->logical_type, idx); | |
if (member_name == NULL) { | |
rb_raise(eDuckDBError, "fail to get name of %llu member", (unsigned long long)idx); | |
} | |
mname = rb_str_new_cstr(member_name); | |
duckdb_free((void *)member_name); | |
return mname; | |
} |
static VALUE duckdb_logical_type_member_type_at(VALUE self, VALUE midx) { | ||
rubyDuckDBLogicalType *ctx; | ||
duckdb_logical_type union_member_type; | ||
idx_t idx = NUM2ULL(midx); | ||
|
||
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | ||
|
||
union_member_type = duckdb_union_type_member_type(ctx->logical_type, idx); | ||
if (union_member_type == NULL) { | ||
rb_raise(eDuckDBError, | ||
"Failed to get the union member type at index %llu", | ||
(unsigned long long)idx); | ||
} | ||
|
||
return rbduckdb_create_logical_type(union_member_type); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add index validation to prevent out-of-bounds access.
Similar to member_name_at, this function should validate that the index is within bounds before accessing the member type.
static VALUE duckdb_logical_type_member_type_at(VALUE self, VALUE midx) {
rubyDuckDBLogicalType *ctx;
duckdb_logical_type union_member_type;
idx_t idx = NUM2ULL(midx);
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx);
+ idx_t count = duckdb_union_type_member_count(ctx->logical_type);
+ if (idx >= count) {
+ rb_raise(eDuckDBError, "index %llu out of bounds (member count: %llu)",
+ (unsigned long long)idx, (unsigned long long)count);
+ }
+
union_member_type = duckdb_union_type_member_type(ctx->logical_type, idx);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
static VALUE duckdb_logical_type_member_type_at(VALUE self, VALUE midx) { | |
rubyDuckDBLogicalType *ctx; | |
duckdb_logical_type union_member_type; | |
idx_t idx = NUM2ULL(midx); | |
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | |
union_member_type = duckdb_union_type_member_type(ctx->logical_type, idx); | |
if (union_member_type == NULL) { | |
rb_raise(eDuckDBError, | |
"Failed to get the union member type at index %llu", | |
(unsigned long long)idx); | |
} | |
return rbduckdb_create_logical_type(union_member_type); | |
} | |
static VALUE duckdb_logical_type_member_type_at(VALUE self, VALUE midx) { | |
rubyDuckDBLogicalType *ctx; | |
duckdb_logical_type union_member_type; | |
idx_t idx = NUM2ULL(midx); | |
TypedData_Get_Struct(self, rubyDuckDBLogicalType, &logical_type_data_type, ctx); | |
idx_t count = duckdb_union_type_member_count(ctx->logical_type); | |
if (idx >= count) { | |
rb_raise(eDuckDBError, "index %llu out of bounds (member count: %llu)", | |
(unsigned long long)idx, (unsigned long long)count); | |
} | |
union_member_type = duckdb_union_type_member_type(ctx->logical_type, idx); | |
if (union_member_type == NULL) { | |
rb_raise(eDuckDBError, | |
"Failed to get the union member type at index %llu", | |
(unsigned long long)idx); | |
} | |
return rbduckdb_create_logical_type(union_member_type); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. LGTM
refs: GH-690
In this PR, we support interfaces for Union logical type in DuckDB::Column#logical_type about the following.
This is one of the steps for supporting duckdb_logical_column_type C API.
Summary by CodeRabbit