Fix input table jpy restriction #5260
Modified the Java object wrapping logic to always wrap with the most specific (deepest) descendant class in the class hierarchy.
Co-authored-by: Chip Kent <[email protected]>
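To illustrate what "wrap with the most specific (deepest) descendant class" can mean, here is a minimal pure-Python sketch. It is not the code from this PR: `FakeJavaObject`, `Wrapper`, the `j_type` attribute, and the three wrapper subclasses are hypothetical stand-ins, and the real implementation works against jpy types rather than strings. The sketch only assumes a depth-first walk over `cls.__subclasses__()` that tries descendants before the class itself.

```python
from typing import Optional, Union


class FakeJavaObject:
    """Hypothetical stand-in for a Java object; java_types lists the Java types it implements."""

    def __init__(self, *java_types: str):
        self.java_types = set(java_types)


class Wrapper:
    """Hypothetical wrapper base class; j_type names the Java type a subclass can wrap."""

    j_type: Optional[str] = None

    def __init__(self, j_obj: FakeJavaObject):
        self.j_obj = j_obj


class TableWrapper(Wrapper):
    j_type = "Table"


class InputTableWrapper(TableWrapper):
    j_type = "InputTable"


class PartitionedTableWrapper(Wrapper):
    j_type = "PartitionedTable"


def wrap_with_subclass(j_obj: FakeJavaObject, cls: type) -> Optional[Wrapper]:
    """Depth-first search: try every descendant of cls before falling back to cls itself."""
    for sub in cls.__subclasses__():
        wrapped = wrap_with_subclass(j_obj, sub)
        if wrapped is not None:
            return wrapped
    if cls.j_type is not None and cls.j_type in j_obj.java_types:
        return cls(j_obj)
    return None


def wrap(j_obj: FakeJavaObject) -> Union[Wrapper, FakeJavaObject]:
    """Return the deepest matching wrapper, or the raw object when nothing matches."""
    wrapped = wrap_with_subclass(j_obj, Wrapper)
    return wrapped if wrapped is not None else j_obj


input_table = FakeJavaObject("Table", "InputTable")
plain_table = FakeJavaObject("Table")
unknown = FakeJavaObject("SomethingElse")
print(type(wrap(input_table)).__name__)
print(type(wrap(plain_table)).__name__)
print(wrap(unknown) is unknown)
```

In this sketch an object that is both a `Table` and an `InputTable` ends up in `InputTableWrapper` rather than `TableWrapper`, because the search visits the subclass first. The cost is that every wrap walks the whole wrapper hierarchy, which is the performance question raised in the review below.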
def wrap_j_object(j_obj: jpy.JType) -> Union[JObjectWrapper, jpy.JType]:
    """ Wraps the specified Java object as an instance of a custom wrapper class if one is available, otherwise returns
    the raw Java object. """
def _wrap_with_subclass(j_obj: jpy.JType, cls: type) -> Optional[JObjectWrapper]:
Presumably, this is slower than the existing behavior. The only place I can imagine where that might matter is something like a PartitionedTable.transform with a Python function. Does the Java constituent Table get wrapped automatically as a Python Table wrapper, and if so, is it slow enough to matter?
@jmao-denver you might need to test this.
No, they are not. When requested (e.g. via constituent_tables), the constituent tables are explicitly wrapped in Table, not through this auto-wrapping facility.
I don't think that addresses my concern. I am talking about when you pass a Python function (adapted to Java) to transform, i.e. deephaven.table.PartitionedTable.transform.
Some quick, minimal testing shows it is ~2-4% slower. The more constituent tables there are, the larger the slowdown, which is expected (more wrapping):
import time

from deephaven import empty_table
from deephaven.execution_context import make_user_exec_ctx
from deephaven.table import Table

# The remainder runs inside a unittest.TestCase method (note the use of self).
def transform_func(t: Table) -> Table:
    return t.update("f = a + b")

t = empty_table(100_000_000).update(["a = i", "b = i + 1", "c = i % 10_000", "d = `text`", "e = i % 1000000"])
pt = t.partition_by(by=["c"])
print("num of constituent tables: ", len(pt.constituent_tables))
print("constituent table size: ", pt.constituent_tables[0].size)
with make_user_exec_ctx():
    # CPU time of the whole transform, in nanoseconds
    st = time.process_time_ns()
    transformed_pt = pt.transform(transform_func)
    print("transform time: ", (time.process_time_ns() - st)/10**9)
    self.assertIsNotNone(transformed_pt)
Before:
num of constituent tables: 10000
constituent table size: 10000
transform time: 42.321798386
num of constituent tables: 1000
constituent table size: 100000
transform time: 20.706249172
After:
num of constituent tables: 10000
constituent table size: 10000
transform time: 44.08636249
num of constituent tables: 1000
constituent table size: 100000
transform time: 21.192759225
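For reference, the relative slowdown implied by these timings can be worked out as below. This is only arithmetic on the numbers printed above; the helper name `slowdown_pct` is made up for the illustration.

```python
# Timings in seconds, copied from the Before/After output above,
# keyed by the number of constituent tables: (before, after).
timings = {
    10_000: (42.321798386, 44.08636249),
    1_000: (20.706249172, 21.192759225),
}


def slowdown_pct(before: float, after: float) -> float:
    """Relative increase in run time, as a percentage of the baseline."""
    return (after - before) / before * 100.0


slowdowns = {n: slowdown_pct(before, after) for n, (before, after) in timings.items()}
for n, pct in slowdowns.items():
    print(f"{n} constituent tables: {pct:.2f}% slower")
```

This gives about 4.17% for 10,000 constituent tables and about 2.35% for 1,000, consistent with the ~2-4% estimate and with the overhead growing with the number of constituents.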
This addressed two of my concerns
Trying to create a custom input table resulted in:
Somehow missed this line in my testing of Core+ custom input tables until a fresh install. Added a basic test for custom input table creation.
Although BaseArrayBackedInputTable is no longer suitable, should we be more specific than Table for the jpy requirement, e.g. "QueryTable"?