-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
385 chained joins - possible solution #723
Conversation
Codecov Report
@@ Coverage Diff @@
## master #723 +/- ##
==========================================
+ Coverage 95.52% 95.59% +0.07%
==========================================
Files 67 68 +1
Lines 1184 1204 +20
Branches 39 34 -5
==========================================
+ Hits 1131 1151 +20
Misses 53 53
Flags with carried forward coverage won't be shown. Click here to find out more.
|
* @tparam T the type of ds | ||
* @tparam U the type of other | ||
*/ | ||
case class ChainedJoinOps[T, U](ds: TypedDataset[T], other: TypedDataset[U]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non case class
@@ -15,4 +16,8 @@ trait FramelessSyntax { | |||
implicit class DataframeSyntax(self: DataFrame){ | |||
def unsafeTyped[T: TypedEncoder]: TypedDataset[T] = TypedDataset.createUnsafe(self) | |||
} | |||
|
|||
implicit class ChainedJoinSyntax[T](ds: TypedDataset[T]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have to think about, not fond of adding automagic implicit for join use cases
Hi, Would you mind to add as first commit (before any change) a failing test corresponding to #385 ? |
hihi, there isn't a "failing" case - it's purely cosmetic line-saver sugar to remove the current requirement of introducing an extra "val" for each join and allowing a dataset to be defined in-line e.g. .join(load_data_here).jointype((left, right) => ... ) |
Then I'm not sure it worth the cost of maintaining each join function in double. |
Aside from you putting some new lines in and a one-line fix from me 3 years ago, the code hasn't been touched for 6 years. I don't suppose it needs much maintenance. That said documenting to use a generic tap / thrush would allow the primary issue to be solved (that of not chaining) without duplication, it just wouldn't support drop in functions with both left and right datasets. You'd end up with a perhaps less pretty: val leftDs = ...
val midDs = ...
val rightDs = ...
import mouse.any._
val joinedDs = leftDs.joinLeft(midDs)(...)
.thrush( leftWithMid => leftWithMid.joinLeft(rightDs)(leftWithMid(`attrib == ...))
.thrush( other joins ) it's still seems a niche issue though, I can't find anything on stack but I'll ask one of the people who original raised a comment if that's not solving the issue for them. |
The preferred resolution is to document for me. |
ok, given I just got confirmation from one of the people on the issue that it was about the ugliness and added difficulty in reasoning about it with 5+ joins, advising to use a thrush s probably enough, if not as fluent. Happy for you to close the PR / issue on that basis |
#385 - possible solution covering most uses I can imagine, seriously not an issue if it's not desired/fitting the aesthetic, the itch simply needed scratching