Feature/sorted generic v2 #16
base: master
Gentle ping
I haven't had the time to properly look at this PR yet. Maybe some of the previous contributors are interested in reviewing?
Thanks for the PR and sorry for the late response.
IMHO having functions with side effects isn't great: if someone has ordered arrays and they lose their ordering because of a call to intersect, that is confusing.
Instead, if I were you I would create a new method, one which doesn't expect ordering. Within that method, sort the two arrays (by copying them first) and then use this algorithm.
It will bring a lot more memory overhead, but I think the usage would be more realistic and hence the benchmark would give a better picture of the real gain.
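To make the suggestion above concrete, here is a minimal sketch of a non-mutating variant. It assumes Go 1.21+ (for the standard `slices` and `cmp` packages) and element types that satisfy `cmp.Ordered`. The name `IntersectUnsorted` is hypothetical, and the inner loop is a plain two-pointer walk used as a stand-in, not the swap-based algorithm from this PR.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// IntersectUnsorted is a hypothetical name, not part of the PR.
// It copies both inputs, sorts the copies, and intersects them with a
// plain two-pointer walk, so the callers' slices keep their order.
func IntersectUnsorted[T cmp.Ordered](a, b []T) []T {
	as := slices.Clone(a)
	bs := slices.Clone(b)
	slices.Sort(as)
	slices.Sort(bs)

	out := []T{}
	i, j := 0, 0
	for i < len(as) && j < len(bs) {
		switch c := cmp.Compare(as[i], bs[j]); {
		case c < 0:
			i++
		case c > 0:
			j++
		default:
			// Emit each common value once, then skip its duplicates on both sides.
			v := as[i]
			out = append(out, v)
			for i < len(as) && cmp.Compare(as[i], v) == 0 {
				i++
			}
			for j < len(bs) && cmp.Compare(bs[j], v) == 0 {
				j++
			}
		}
	}
	return out
}

func main() {
	a := []int{5, 1, 3, 3, 9}
	b := []int{3, 9, 7, 3}
	fmt.Println(IntersectUnsorted(a, b)) // prints [3 9]
	fmt.Println(a)                       // prints [5 1 3 3 9]: input order is unchanged
}
```

The two clones and sorts cost extra memory and O(n log n) time, which is the overhead mentioned above; a benchmark of this wrapper would therefore include that cost rather than only the intersection walk.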
// Best case complexity: O(n) where n is length of the shortest array (all values unique)
// Worst case complexity: O(n) where n is length of the longest array (all values of the longest array are duplicates of intersect match)
// Warning: Function will change left array order
func SortedGenericV2[T comparable](a []T, b []T, leftGreater Comparator) []T {
If one should always use a "leftGreater" comparator, then I would give this as a parameter.
// SortedGenericV2 has complexity: O(n + x) where n is length of the shortest array and x duplicate cases in the longest array.
// Best case complexity: O(n) where n is length of the shortest array (all values unique)
// Worst case complexity: O(n) where n is length of the longest array (all values of the longest array are duplicates of intersect match)
// Warning: Function will change left array order
In general I'm not a fan of side effects: if one has a sorted array, then losing its ordering because of a call to an intersect would be surprising, in my humble opinion.
@@ -63,6 +64,39 @@ func TestHash(t *testing.T) {
assert.Equal(t, s, []interface{}{2})
}

func TestSortedGenericV2(t *testing.T) {
This looks like two test cases. Please create a data struct that holds each test case and then loop through them.
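A rough sketch of the table-driven layout being asked for, written as a plain program so it runs on its own (Go 1.21+ for `slices.Equal`). The function `intersectSorted`, the `testCase` type, and the sample rows are all hypothetical stand-ins; the real test would call `SortedGenericV2` with its comparator.

```go
package main

import (
	"fmt"
	"slices"
)

// intersectSorted is a hypothetical stand-in for the function under test.
// It expects both inputs sorted ascending and returns each common value once.
func intersectSorted(a, b []int) []int {
	out := []int{}
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			// Append the match unless it repeats the last emitted value.
			if len(out) == 0 || out[len(out)-1] != a[i] {
				out = append(out, a[i])
			}
			i++
			j++
		}
	}
	return out
}

// testCase is one row of the table: a name, two inputs, and the expected result.
type testCase struct {
	name string
	a, b []int
	want []int
}

// cases holds illustrative rows only; they are not taken from the PR.
var cases = []testCase{
	{name: "unique values", a: []int{1, 2, 3}, b: []int{2, 3, 4}, want: []int{2, 3}},
	{name: "duplicates", a: []int{2, 2, 5}, b: []int{2, 2, 2, 5}, want: []int{2, 5}},
	{name: "no overlap", a: []int{1}, b: []int{2}, want: []int{}},
}

// failures runs every row and reports the ones whose result differs from want.
func failures() []string {
	var failed []string
	for _, tc := range cases {
		got := intersectSorted(tc.a, tc.b)
		if !slices.Equal(got, tc.want) {
			failed = append(failed, fmt.Sprintf("%s: got %v, want %v", tc.name, got, tc.want))
		}
	}
	return failed
}

func main() {
	fmt.Println(len(failures()), "failing cases") // prints "0 failing cases"
}
```

In a `_test.go` file the same loop body would typically sit inside `t.Run(tc.name, func(t *testing.T) { ... })`, so each row is reported as its own subtest and new cases only need a new table row.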
Hello!
I have a feature proposal that can improve performance for the sorted-array case several times over.
The idea is to swap elements of the left array in place while iterating over both arrays at the same time. This can win a lot of performance because it makes the complexity O(n + x), where n is the length of the shortest array and x is the number of duplicate cases in the longest array.
n + x is always less than or equal to the length of the longest array.
It can be unsafe when it is important to preserve the order of the input. Also, it will find duplicate matches only for the shortest array.
Usage example:
Output:
New a's order:
vanilla function benchmarks:
proposal benchmarks:
speed diffs:
As you can see, with a few small risks (like losing the order of the input array and odd behaviour when there are many duplicate matches) you can get a huge performance boost.
P.S. The algorithm idea is not mine; my colleague (@painkuter) shared it with me.