Feature/sorted generic v2 #16

koodeex · 2022-09-11T15:19:14Z

Hello!

I have some feature proposal, that can increase perforamance for sorted array case for few times.

Idea is about changing places for elements of the left array and iterating on both arrays at the same time, we can win a lot of perforamnce cause it makes complexity O(n + x) where n is length of the shortest array and x duplicate cases in the longest array.

n+x always less or equal length of the longest array.

It can be unsafe in case when its important to save order of input. Also it will find duplate matches only for the shortest array too.

Usage example:

	a = []int{1, 2}
	b = []int{2}
	s = SortedGenericV2(a, b, func(i, j int) bool { return a[i] > b[j] })

Output:

[2]

New a's order:

[2, 1]

vanila function benchmarks:

Name	Runs	Average	Allocated	Allocations from heap
BenchmarkSortedGeneric/SortedGeneric_-_1-16	41 944 108	28.25 ns/op	8 B/op	1 allocs/op
BenchmarkSortedGeneric/SortedGeneric_-_10-16	10 445 590	111.6 ns/op	0 B/op	0 allocs/op
BenchmarkSortedGeneric/SortedGeneric_-_100-16	745 248	1 624 ns/op	0 B/op	0 allocs/op
BenchmarkSortedGeneric/SortedGeneric_-_1000-16	38 613	31 097 ns/op	0 B/op	0 allocs/op
BenchmarkSortedGeneric/SortedGeneric_-_10000-16	3 799	308 313 ns/op	0 B/op	0 allocs/op
BenchmarkSortedGeneric/SortedGeneric_-_100000-16	327	3 679 350 ns/op	0 B/op	0 allocs/op

proposal benchmarks:

Name	Runs	Average	Allocations from heap
BenchmarkSortedGeneric/SortedGenericV2_-_1-16	199 261 370	5.992 ns/op	0 allocs/op
BenchmarkSortedGeneric/SortedGenericV2_-_10-16	20 496 318	57.31 ns/op	0 allocs/op
BenchmarkSortedGeneric/SortedGenericV2_-_100-16	1 909 890	626.2 ns/op	0 allocs/op
BenchmarkSortedGeneric/SortedGenericV2_-_1000-16	188 733	6 367 ns/op	0 allocs/op
BenchmarkSortedGeneric/SortedGenericV2_-_10000-16	9 190	134 617 ns/op	0 allocs/op
BenchmarkSortedGeneric/SortedGenericV2_-_100000-16	1 537	783 528 ns/op	0 allocs/op

speed diffs:

size	Average diff, ns/op	Average diff, %
1	-22,258 ns/op	-78,78%
10	-54,29 ns/op	-48,64%
100	-997,8 ns/op	-61,45%
1 000	-24 730 ns/op	-99,97%
10 000	-173 696 ns/op	-56,33%
100 000	-2 895 822 ns/op	-78,70%

As you see, with a little risks (like losing order of input array and weird behaviour for a lot of duplicate matches) you can get a huge performance boost.

P.S. alogorythm idea is not mine, my collegue (@painkuter) shared it with me.

# Conflicts: # Readme.md

koodeex · 2022-10-15T19:47:53Z

Gentle ping

juliangruber · 2022-10-16T09:43:05Z

I haven't had the time to properly look at this PR yet. Maybe some of the previous contributors are interested in reviewing?

@3schwartz @colynn @cmin764 @kalashnikovisme

3schwartz

Thanks for the PR and sorry for the late response.

IMHO having functions with side effects isn't great - if one has ordered arrays and they loose ordering because of a call to intersect it is confusing.

Instead if I were you I would create a new method - one which doesn't expect ordering. Then within the method sort those two arrays (by copying) and then use this algorithm,

It will bring a lot more memory overhead, but I think the usage would be more realistic and hence the benchmark would give a better interpretation of the real gain.

3schwartz · 2023-01-04T19:15:21Z

intersect.go

+// Best case complexity: O(n) where n is length of the shortest array (all values unique)
+// Worst case complexity: O(n) where n is length of the longest array (all values of the longest array are duplicates of intersect match)
+// Warning: Function will change left array order
+func SortedGenericV2[T comparable](a []T, b []T, leftGreater Comparator) []T {


If one always should use a "leftgreater" comparer, then I would give this as a parameter.

3schwartz · 2023-01-04T19:16:43Z

intersect.go

+// SortedGenericV2 has complexity: O(n + x) where n is length of the shortest array and x duplicate cases in the longest array.
+// Best case complexity: O(n) where n is length of the shortest array (all values unique)
+// Worst case complexity: O(n) where n is length of the longest array (all values of the longest array are duplicates of intersect match)
+// Warning: Function will change left array order


In general I'm not fan of side effects - if one has a sorted array, then loosing ordering because of a call to a intersect would be suppressing in my humble opinion.

3schwartz · 2023-01-04T19:17:27Z

intersect_test.go

@@ -63,6 +64,39 @@ func TestHash(t *testing.T) {
 	assert.Equal(t, s, []interface{}{2})
 }

+func TestSortedGenericV2(t *testing.T) {


Seems like two test cases - Please create a data struct which each test and then loop through them.

koodeex added 4 commits September 11, 2022 17:53

add SortedGenericV2 feature proposal

0e97403

Merge branch 'master' into feature/sorted-generic-v2

063f5b5

# Conflicts: # Readme.md

fix method descriptions

bd8084a

remove test method

71a0f04

3schwartz suggested changes Jan 4, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/sorted generic v2 #16

Feature/sorted generic v2 #16

koodeex commented Sep 11, 2022 •

edited

Loading

koodeex commented Oct 15, 2022

juliangruber commented Oct 16, 2022

3schwartz left a comment

3schwartz Jan 4, 2023

3schwartz Jan 4, 2023

3schwartz Jan 4, 2023

Feature/sorted generic v2 #16

Are you sure you want to change the base?

Feature/sorted generic v2 #16

Conversation

koodeex commented Sep 11, 2022 • edited Loading

koodeex commented Oct 15, 2022

juliangruber commented Oct 16, 2022

3schwartz left a comment

Choose a reason for hiding this comment

3schwartz Jan 4, 2023

Choose a reason for hiding this comment

3schwartz Jan 4, 2023

Choose a reason for hiding this comment

3schwartz Jan 4, 2023

Choose a reason for hiding this comment

koodeex commented Sep 11, 2022 •

edited

Loading