store: track HashPosition for first and last elements #134

Merged: 6 commits, Nov 15, 2023

dktapps (Member) commented Nov 13, 2023

this produces a huge performance improvement for queues with large internal tables.

an internal table of large size may appear if the array had lots of elements inserted into it and later deleted. this resulted in major performance losses for the reader of the elements, as zend_hash_internal_pointer_reset_ex() had to scan through many IS_UNDEF offsets to find the actual first element.
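To make the cost concrete, here is a minimal sketch of the scan described above. It is a toy model in plain C, not the Zend structures: `toy_slot` and `toy_first_live` are hypothetical names, and the `used` flag only stands in for a slot that is or is not IS_UNDEF.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a hashtable's bucket array: a slot is either live or a
 * tombstone (analogous to IS_UNDEF). These are not the real Zend types. */
typedef struct {
    int used;    /* 0 = tombstone, 1 = live */
    long value;
} toy_slot;

/* Linear scan for the first live slot. Returns its index, or n if no slot
 * is live. *visited receives the number of slots that were examined. */
static size_t toy_first_live(const toy_slot *slots, size_t n, size_t *visited)
{
    assert(slots != NULL || n == 0);
    size_t i = 0;
    while (i < n && !slots[i].used) {
        i++;
    }
    if (visited != NULL) {
        *visited = (i < n) ? i + 1 : n;
    }
    return i;
}
```

If every read starts from slot 0, a table with many leading tombstones pays this scan on every read, which is the behaviour the description attributes to zend_hash_internal_pointer_reset_ex().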

there are two ways to attack this problem:

1) reallocate the internal table as elements are deleted to reduce the internal table size - this proved to be relatively ineffective
2) track the start and end of the hashtable to avoid repeated scans during every shift() call - this is the approach taken in this commit, and provides major performance benefits
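The second option can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from this commit: `toy_queue`, `toy_push`, `toy_shift`, `toy_remove_at` and the `probes` counter are all hypothetical, and a fixed-size array stands in for the real hashtable. The cached `first` and `last` play the role of the tracked HashPosition values, and are invalidated (set to `TOY_NONE`) rather than repaired when an arbitrary removal hits them.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 4096
#define TOY_NONE ((size_t)-1)   /* cached position unknown or absent */

typedef struct {
    unsigned char used[TOY_CAP]; /* only slots below n_used are ever read */
    long value[TOY_CAP];
    size_t n_used;   /* slots written so far, tombstones included */
    size_t n_live;   /* live elements */
    size_t first;    /* cached index of first live slot, or TOY_NONE */
    size_t last;     /* cached index of last live slot, or TOY_NONE */
    size_t probes;   /* slots examined while searching (instrumentation) */
} toy_queue;

static void toy_init(toy_queue *q)
{
    q->n_used = q->n_live = q->probes = 0;
    q->first = q->last = TOY_NONE;
}

/* Append at the end; the new slot is always the last live one. */
static int toy_push(toy_queue *q, long v)
{
    if (q->n_used == TOY_CAP) return 0;
    size_t idx = q->n_used++;
    q->used[idx] = 1;
    q->value[idx] = v;
    if (q->n_live++ == 0) q->first = idx; /* sole element is also the first */
    q->last = idx;
    return 1;
}

/* Find the first live slot at or after `from`, counting probes. */
static size_t toy_scan(toy_queue *q, size_t from)
{
    for (size_t i = from; i < q->n_used; i++) {
        q->probes++;
        if (q->used[i]) return i;
    }
    return TOY_NONE;
}

/* Remove the front element and store it in *out; returns 0 if empty. */
static int toy_shift(toy_queue *q, long *out)
{
    if (q->n_live == 0) return 0;
    if (q->first == TOY_NONE) q->first = toy_scan(q, 0); /* cache miss */
    assert(q->first != TOY_NONE && q->used[q->first]);
    size_t idx = q->first;
    *out = q->value[idx];
    q->used[idx] = 0;
    if (--q->n_live == 0) {
        q->first = q->last = TOY_NONE;
    } else {
        q->first = toy_scan(q, idx + 1); /* resume after the removed slot */
    }
    return 1;
}

/* Remove an arbitrary slot; invalidate the cache instead of updating it. */
static int toy_remove_at(toy_queue *q, size_t idx)
{
    if (idx >= q->n_used || !q->used[idx]) return 0;
    q->used[idx] = 0;
    q->n_live--;
    if (idx == q->first) q->first = TOY_NONE;
    if (idx == q->last) q->last = TOY_NONE;
    return 1;
}
```

In this model, pushing 1000 elements and shifting them all examines 999 slots in total, because each shift resumes from the cached position instead of rescanning from slot 0. A full rescan only happens after `toy_remove_at` has invalidated the cached position.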

the test case written in #42 now runs to completion substantially faster, without any performance degradation.

more tests are needed to ensure that this works fully as intended, but I took the safe route of invalidating the tracked offsets rather than updating them in place, so I think it should be good.

…are no elements ...

as well as modifying the HashPosition and potentially borking it
this caused elements to be removed in the wrong order, or not to be removed at all.
@dktapps dktapps merged commit b2b6100 into fork Nov 15, 2023
40 checks passed
@dktapps dktapps deleted the track-first-last branch November 15, 2023 16:57