-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experiment to replace the large array of Excel function definitions with a FlyWeight Pattern collection of Value Objects #2714
base: master
Are you sure you want to change the base?
Conversation
3dff948
to
e6bbb7a
Compare
@oleibman - Thought you might like to take a look at this, replacing the large array of Excel function signatures in Calculation with a collection of Value Objects that's lazy-loaded as functions are referenced. Entries in the collection; and properties in the VOs can be accessed either as array elements or as object properties. It's still not 100% yet, there's still some work I need to do on it; but it does seem to show both memory and speed improvements on the test suite (despite the one test that loads every single VO in the collection)... at least there's lower memory usage for PHP 7.4 and PHP 8.0; but all versions runs in the CI tests seem faster. How that will translate into real-world end-user performance, I don't know; but it should still be better. |
…n signature definitions with a structured collection of Value Objects to see what affect it has on memory usage and speed. This isn't a perfect implementation yet: but it's enough to take a look at the memory usage and speed.
e6bbb7a
to
3878e32
Compare
@MarkBaker, I will certainly take a look. However, I will be offline for personal reasons for about a week, so will probably not be able to get back to you before late next week. |
@MarkBaker, as usual I am learning so much from reviewing this. It will take a while for me to put it all together. For now, I do have one question. You have moved the initialization of the large array |
Also possible: The new array is a lot smaller than the original array; but it does still use a not insignificant amount of memory.... it took me a while to get this working "as is", but a version of the I'll have a go at that on a separate branch next week, so that we can compare memory/speed results for both variations. Given that only a small number of Excel functions are typically used in any given spreadsheet; but that those few functions are often used repeatedly, it did seem a good target for memory optimisation; though I was a bit concerned about the performance cost of the additional "load on demand"... that pattern of real world demand should see both memory and performance benefits, even if our test suite doesn't fully reflect that. Though I was pleased that there didn't seem any real performance overhead with the change, and the memory saving is visible in the unit test runs, even though there are tests there that would load every function definition. |
Just as a warning: the biggest issue that I had with this was the large number of additional classes slowing PHPStorm's re-indexing, and making it slower overall... I doubt that would be a problem once I switch to my new laptop, with more memory and a faster processor, but it is a maintenance overhead. |
And if I do start from an initial empty array, I'll also have to play around a bit with the logic that builds the list of implemented functions when we do a release; because that uses the array keys as its base data; but I'm sure I can work out an alternative reading through the Engine/Functions folder to handle that. Slower release isn't real problem if we're improving run-time performance. |
- TEXTBEFORE – Returns text that’s before delimiting characters - TEXTAFTER – Returns text that’s after delimiting character - TEXTSPLIT – Splits text into rows or columns using delimiters - VSTACK – Stacks arrays vertically - HSTACK – Stacks arrays horizontally - TOROW – Returns the array as one row - TOCOL – Returns the array as one column - WRAPROWS – Wraps a row array into a 2D array - WRAPCOLS – Wraps a column array into a 2D array - TAKE – Returns rows or columns from array start or end - DROP – Drops rows or columns from array start or end - CHOOSEROWS – Returns the specified rows from an array - CHOOSECOLS – Returns the specified columns from an array - EXPAND – Expands an array to the specified dimensions
This is:
Checklist:
Why this change is needed?
Experiment to replace the large "unstructured" array of Excel function signature definitions with a structured collection of Value Objects to see what affect it has on memory usage and speed.
This isn't a perfect implementation yet: but it's enough to take a look at the memory usage and speed of the new solution.
Hopefully, this approach will also make Excel function definitions/implementations easier to maintain in future, and potentially allow new benefits such as defining the Excel version where they became available, and also if Microsoft have deprecated them.