Skip to content

11.0.0

Compare
Choose a tag to compare
@bodil bodil released this 10 Jul 21:13
· 284 commits to master since this release
11.0.0
aad8285

Changed

This is a major release with many breaking changes, and is intended to stabilise the API more than to denote that the rewrite is now production ready. You should expect future releases with significant performance improvements as well as additional APIs, but there should be no further major release with breaking changes in the immediate future, barring very serious unforeseen issues.

Specifically, you should expect imminent minor releases with performance improvements for Vector and OrdMap, for which I have a number of known optimisations that remain unimplemented.

Thanks to the several contributors without whom im would not be what it is.

No More Arc

All data structures have been reworked to take values of A: Clone instead of Arc<A>, meaning that there's less performance overhead (as well as mental overhead) when using values that clone cheaply. The performance gain when values are A: Copy is a factor of two or more. It's expected that users should wrap values in Arc themselves when using values which are expensive to clone.

Data structures still use reference counters internally to reference nodes, but values are stored directly in the nodes with no further indirection. This is also good for cache locality.

Data structures now use Rc instead of Arc by default to do reference counting. If you need a thread safe version that implements Send and Sync, you can enable the arc feature on the package to compile with Arc instead.

std::collections Compatible API

The API has been reworked to align more closely with std::collections, favouring mutable operations by default, so that operations that were previously suffixed with _mut are now the standard operations, and immutable operations which return a modified copy have been given different names altogether. In short, all your code using previous versions of this library will no longer work, and if it was relying heavily on immutable operations, it's recommended that you rewrite it to be mutable by preference, but you should generally be able to make it work again by using the new method names for the immutable operations.

Here is a list of the most notable changed method names for maps and sets:

Previous immutable Current immutable Previous mutable Current mutable
insert update insert_mut insert
remove without remove_mut remove
pop extract pop_mut remove

You should expect to be able to rewrite code using std::collections::HashMap and std::collections::BTreeMap with minimal or no changes using im::HashMap and im::OrdMap respectively.

Vector has been completely rewritten and has an API that aligns closely with std::collections::VecDeque, with very few immutable equivalents. It's expected that you should use Vector::clone() to take a snapshot when you need it rather than cause an implicit clone for each operation. (It's still O(1) and practically instant.)

I'm considering adding back some of the immutable operations if I can come up with good names for them, but for now, just clone it if you need it.

RRB Vector

Vector is now implemented as an RRB tree with smart head/tail chunking, obsoleting the previous Hickey trie implementation.

RRB trees have generally similar performance characteristics to the Hickey trie, with the added benefit of having O(log n) splitting and concatenation.

Operation RRB tree Hickey trie Vec VecDeque
Push front O(1)* O(log n) O(n) O(1)*
Push back O(1)* O(log n) O(1)* O(1)*
Pop front O(1)* O(log n) O(n) O(1)*
Pop back O(1)* O(log n) O(1) O(1)*
Lookup by index O(log n) O(log n) O(1) O(1)
Split O(log n) O(log n) O(n) O(n)
Join O(log n) O(n) O(n) O(n)

(Please note that the timings above are for the im version of the Hickey trie, based on the Immutable.js implementation, which performs better than the original Clojure version on splits and push/pop front, but worse on push/pop back).

The RRB tree is the most generally efficient list like data structure currently known, to my knowledge, but obviously it does not and cannot perform as well as a simple Vec on certain operations. It makes up for that by having no operations you need to worry about the performance complexity of: nothing you can do to an RRB tree is going to be more expensive than just iterating over it. For larger data sets, being able to concatenate (and, by extension, insert and remove at arbitrary locations) several orders of magnitude faster than Vec could also be considered a selling point.

No More CatList And ConsList

CatList has been superseded by Vector, and ConsList was generally not very useful except in the more peculiar edge cases where memory consumption matters more than performance, and keeping it in line with current API changes wasn't practical.

No More Funny Words

Though it breaks my heart, words like cons, snoc, car, cdr and uncons are no longer used in the im API, to facilitiate closer alignment with std::collections. Even the head/tail pair is gone, though head and last remain as aliases for front and back.