Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix:
collect_columns
quadratic complexity (apache#11843)
Fix accidental $O(n^2)$ in `collect_columns`. There are the following ways to insert a clone into a hash set: - **clone before check:** Basically `set.insert(x.clone())`. That's rather expensive if you have a lot of duplicates. - **iter & clone:** That's what we do right now, but that's in $O(n^2)$. - **check & insert:** Basically `if !set.contains(x) {set.insert(x.clone())}` That requires two hash probes though. - **entry-based API:** Sadly the stdlib doesn't really offer any handle/entry-based APIs yet (see rust-lang/rust#60896 ), but `hashbrown` does, so we can use the nice `set.get_or_insert_owned(x)` which will only clone the reference if it doesn't exists yet and only hashes once. We now use the last approach.
- Loading branch information