Commit
Define n
purplesyringa committed Dec 18, 2024
1 parent 4e3b556 commit aad805d
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion blog/the-ram-myth/index.html
@@ -8,7 +8,7 @@
In reality, this is leaving a lot of performance on the table, and certain asymptotically slower algorithms can perform sharding significantly faster on large input. They are mostly used by on-disk databases, but, surprisingly, they are useful even for in-RAM data."property=og:description><meta content=en_US property=og:locale><meta content="purplesyringa's blog"property=og:site_name><meta content=summary_large_image name=twitter:card><meta content=https://purplesyringa.moe/blog/the-ram-myth/og.png name=twitter:image><script data-website-id=0da1961d-43f2-45cc-a8e2-75679eefbb69 defer src=https://zond.tei.su/script.js></script><body><header><div class=viewport-container><div class=media><a href=https://github.com/purplesyringa><img alt=GitHub src=../../images/github-mark-white.svg></a></div><h1><a href=/>purplesyringa</a></h1><nav><a href=../..>about</a><a class=current href=../../blog/>blog</a><a href=../../sink/>kitchen sink</a></nav></div></header><section><div class=viewport-container><h2>The RAM myth</h2><time>December 19, 2024</time><p>The RAM myth is a belief that modern computer memory resembles perfect random-access memory. Cache is seen as an optimization for small data: if it fits in L2, it’s going to be processed faster; if it doesn’t, there’s nothing we can do.<p>Most likely, you believe that pseudocode like this is the fastest way to shard data:<pre><code class=language-python>groups = [[] <span class=hljs-keyword>for</span> _ <span class=hljs-keyword>in</span> <span class=hljs-built_in>range</span>(n_groups)]
<span class=hljs-keyword>for</span> element <span class=hljs-keyword>in</span> elements:
groups[element.group].append(element)
-</code></pre><p>Indeed, it’s linear (i.e. asymptotically optimal), and we have to access random indices anyway, so cache isn’t going to help us in any case.<p>In reality, this is leaving a lot of performance on the table, and certain <em>asymptotically slower</em> algorithms can perform sharding significantly faster on large input. They are mostly used by on-disk databases, but, surprisingly, they are useful even for in-RAM data.<p class=next-group><span aria-level=3 class=side-header role=heading><span>Solution</span></span>The algorithm from above has <eq><math><mrow><mrow><mi mathvariant=normal>Θ</mi></mrow><mo form=prefix stretchy=false>(</mo><mi>n</mi><mo form=postfix stretchy=false>)</mo></mrow></math></eq> cache misses on random input. The only way to reduce this number is to make the memory accesses more ordered. If you can ensure the elements are ordered by <code>group</code>, that’s great. If you can’t, you can still sort the accesses before the <code>for</code> loop:<pre><code class=language-python>elements.sort(key = <span class=hljs-keyword>lambda</span> element: element.group)
+</code></pre><p>Indeed, it’s linear (i.e. asymptotically optimal), and we have to access random indices anyway, so cache isn’t going to help us in any case.<p>In reality, this is leaving a lot of performance on the table, and certain <em>asymptotically slower</em> algorithms can perform sharding significantly faster on large input. They are mostly used by on-disk databases, but, surprisingly, they are useful even for in-RAM data.<p class=next-group><span aria-level=3 class=side-header role=heading><span>Solution</span></span>The algorithm from above has <eq><math><mrow><mrow><mi mathvariant=normal>Θ</mi></mrow><mo form=prefix stretchy=false>(</mo><mi>n</mi><mo form=postfix stretchy=false>)</mo></mrow></math></eq> cache misses on random input (where <eq><math><mi>n</mi></math></eq> is the number of elements). The only way to reduce this number is to make the memory accesses more ordered. If you can ensure the elements are ordered by <code>group</code>, that’s great. If you can’t, you can still sort the accesses before the <code>for</code> loop:<pre><code class=language-python>elements.sort(key = <span class=hljs-keyword>lambda</span> element: element.group)
</code></pre><p>Sorting costs some time, but in return, removes cache misses from the <code>for</code> loop entirely. If the data is large enough that it doesn’t fit in cache, this is a net win. As a bonus, creating individual lists can be replaced with a group-by operation:<pre><code class=language-python>elements.sort(key = <span class=hljs-keyword>lambda</span> element: element.group)
groups = [
group_elements
2 changes: 1 addition & 1 deletion blog/the-ram-myth/index.md
@@ -34,7 +34,7 @@ In reality, this is leaving a lot of performance on the table, and certain *asym

### Solution

-The algorithm from above has $\Theta(n)$ cache misses on random input. The only way to reduce this number is to make the memory accesses more ordered. If you can ensure the elements are ordered by `group`, that's great. If you can't, you can still sort the accesses before the `for` loop:
+The algorithm from above has $\Theta(n)$ cache misses on random input (where $n$ is the number of elements). The only way to reduce this number is to make the memory accesses more ordered. If you can ensure the elements are ordered by `group`, that's great. If you can't, you can still sort the accesses before the `for` loop:

```python
elements.sort(key = lambda element: element.group)
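
For readers of this diff, the two sharding approaches the changed paragraph compares can be sketched roughly as follows. This is a minimal illustration, not the post's own code: the `Element` dataclass, its `payload` field, and the function names `shard_naive` and `shard_sorted` are assumptions made up for the example. It only shows that sorting by `group` first yields the same shards through ordered accesses. It does not demonstrate the cache effect, which depends on the language, memory layout, and input size.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter


@dataclass
class Element:
    # Hypothetical element type; the post only assumes a `group` attribute.
    group: int
    payload: str


def shard_naive(elements, n_groups):
    """Append each element to its group's list, in input order."""
    groups = [[] for _ in range(n_groups)]
    for element in elements:
        # On random input, consecutive iterations touch unrelated lists.
        groups[element.group].append(element)
    return groups


def shard_sorted(elements, n_groups):
    """Sort by group first, then collect each contiguous run of equal groups."""
    # sorted() is stable, so elements keep their input order within a group.
    ordered = sorted(elements, key=attrgetter("group"))
    groups = [[] for _ in range(n_groups)]
    for key, run in groupby(ordered, key=attrgetter("group")):
        groups[key] = list(run)
    return groups


elements = [
    Element(2, "a"),
    Element(0, "b"),
    Element(2, "c"),
    Element(1, "d"),
]
naive = shard_naive(elements, 4)
by_sort = shard_sorted(elements, 4)
```

Both functions return one list per group, with empty lists for groups that have no elements, so they can be swapped for one another when measuring on real data.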
