Pseudocode

purplesyringa · Dec 18, 2024 · ced1da1
1 parent aad805d
commit ced1da1

Showing 4 changed files with 7 additions and 7 deletions.
diff --git a/blog/feed.rss b/blog/feed.rss
@@ -7,7 +7,7 @@
 		<copyright>Alisa Sireneva, CC BY</copyright>
 		<managingEditor>[email protected] (Alisa Sireneva)</managingEditor>
 		<webMaster>[email protected] (Alisa Sireneva)</webMaster>
-		<lastBuildDate>Wed, 18 Dec 2024 22:15:20 GMT</lastBuildDate>
+		<lastBuildDate>Wed, 18 Dec 2024 22:31:08 GMT</lastBuildDate>
 		<docs>https://www.rssboard.org/rss-specification</docs>
 		<ttl>60</ttl>
 		<atom:link href="https://purplesyringa.moe/blog/feed.rss" rel="self" type="application/rss+xml" />
@@ -16,7 +16,7 @@
 				<title>The RAM myth</title>
 				<link>https://purplesyringa.moe/blog/./the-ram-myth/</link>
 				<description>The RAM myth is a belief that modern computer memory resembles perfect random-access memory. Cache is seen as an optimization for small data: if it fits in L2, it’s going to be processed faster; if it doesn’t, there’s nothing we can do.
-Most likely, you believe that pseudocode like this is the fastest way to shard data:
+Most likely, you believe that code like this is the fastest way to shard data (I’m using Python as pseudocode; pretend I used your favorite low-level language):
 groups = [[] for _ in range(n_groups)]
 for element in elements:
     groups[element.group].append(element)

diff --git a/blog/index.html b/blog/index.html
@@ -1,4 +1,4 @@
-<!doctypehtml><meta charset=utf-8><meta content=width=device-width,initial-scale=1 name=viewport><title>purplesyringa's blog</title><link href="{{ root }}/favicon.ico?v=2"rel=icon><link href=../all.css rel=stylesheet><link href=../blog.css rel=stylesheet><link href=../vendor/Temml-Local.css rel=stylesheet><link crossorigin href=https://fonts.googleapis.com/css2?family=Noto+Sans:ital,wght@0,100..900;1,100..900&family=Roboto+Mono:ital,wght@0,100..700;1,100..700&family=Roboto:ital,wght@0,400;0,700;1,400;1,700&family=Slabo+27px&display=swap rel=stylesheet><link href=../fonts/webfont.css rel=stylesheet><link media="screen and (prefers-color-scheme: dark"href=../vendor/atom-one-dark.min.css rel=stylesheet><link media="screen and (prefers-color-scheme: light"href=../vendor/a11y-light.min.css rel=stylesheet><link title="Blog posts"href=/blog/feed.rss rel=alternate type=application/rss+xml><script data-website-id=0da1961d-43f2-45cc-a8e2-75679eefbb69 defer src=https://zond.tei.su/script.js></script><body><header><div class=viewport-container><div class=media><a href=https://github.com/purplesyringa><img alt=GitHub src=../images/github-mark-white.svg></a></div><h1><a href=/>purplesyringa</a></h1><nav><a href=..>about</a><a class=current href>blog</a><a href=../sink/>kitchen sink</a></nav></div></header><section><div class=viewport-container><p><a href=/blog/feed.rss><i class="nf nf-fa-rss_square"title=RSS></i> Subscribe to RSS</a><div class=post-entry><h2><a href=./the-ram-myth/>The RAM myth</a></h2><time>December 19, 2024</time><p>The RAM myth is a belief that modern computer memory resembles perfect random-access memory. 
Cache is seen as an optimization for small data: if it fits in L2, it’s going to be processed faster; if it doesn’t, there’s nothing we can do.<p>Most likely, you believe that pseudocode like this is the fastest way to shard data:<pre><code class=language-python>groups = [[] <span class=hljs-keyword>for</span> _ <span class=hljs-keyword>in</span> <span class=hljs-built_in>range</span>(n_groups)]
+<!doctypehtml><meta charset=utf-8><meta content=width=device-width,initial-scale=1 name=viewport><title>purplesyringa's blog</title><link href="{{ root }}/favicon.ico?v=2"rel=icon><link href=../all.css rel=stylesheet><link href=../blog.css rel=stylesheet><link href=../vendor/Temml-Local.css rel=stylesheet><link crossorigin href=https://fonts.googleapis.com/css2?family=Noto+Sans:ital,wght@0,100..900;1,100..900&family=Roboto+Mono:ital,wght@0,100..700;1,100..700&family=Roboto:ital,wght@0,400;0,700;1,400;1,700&family=Slabo+27px&display=swap rel=stylesheet><link href=../fonts/webfont.css rel=stylesheet><link media="screen and (prefers-color-scheme: dark"href=../vendor/atom-one-dark.min.css rel=stylesheet><link media="screen and (prefers-color-scheme: light"href=../vendor/a11y-light.min.css rel=stylesheet><link title="Blog posts"href=/blog/feed.rss rel=alternate type=application/rss+xml><script data-website-id=0da1961d-43f2-45cc-a8e2-75679eefbb69 defer src=https://zond.tei.su/script.js></script><body><header><div class=viewport-container><div class=media><a href=https://github.com/purplesyringa><img alt=GitHub src=../images/github-mark-white.svg></a></div><h1><a href=/>purplesyringa</a></h1><nav><a href=..>about</a><a class=current href>blog</a><a href=../sink/>kitchen sink</a></nav></div></header><section><div class=viewport-container><p><a href=/blog/feed.rss><i class="nf nf-fa-rss_square"title=RSS></i> Subscribe to RSS</a><div class=post-entry><h2><a href=./the-ram-myth/>The RAM myth</a></h2><time>December 19, 2024</time><p>The RAM myth is a belief that modern computer memory resembles perfect random-access memory. 
Cache is seen as an optimization for small data: if it fits in L2, it’s going to be processed faster; if it doesn’t, there’s nothing we can do.<p>Most likely, you believe that code like this is the fastest way to shard data (I’m using Python as pseudocode; pretend I used your favorite low-level language):<pre><code class=language-python>groups = [[] <span class=hljs-keyword>for</span> _ <span class=hljs-keyword>in</span> <span class=hljs-built_in>range</span>(n_groups)]
 <span class=hljs-keyword>for</span> element <span class=hljs-keyword>in</span> elements:
     groups[element.group].append(element)
 </code></pre><p>Indeed, it’s linear (i.e. asymptotically optimal), and we have to access random indices anyway, so cache isn’t going to help us in any case.<p>In reality, this is leaving a lot of performance on the table, and certain <em>asymptotically slower</em> algorithms can perform sharding significantly faster on large input. They are mostly used by on-disk databases, but, surprisingly, they are useful even for in-RAM data.<p><a href=./the-ram-myth/>Keep reading</a></div><div class=post-entry><h2><a href=./thoughts-on-rust-hashing/>Thoughts on Rust hashing</a></h2><time>December 12, 2024</time><a class=discussion href=https://www.reddit.com/r/rust/comments/1hclif3/thoughts_on_rust_hashing/><i class="nf nf-md-comment"title=Comment></i> Reddit</a><a class=discussion href=https://internals.rust-lang.org/t/low-latency-hashing/22010><i class="nf nf-md-comment"title=Comment></i> IRLO</a><p>In languages like Python, Java, or C++, values are hashed by calling a “hash me” method on them, implemented by the type author. This fixed-hash size is then immediately used by the hash table or what have you. This design suffers from some obvious problems, like:<p>How do you hash an integer? If you use a no-op hasher (booo), DoS attacks on hash tables are inevitable. If you hash it thoroughly, consumers that only cache hashes to optimize equality checks lose out of performance.<p><a href=./thoughts-on-rust-hashing/>Keep reading</a></div><div class=post-entry><h2><a href=./any-python-program-fits-in-24-characters/>Any Python program fits in 24 characters*</a></h2><time>November 17, 2024</time><p><em>* If you don’t take whitespace into account.</em><p>My friend challenged me to find the shortest solution to a certain Leetcode-style problem in Python. They were generous enough to let me use whitespace for free, so that the code stays readable. 
So that’s exactly what we’ll abuse to encode <em>any</em> Python program in <eq><math><mn>24</mn></math></eq> bytes, ignoring whitespace.<p><a href=./any-python-program-fits-in-24-characters/>Keep reading</a></div><div class=post-entry><h2><a href=./the-rust-trademark-policy-is-still-harmful/>The Rust Trademark Policy is still harmful</a></h2><time>November 10, 2024</time><a class=discussion href=https://www.reddit.com/r/rust/comments/1gnz5sm/the_rust_trademark_policy_is_still_harmful/><i class="nf nf-md-comment"title=Comment></i> Reddit</a><p>Four days ago, the Rust Foundation <a href=https://blog.rust-lang.org/2024/11/06/trademark-update.html>released</a> a new <a href=https://drive.google.com/file/d/1hjTx11Fb-4W7RQLmp3R8BLDACc7zxIpG/view>draft</a> of the Rust Language Trademark Policy. The previous draft caused division within the community several years ago, prompting its retraction with the aim of creating a new, milder version.<p>Well, that failed. While certain issues were addressed (thank you, we appreciate it!), the new version remains excessively restrictive and, in my opinion, will harm both the Rust community as a whole <em>and</em> compiler and crate developers. 
While I expect the stricter rules to not be enforced in practice, I don’t want to constantly feel like I’m under threat while contributing to the Rust ecosystem, and this is exactly what it would feel like if this draft is finalized.<p>Below are some of my core objections to the draft.<p><a href=./the-rust-trademark-policy-is-still-harmful/>Keep reading</a></div><div class=post-entry><h2><a href=./bringing-faster-exceptions-to-rust/>Bringing faster exceptions to Rust</a></h2><time>November 6, 2024</time><a class=discussion href=https://www.reddit.com/r/rust/comments/1gl050z/bringing_faster_exceptions_to_rust/><i class="nf nf-md-comment"title=Comment></i> Reddit</a><p>Three months ago, I wrote about why <a href=../you-might-want-to-use-panics-for-error-handling/>you might want to use panics for error handling</a>. Even though it’s a catchy title, panics are hardly suited for this goal, even if you try to hack around with macros and libraries. The real star is <em>the unwinding mechanism</em>, which powers panics. This post is the first in a series exploring what unwinding is, how to speed it up, and how it can benefit Rust and C++ programmers.<p><a href=./bringing-faster-exceptions-to-rust/>Keep reading</a></div><div class=post-entry><h2><a href=./we-built-the-best-bad-apple-in-minecraft/>We built the best "Bad Apple!!" in Minecraft</a></h2><time>October 10, 2024</time><a class=discussion href=https://news.ycombinator.com/item?id=41798369><i class="nf nf-md-comment"title=Comment></i> Hacker News</a><p>Demoscene is the art of pushing computers to perform tasks they weren’t designed to handle. One recurring theme in demoscene is the shadow-art animation “Bad Apple!!”. 
We’ve played it on the Commodore 64, <a href=https://en.wikipedia.org/wiki/Vectrex>Vectrex</a> (a unique game console utilizing only vector graphics), <a href=https://www.youtube.com/watch?v=SDvk3aL78fI>Impulse Tracker</a>, and even <a href=https://tasvideos.org/6012M>exploited Super Mario Bros.</a> to play it.<p>But how about Bad Apple!!.. in Minecraft?<p><a href=./we-built-the-best-bad-apple-in-minecraft/>Keep reading</a></div><div class=post-entry><h2><a href=ru/minecraft-compares-arrays-in-cubic-time/>Minecraft сравнивает массивы за куб</a></h2><time>September 14, 2024</time><a class=discussion href=https://t.me/alisa_rummages/156><i class="nf nf-md-comment"title=Comment></i> Telegram</a><p>Коллизии в играх обнаруживаются тяжелыми алгоритмами. Для примера попробуйте представить себе, насколько сложно это для просто двух произвольно повернутых кубов в пространстве. Они могут контактировать двумя ребрами, вершиной и гранью или еще как-то более сложно.<p>В майнкрафте вся геометрия хитбоксов параллельна осям координат, т.е. наклона не бывает. Это сильно упрощает поиск коллизий.<p>Я бы такое писала просто. Раз хитбокс блока — это объединение нескольких параллелепипедов, то можно его так и хранить: как список 6-элементных тьюплов. В подавляющем большинстве случаев этот список будет очень коротким. Для обычных кубов его длина — 1, для стеклопаналей может достигать 2, наковальня, о боги, состоит из 3 элементов, а стены могут иметь их аж целых 4. Для проверки хитбоксов на пересечение достаточно перебрать пары параллелепипедов двух хитбоксов (кажется, их может быть максимум 16). 
Для параллелепипедов с параллельными осями задача решается тривиально.<p>Но Minecraft JE писала не я, поэтому там реализация иная.<p><a href=ru/minecraft-compares-arrays-in-cubic-time/>Keep reading</a></div><div class=post-entry><h2><a href=./webp-the-webpage-compression-format/>WebP: The WebPage compression format</a></h2><time>September 7, 2024</time><a class=discussion href=https://news.ycombinator.com/item?id=41475124><i class="nf nf-md-comment"title=Comment></i> Hacker News</a><a class=discussion href=https://www.reddit.com/r/programming/comments/1fb5pzh/webp_the_webpage_compression_format/><i class="nf nf-md-comment"title=Comment></i> Reddit</a><a class=discussion href=https://lobste.rs/s/t81n2g/webp_webpage_compression_format><i class="nf nf-md-comment"title=Comment></i> Lobsters</a><a class=discussion href=https://habr.com/ru/articles/841754/><i class="nf nf-md-translate"title=Translation></i> Russian</a><p>I want to provide a smooth experience to my site visitors, so I work on accessibility and ensure it works without JavaScript enabled. I care about page load time because some pages contain large illustrations, so I minify my HTML.<p>But one <em>thing</em> makes turning my blog light as a feather a pain in the ass.<p><a href=./webp-the-webpage-compression-format/>Keep reading</a></div><div class=post-entry><h2><a href=./division-is-hard-but-it-does-not-have-to-be/>Division is hard, but it doesn't have to be</a></h2><time>August 24, 2024</time><a class=discussion href=https://www.reddit.com/r/programming/comments/1f0n8sk/division_is_hard_but_it_doesnt_have_to_be/><i class="nf nf-md-comment"title=Comment></i> Reddit</a><p>Developers don’t usually divide numbers all the time, but hashmaps often need to compute <a href=https://en.wikipedia.org/wiki/Remainder>remainders</a> modulo a prime. Hashmaps are really common, so fast division is useful.<p>For instance, rolling hashes might compute <code>u128 % u64</code> with a fixed divisor. 
Compilers just drop the ball here:<pre><code class=language-rust><span class=hljs-keyword>fn</span> <span class="hljs-title function_">modulo</span>(n: <span class=hljs-type>u128</span>) <span class=hljs-punctuation>-></span> <span class=hljs-type>u64</span> {