This is a collection of stories gathered from this work. These might be interesting problems, things I learned, or simply why I did what I did. These may be outlines or prose.
- error handling. Verbose but appreciate the explicitness
- first class philosophy on cancellation, timeout, and process signaling. Compare cancellation to Java (you interrupt a thread and it then has to handle `InterruptedException`) and Python (...how do you cancel a blocking operation in python?), and signal handling to Java (only SIGTERM and SIGINT, and then via shutdown hooks) and Python (the handler runs in the main thread on top of whatever frame happens to be executing, so exceptions raised from it appear out of thin air and are not safe). I love the philosophy of what I call "cancellation is always cooperative". `sync.WaitGroup` and `errgroup.Group` don't accept a context in `Wait`, forcing you to ensure that the goroutines they are blocking on return quickly.
- great conventions -- context, returning errors, channels/select built in, defer.
- project setup and build was a breeze.
- generics are still clunky. You cannot use a []ConcreteType as a []AbstractInterface, even when ConcreteType implements AbstractInterface, because the memory layouts differ: a slice of concrete structs stores the structs themselves, while a slice of interfaces stores two-word interface values (a pointer to type information alongside a pointer to the data). Makes it fairly challenging to write generic code with containers -- all code must assume interface and never a concrete type.
- macOS, or socket programming in general, is brittle when you push it to its limit.
- ECONNRESET on the first client read after a successful dial. The client connected, but the server side completed the handshake (ACK) and then silently dropped the connection. This means that any retry loop has to extend past the first read, making it hard to isolate retrying the dial from general connection issues. This looks like a macOS issue because its queue for incoming connections is small.
- Cancelling and timing out with Golang sockets is tricky. You can't issue a read/write to a socket with a context. Rather, you have to put a deadline on it (and you can move the deadline into the past after the fact from a different goroutine to force a blocked call to wake up; note that the zero `time.Time` means "no deadline") or else chain closing the connection from a context. Not a huge deal, but it led to an interesting bug where I closed a connection via a context and then tried to use the connection again, which is an error.
- it appears that, on macOS at least, writing and then closing a socket may result in the peer seeing the close before it has dealt with the data, and so it then writes back and receives a RST. See https://cs.baylor.edu/~donahoo/practical/CSockets/TCPRST.pdf. My solution to this was to only ever close the connection after performing a read. Don't write and hang up. The pdf's additional solution is to close each direction of the socket separately as desired. Golang's `net.Conn` interface doesn't expose this, although the concrete `*net.TCPConn` type does have `CloseRead` and `CloseWrite`.
- Great in theory.
- First performance test shows platform threads greatly outperforming virtual threads (roughly 4x), although there is a huge skew in the platform-thread results, which suggests that the echo is so fast that the thread is spin-waiting for a response.
- Structured Concurrency is great in that I no longer have to think about how cancellation works -- I don't have to find a thread to cancel it, and I don't have to cancel a future and then reason about whether that future is associated with a specific thread (and CompletableFutures are a mess). It's still a little clunky to have threads interact. For example, when setting up a test in a StructuredTaskScope with subtasks for the server and the client, I want to start the server, start the client and block on it, and, when the client finishes, gracefully shut down the server. If either results in an error I want to cancel everything immediately. This isn't "first succeeded" because the server shouldn't succeed first. This isn't (just) "first failure" because I also need something to proceed once the client subtask completes. I modelled this as the client task shutting down the whole scope once it completes, which means that neither task actually completes. It works, but it's still a little unsatisfying that there isn't an easier way to let subtasks interact more directly. But I guess that's the point -- when you interact laterally things get complicated. Interacting with only your parent and your children is what gives it structure. Other ways to approach it: (1) "first failure" with a condition variable or CompletableFuture passed from Client to Server to tell it to shut down (doesn't interrupt it though); (2) "first failure" where the Client fails with a specific exception on success; (3) a new subtype of StructuredTaskScope that allows designating which subtasks are expected to finish (maybe a class that implements Runnable and has a field to indicate this). Things get more formal, more verbose, and harder to read when you have to build new classes or subtypes in order to solve a problem. I want simple out-of-the-box solutions.
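
Going back to the Go note that waitgroups don't accept a context when blocking: below is a minimal sketch of one way to bound the wait anyway. The names `waitOrCancel` and `runWorker` are hypothetical and not from the original code. Note the caveat in the comments: abandoning the wait does not stop the goroutines, which is why cancellation still has to be cooperative.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// waitOrCancel blocks until wg.Wait returns or ctx is done, whichever happens first.
// Caveat: if ctx wins, the helper goroutine stays parked in wg.Wait until the
// workers return. This function only stops waiting; it does not stop the workers.
func waitOrCancel(ctx context.Context, wg *sync.WaitGroup) error {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWorker starts one worker that simulates workMs milliseconds of work and
// waits for it for at most timeoutMs milliseconds.
func runWorker(workMs, timeoutMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		select {
		case <-time.After(time.Duration(workMs) * time.Millisecond): // simulated work
		case <-ctx.Done(): // cooperative: the worker itself watches for cancellation
		}
	}()
	return waitOrCancel(ctx, &wg)
}

func main() {
	fmt.Println(runWorker(1, 1000)) // the worker finishes well before the deadline
	fmt.Println(runWorker(1000, 50)) // the deadline fires while the worker is still busy
}
```

The worker watching `ctx.Done()` is what keeps the leaked helper goroutine short-lived; a worker that ignored the context would keep both alive until it finished on its own.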
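
Going back to the Go note on []ConcreteType versus []AbstractInterface: the sketch below shows the two workarounds I know of, copying element by element into a slice of the interface type, or accepting the slice through a type parameter. `Shape`, `Square`, and the function names are hypothetical stand-ins.

```go
package main

import "fmt"

// Shape plays the role of AbstractInterface; Square plays the role of ConcreteType.
type Shape interface{ Area() float64 }

type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

// TotalArea only accepts []Shape. Passing a []Square does not compile.
func TotalArea(shapes []Shape) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

// ToShapes copies a slice of any concrete Shape type into a []Shape.
// Each element is boxed into an interface value, so this is an O(n) copy.
func ToShapes[T Shape](in []T) []Shape {
	out := make([]Shape, len(in))
	for i, v := range in {
		out[i] = v
	}
	return out
}

// TotalAreaOf takes the slice through a type parameter instead, so a
// []Square can be passed directly without copying.
func TotalAreaOf[T Shape](shapes []T) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

func main() {
	squares := []Square{{Side: 1}, {Side: 2}}
	// TotalArea(squares) // does not compile: a []Square is not a []Shape
	fmt.Println(TotalArea(ToShapes(squares)))
	fmt.Println(TotalAreaOf(squares))
}
```

The type-parameter version avoids the copy, but it only helps when every element has the same concrete type; a container of mixed implementations still has to be a slice of the interface.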
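
Going back to the Go note on cancelling socket reads: the deadline approach can be sketched roughly as follows. `readCtx` and `demo` are hypothetical names, and `net.Pipe` is used only so the example runs without a real network. The sketch assumes the caller does not set its own read deadlines on the connection, because it clears the deadline after a cancellation so that the connection stays usable.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// readCtx performs one Read on conn and returns early if ctx is cancelled.
func readCtx(ctx context.Context, conn net.Conn, buf []byte) (int, error) {
	stop := make(chan struct{})
	watcherDone := make(chan struct{})
	go func() {
		defer close(watcherDone)
		select {
		case <-ctx.Done():
			// A deadline that has already passed wakes up the blocked Read.
			conn.SetReadDeadline(time.Now())
		case <-stop:
		}
	}()

	n, err := conn.Read(buf)
	close(stop)
	<-watcherDone // make sure the watcher cannot touch the deadline after we return

	if ctx.Err() != nil {
		// The zero time.Time clears the deadline, so later reads are not poisoned.
		conn.SetReadDeadline(time.Time{})
		if errors.Is(err, os.ErrDeadlineExceeded) {
			err = ctx.Err() // report the cancellation rather than the deadline trick
		}
	}
	return n, err
}

// demo cancels one read mid-flight and then reuses the same connection.
func demo() (msg string, cancelErr error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	buf := make([]byte, 16)
	_, cancelErr = readCtx(ctx, server, buf) // nobody writes, so this blocks until ctx expires

	go client.Write([]byte("ping"))
	n, _ := readCtx(context.Background(), server, buf)
	return string(buf[:n]), cancelErr
}

func main() {
	fmt.Println(demo())
}
```

Unlike closing the connection from the context, this leaves the connection open after a cancelled read, which avoids the "used a closed connection" class of bug at the cost of a helper goroutine per call.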
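
Going back to the Go note on closing each direction of a socket separately: the sketch below shows a half-close using `CloseWrite` on a `*net.TCPConn`. It assumes a loopback TCP listener is available; `sendAndHalfClose` and `roundTrip` are hypothetical names, and the tiny echo server exists only to make the example self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// sendAndHalfClose writes msg, closes only the write side so the peer sees
// EOF, and then keeps reading until the peer closes its side.
func sendAndHalfClose(conn *net.TCPConn, msg string) (string, error) {
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	// CloseWrite shuts down the sending direction; the read side stays open.
	if err := conn.CloseWrite(); err != nil {
		return "", err
	}
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

// roundTrip starts a one-shot loopback server that replies after it has read
// everything the client sent, then talks to it with sendAndHalfClose.
func roundTrip(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		data, _ := io.ReadAll(c) // returns once the client half-closes
		c.Write(append([]byte("echo: "), data...))
	}()

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()

	tcp, ok := c.(*net.TCPConn)
	if !ok {
		return "", fmt.Errorf("not a TCP connection: %T", c)
	}
	return sendAndHalfClose(tcp, msg)
}

func main() {
	fmt.Println(roundTrip("hello"))
}
```

The type assertion is the awkward part: code written against `net.Conn` has to know it is holding a TCP connection before it can half-close, which may be why this option is easy to miss.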