From 8be19405113095952e9e014ba03b1884e605ab2c Mon Sep 17 00:00:00 2001
From: Christine Dodrill
Date: Sat, 15 Feb 2020 16:58:55 -0500
Subject: [PATCH] blog: Why Rust, A Tale of Satori (#118)

---
 blog/why-rust-2020-02-15.markdown | 233 ++++++++++++++++++++++++++++++
 1 file changed, 233 insertions(+)
 create mode 100644 blog/why-rust-2020-02-15.markdown

diff --git a/blog/why-rust-2020-02-15.markdown b/blog/why-rust-2020-02-15.markdown
new file mode 100644
index 0000000..bc636a4
--- /dev/null
+++ b/blog/why-rust-2020-02-15.markdown
@@ -0,0 +1,233 @@
+---
+title: Why Rust
+date: 2020-02-15
+tags:
+ - rust
+ - rant
+ - satori
+ - golang
+---
+
+# Why Rust
+
+Or: A Trip Report from my Satori with Rust and Functional Programming
+
+Software is a very odd field to work in. It is simultaneously an abstract and
+physical one. You build systems that can deal with an unfathomable amount of
+input and output at the same time. As a job, I peer into the madness of an
+unthinking automaton and give order to the inherent chaos. I then emit
+incantations to describe what this unthinking automaton should do in my stead. I
+cannot possibly track the relations between a hundred thousand transactions
+going on in real time, much less file them appropriately so they can be summoned
+back should the need arise.
+
+However, this incantation (by necessity) is an _unthinkably_ precise and fickle
+beast. It's almost as if you are training a four-year-old to go to the store,
+but doing it by having them read a grocery list. This grocery list has to be
+precise enough that the four-year-old ends up getting what you want and not a
+cart full of frosted flakes and candy bars. But, at the same time, the
+four-year-old needs to understand it. Thus, the precision.
+
+There are many schools of thought around ways to write the grocery list. Some
+follow a radically simple approach, relying on the toddler to figure things out
+at the store.
+Sometimes this simpler approach doesn't work out in more obscure scenarios, like
+when they are out of red grapes but do have green grapes, but it tends to work
+out enough. Proponents of these list-making tools will also advocate for doing
+full tests of the grocery list before they send the toddler off to the store.
+This means setting up a fake grocery store with funny money, a fake card,
+plastic food, the whole nine yards. This can get expensive and can become a
+logistical issue (where are you going to store all that plastic fruit in a way
+that you can just set up and tear down the grocery store mock so quickly?).
+
+Another school of thought is that the process of writing the grocery list should
+be done in a way that prevents ambiguity at the grocery store. This kind of flow
+uses some more advanced concepts like the ability to describe something by its
+attributes. For example, this could specify the difference between fruit and
+vegetables, and only allow fruit to be put in one place of the cart and only
+allow vegetables to be placed in the other. And if the writer of the list tries
+to violate this, the list gets rejected and isn't used at all.
+
+There is yet another school of thought that decides that the exact spatial
+position of the toddler relative to everything else should be thought of in
+advance, along with a process to make sure that nothing is done in an improper
+way. This means writing the list can be a lot harder at first, but it's much
+less likely to result in the toddler coming back with a weird state. Consider
+what happens if two items show up at the same time and the toddler tries to grab
+both of them at the same time due to the instructions in the list! They only
+have one arm to grab things with, so it just doesn't work. Proponents of the
+more strict methods have reference cells and other mechanisms to ensure that the
+toddler can only ever grab one thing at a time.
+
+If we were to match these three ludicrous examples to programming languages, the
+first would be Lua, the second would be Go and the third would be something like
+Haskell or Rust. Software development is a complicated process because the
+problems involved with directing that unthinking automaton to do what you want
+are hard. There is a lot going on, much in the same way there is a lot going on
+when you send a toddler to do your grocery shopping for you.
+
+A good way to look at the tradeoffs involved is to see things as a balance
+between two forces, pragmatism and correctness. Languages that are more
+pragmatic are easier to develop in, but are mathematically more likely to run
+into problems at runtime. Languages that are more correct take more investment
+to write up front, but over time the correctness means that there are fewer
+failed assumptions about what is going on. The compiler stops you from doing
+things that don't make sense to it. This means that creating a bad state at
+runtime ranges from difficult to literally impossible.
+
+Tools like Lua and Go can be (and have been) used to develop stable and
+viable software. [itch.io][itchio] is written in Lua running on top of nginx
+and it handles financial transactions well enough that it's turned into the
+guy's full-time job. Google uses Go everywhere in their stack, and it's been
+used to create powerful tools like Kubernetes, Caddy, and Docker. These tools
+are trusted implicitly by a generation of developers, even though the language
+itself has its flaws. If you are reading this blog in Firefox, statistically
+there is Rust involved in the rendering and viewing of this post. Rust is built
+for ensuring that code is _as correct as possible_, even if it means eating into
+development time to ensure that.
+
+[itchio]: https://itch.io
+
+In Rust, you don't have to memorize rules about how and when it is safe to
+update data in structures, because the compiler ensures you _cannot mess it up
+by rejecting the code if you could be messing it up_. You don't have to run your
+tests with a race detector or figure out how to expose that in production to
+trace down that obscure double-write to a non-threadsafe hashmap, because in
+Rust there is no such thing as a non-threadsafe hashmap. There is only a safe
+hashmap and only can ever be a safe hashmap.
+
+As an absurd example, consider the following two snippets of code, one in Go and
+one in Rust. Both of them put integers into a standard library list and
+then print them all out:
+
+```go
+l := list.New() // () -> *list.List
+for i := 0; i < 5; i++ {
+    l.PushBack(i) // interface{} -> ()
+}
+
+for e := l.Front(); e != nil; e = e.Next() {
+    log.Printf("%T: %v", e.Value, e.Value)
+}
+```
+
+```rust
+let mut vec = Vec::<i64>::new(); // () -> Vec<i64>
+
+for i in 0..5 {
+    vec.push(i as i64); // (&mut Vec<i64>, i64) -> ()
+}
+
+for i in vec.iter() {
+    println!("{}", i);
+}
+```
+
+The Go version uses `interface{}` as the data element because Go [literally
+cannot describe types as parameters to functions][gonerics]. The Rust version
+took me a bit longer to write, but there is _no_ ambiguity as to what the vector
+holds. The Go version can also hold multiple types of data in the same list,
+à la:
+
+[gonerics]: https://golang.org/doc/faq#generics
+
+```go
+l := list.New()
+l.PushBack(42)
+l.PushBack("hotdogs")
+l.PushBack(420.69)
+```
+
+All of which is valid because in Go, an `interface{}` matches _every kind of
+value possible_. An integer is an `interface{}`. A floating-point number is an
+`interface{}`. A string is an `interface{}`. A bool is an `interface{}`. Any
+custom type you create is an `interface{}`. Normally, this would be very
+restrictive and make it difficult to do things like JSON parsing.
+However, the Go runtime lets you hack around this with [reflection][wtfisreflection].
+
+[wtfisreflection]: https://golangbot.com/reflection/
+
+This allows the standard library to handle things like JSON parsing with
+functions [that look like this](https://godoc.org/encoding/json#Unmarshal):
+
+```go
+func Unmarshal(data []byte, v interface{}) error
+```
+
+There's even a set of complicated rules you need to memorize about how to trick
+the JSON parser into massaging your data into place. This lets you do things
+like this:
+
+```go
+type Rilkef struct {
+    Foo        string `json:"foo"`
+    CallToArms string `json:"call_to_arms"`
+}
+```
+
+This allows the programmer a lot of flexibility while developing and compiling
+the code. It's very easy for the compiler to say "oh, hey, that could be
+anything, and you gave it some kind of anything, sounds legit to me", but then
+the job of ensuring the sanity of the inputs is shunted to _runtime_ rather than
+stopped before the code gets deployed. This means you need to test the code in
+order to see how it behaves, making sure that _the standard library is doing its
+job correctly_. This kind of stuff does not happen in Rust.
+
+The Rust version of this JSON example uses the [serde][serde] and
+[serde_json][serdejson] libraries:
+
+[serde]: https://serde.rs
+[serdejson]: https://serde.rs/json.html
+
+```rust
+use serde::{Deserialize, Serialize};
+
+#[derive(Serialize, Deserialize)]
+pub struct Rilkef {
+    pub foo: String,
+    pub call_to_arms: String,
+}
+```
+
+And the logic for applying the correct rules for serialization and
+deserialization is generated at _compile time_ by the compiler itself. Serde also
+allows you to support more than just JSON, so this same type can be reused for
+Dhall, YAML or whatever you could imagine.
+
+## tl;dr
+
+Rust allows for more correctness at the cost of developer efficiency. This is a
+tradeoff, but I think it may actually be worth it.
+Code that is more correct is more robust and less prone to failure than code
+that is less correct. This leads to software that is less likely to crash
+at 3 am and wake you up due to a preventable developer error.
+
+After working in Go for more than half a decade, I'm starting to think that it
+is probably a better idea to impact developer velocity and force developers to
+write software that is more correct. Go works if you are careful about how you
+handle it. It, however, amounts to a giant list of rules that you just have to
+know (like maps not being threadsafe) and a lot of those rules come from battle
+rather than from the development process.
+
+This came out as more of a rant than I had thought it would, but overall I hope
+my point isn't lost.
+
+### Things You Might Complain About
+
+Yes, I know slices exist in Go. I wanted to prove a point about how the overuse
+of `interface{}` in some relatively core things (like generic lists) can cause
+headaches in terms of correctness. Go will reject you trying to append a string
+to an integer slice, but you cannot create a type that functions identically to
+an integer slice.
+
+Go does have a race detector that will point out a lot of sins in concurrent
+programs, but that is again at _runtime_, not at _compile time_.
+
+---
+
+Many thanks to Tene, Sr. Oracle, A. Wilfox, Byte-slice, SiIvagunner and anyone
+who watched the stream where I wrote this blogpost. If I got things wrong in
+this, please [reach out to me](/contact) to let me know what I messed up. This
+is a composite of a few Twitter threads and a conversation I had on IRC.
+
+Thanks for reading, be well.