blog: Why Rust, A Tale of Satori (#118)
This commit is contained in:
parent
955e63fa76
commit
8be1940511
|
@ -0,0 +1,233 @@
|
|||
---
|
||||
title: Why Rust
|
||||
date: 2020-02-15
|
||||
tags:
|
||||
- rust
|
||||
- rant
|
||||
- satori
|
||||
- golang
|
||||
---
|
||||
|
||||
# Why Rust
|
||||
|
||||
Or: A Trip Report from my Satori with Rust and Functional Programming
|
||||
|
||||
Software is a very odd field to work in. It is simultaneously an abstract and
|
||||
physical one. You build systems that can deal with an unfathomable amount of
|
||||
input and output at the same time. As a job, I peer into the madness of an
|
||||
unthinking automaton and give order to the inherent chaos. I then emit
|
||||
incantations to describe what this unthinking automaton should do in my stead. I
|
||||
cannot possibly track the relations between a hundred thousand transactions
|
||||
going on in real time, much less file them appropriately so they can be summoned
|
||||
back should the need arise.
|
||||
|
||||
However, this incantation (by necessity) is an _unthinkably_ precise and fickle
|
||||
beast. It's almost as if you are training a four-year old to go to the store,
|
||||
but doing it by having them read a grocery list. This grocery list has to be
|
||||
precise enough that the four year old ends up getting what you want and not a
|
||||
cart full of frosted flakes and candy bars. But, at the same time, the four year
|
||||
old needs to understand it. Thus, the precision.
|
||||
|
||||
There's many schools of thought around ways to write the grocery list. Some
|
||||
follow a radically simple approach, relying on the toddler to figure things out
|
||||
at the store. Sometimes this simpler approach doesn't work out in more obscure
|
||||
scenarios, like when they are out of red grapes but do have green grapes, but it
|
||||
tends to work out enough. Proponents of these list-making tools also will
|
||||
advocate for doing full tests of the grocery list before they send the toddler
|
||||
off to the store. This means setting up a fake grocery store with funny money, a
|
||||
fake card, plastic food, the whole nine yards. This can get expensive and can
|
||||
become a logistical issue (where are you going to store all that plastic fruit
|
||||
in a way that you can just set up and tear down the grocery store mock so
|
||||
quickly?).
|
||||
|
||||
Another school of thought is that the process of writing the grocery list should
|
||||
be done in a way that prevents ambiguity at the grocery store. This kind of flow
|
||||
uses some more advanced concepts like the ability to describe something by its
|
||||
attributes. For example, this could specify the difference between fruit and
|
||||
vegetables, and only allow fruit to be put in one place of the cart and only
|
||||
allow vegetables to be placed in the other. And if the writer of the list tries
|
||||
to violate this, the list gets rejected and isn't used at all.
|
||||
|
||||
There is yet another school of thought that decides that the exact spatial
|
||||
position of the toddler relative to everything else should be thought of in
|
||||
advance, along with a process to make sure that nothing is done in an improper
|
||||
way. This means writing the list can be a lot harder at first, but it's much
|
||||
less likely to result in the toddler coming back with a weird state. Consider
|
||||
what happens if two items show up at the same time and the toddler tries to grab
|
||||
both of them at the same time due to the instructions in the list! They only
|
||||
have one arm to grab things with, so it just doesn't work. Proponents of the
|
||||
more strict methods have reference cells and other mechanisms to ensure that the
|
||||
toddler can only ever grab one thing at a time.
|
||||
|
||||
If we were to match these three ludicrous examples to programming languages, the
|
||||
first would be Lua, the second would be Go and the third would be something like
|
||||
Haskell or Rust. Software development is a complicated process because the
|
||||
problems involved with directing that unthinking automaton to do what you want
|
||||
are hard. There is a lot going on, much in the same way there is a lot going on
|
||||
when you send a toddler to do your grocery shopping for you.
|
||||
|
||||
A good way to look at the tradeoffs involved is to see things as a balance
|
||||
between two forces, pragmatism and correctness. Languages that are more
|
||||
pragmatic are easier to develop in, but are mathematically more likely to run
|
||||
into problems at runtime. Languages that are more correct take more investment
|
||||
to write up front, but over time the correctness means that there's fewer failed
|
||||
assumptions about what is going on. The compiler stops you from doing things
|
||||
that don't make sense to it. This means that it's difficult to literally
|
||||
impossible to create a bad state at runtime.
|
||||
|
||||
Tools like Lua and Go can (and have) been used to develop stable and viable
|
||||
software. [itch.io][itchio] is written in Lua running on top of nginx and it
|
||||
handles financial transactions well enough that it's turned into the guy's full
|
||||
time job. Google uses Go everywhere in their stack, and it's been used to create
|
||||
powerful tools like Kubernetes, Caddy, and Docker. These tools are trusted
|
||||
implicitly by a generation of developers, even though the language itself has
|
||||
its flaws. If you are reading this blog in Firefox, statistically there is Rust
|
||||
involved in the rendering and viewing of this post. Rust is built for ensuring
|
||||
that code is _as correct as possible_, even if it means eating into development
|
||||
time to ensure that.
|
||||
|
||||
[itchio]: https://itch.io
|
||||
|
||||
In Rust, you don't have to memorize rules about how and when it is safe to
|
||||
update data in structures, because the compiler ensures you _cannot mess it up
|
||||
by rejecting the code if you could be messing it up_. You don't have to run your
|
||||
tests with a race detector or figure out how to expose that in production to
|
||||
trace down that obscure double-write to a non-threadsafe hashmap, because in
|
||||
Rust there is no such thing as a non-threadsafe hashmap. There is only a safe
|
||||
hashmap and only can ever be a safe hashmap.
|
||||
|
||||
As an absurd example, consider the following two snippets of code, one in Go and
|
||||
one in Rust, both of them will put integers into a standard library list and
|
||||
then print them all out:
|
||||
|
||||
```go
|
||||
l := list.New() // () -> *list.List
|
||||
for i := 0; i < 5; i++ {
|
||||
l.PushBack(i) // interface{} -> ()
|
||||
}
|
||||
|
||||
for e := l.Front(); e != nil; e = e.Next() {
|
||||
log.Printf("%T: %v", e.Value, e.Value)
|
||||
}
|
||||
```
|
||||
|
||||
```rust
|
||||
let mut vec = Vec::new::<i64>(); // () -> Vec<i64>
|
||||
|
||||
for i in 0..5 {
|
||||
vec.push(i as i64); // (mut Vec<i64>, i64) -> ()
|
||||
}
|
||||
|
||||
for i in vec.iter() {
|
||||
println!("{}", i);
|
||||
}
|
||||
```
|
||||
|
||||
The Go version uses `interface{}` as the data element because Go [literally
|
||||
cannot describe types as parameters to functions][gonerics]. The Rust version
|
||||
took me a bit longer to write, but there is _no_ ambiguity as to what the vector
|
||||
holds. The Go version can also hold multiple types of data in the same list,
|
||||
a-la:
|
||||
|
||||
[gonerics]: https://golang.org/doc/faq#generics
|
||||
|
||||
```go
|
||||
l := list.New()
|
||||
l.PushBack(42)
|
||||
l.PushBack("hotdogs")
|
||||
l.PushBack(420.69)
|
||||
```
|
||||
|
||||
All of which is valid because in Go, an `interface{}` matches _every kind of
|
||||
value possible_. An integer is an `interface{}`. A floating-point number is an
|
||||
`interface{}`. A string is an `interface{}`. A bool is an `interface{}`. Any
|
||||
custom type you create is an `interface{}`. Normally, this would be very
|
||||
restrictive and make it difficult to do things like JSON parsing. However the Go
|
||||
runtime lets you hack around this with [reflection][wtfisreflection].
|
||||
|
||||
[wtfisreflection]: https://golangbot.com/reflection/
|
||||
|
||||
This allows the standard library to handle things like JSON parsing with
|
||||
functions [that look like this](https://godoc.org/encoding/json#Unmarshal):
|
||||
|
||||
```
|
||||
func Unmarshal(data []byte, v interface{}) error
|
||||
```
|
||||
|
||||
There's even a set of complicated rules you need to memorize about how to trick
|
||||
the JSON parser into massaging your data into place. This lets you do things
|
||||
like this:
|
||||
|
||||
```go
|
||||
type Rilkef struct {
|
||||
Foo string `json:"foo"`
|
||||
CallToArms string `json:"call_to_arms"`
|
||||
}
|
||||
```
|
||||
|
||||
This allows the programmer a lot of flexibility while developing and compiling
|
||||
the code. It's very easy for the compiler to say "oh, hey, that could be
|
||||
anything, and you gave it some kind of anything, sounds legit to me", but then
|
||||
the job of ensuring the sanity of the inputs is shunted to _runtime_ rather than
|
||||
stopped before the code gets deployed. This means you need to test the code in
|
||||
order to see how it behaves, making sure that _the standard library is doing its
|
||||
job correctly_. This kind of stuff does not happen in Rust.
|
||||
|
||||
The Rust version of this JSON example uses the [serde][serde] and
|
||||
[serde_json][serdejson] libraries:
|
||||
|
||||
[serde]: https://serde.rs
|
||||
[serdejson]: https://serde.rs/json.html
|
||||
|
||||
```rust
|
||||
use serde::*;
|
||||
|
||||
#[derive(Serialize, Deserialize)]
|
||||
pub struct Rilkef {
|
||||
pub foo: String,
|
||||
pub call_to_arms: String,
|
||||
}
|
||||
```
|
||||
|
||||
And the logic for handling the correct rules for serialization and
|
||||
deserialization is handled at _compile time_ by the compiler itself. Serde also
|
||||
allows you to support more than just JSON, so this same type can be reused for
|
||||
Dhall, YAML or whatever you could imagine.
|
||||
|
||||
## tl;dr
|
||||
|
||||
Rust allows for more correctness at the cost of developer efficiency. This is a
|
||||
tradeoff, but I think it may actually be worth it. Code that is more correct is
|
||||
more robust and less prone to failure than code that is less correct. This leads
|
||||
to software that is less likely to crash at 3 am and wake you up due to a
|
||||
preventable developer error.
|
||||
|
||||
After working in Go for more than half a decade, I'm starting to think that it
|
||||
is probably a better idea to impact developer velocity and force them to write
|
||||
software that is more correct. Go works if you are careful about how you handle
|
||||
it. It however amounts to a giant list of rules that you just have to know (like
|
||||
maps not being threadsafe) and a lot of those rules come from battle rather than
|
||||
from the development process.
|
||||
|
||||
This came out as more of a rant than I had thought it would, but overall I hope
|
||||
my point isn't lost.
|
||||
|
||||
### Things You Might Complain About
|
||||
|
||||
Yes, I know slices exist in Go. I wanted to prove a point about how the overuse
|
||||
of `interface{}` in some relatively core things (like generic lists) can cause
|
||||
headaches in term of correctness. Go will reject you trying to append a string
|
||||
to an integer slice, but you cannot create a type that functions identically to
|
||||
an integer slice.
|
||||
|
||||
Go does have a race detector that will point out a lot of sins in concurrent
|
||||
programs, but that is again at _runtime_, not at _compile time_.
|
||||
|
||||
---
|
||||
|
||||
Many thanks to Tene, Sr. Oracle, A. Wilfox, Byte-slice, SiIvagunner and anyone
|
||||
who watched the stream where I wrote this blogpost. If I got things wrong in
|
||||
this, please [reach out to me](/contact) to let me know what I messed up. This
|
||||
is a composite of a few twitter threads and a conversation I had on IRC.
|
||||
|
||||
Thanks for reading, be well.
|
Loading…
Reference in New Issue