---
title: Why Rust
date: 2020-02-15
tags:
 - rust
 - rant
 - satori
 - golang
---

# Why Rust

Or: A Trip Report from my Satori with Rust and Functional Programming

Software is a very odd field to work in. It is simultaneously an abstract and
physical one. You build systems that can deal with an unfathomable amount of
input and output at the same time. As a job, I peer into the madness of an
unthinking automaton and give order to the inherent chaos. I then emit
incantations to describe what this unthinking automaton should do in my stead. I
cannot possibly track the relations between a hundred thousand transactions
going on in real time, much less file them appropriately so they can be summoned
back should the need arise.

However, this incantation (by necessity) is an _unthinkably_ precise and fickle
beast. It's almost as if you are training a four-year-old to go to the store,
but doing it by having them read a grocery list. This grocery list has to be
precise enough that the four-year-old ends up getting what you want and not a
cart full of frosted flakes and candy bars. But, at the same time, the
four-year-old needs to understand it. Thus, the precision.

There are many schools of thought around ways to write the grocery list. Some
follow a radically simple approach, relying on the toddler to figure things out
at the store. Sometimes this simpler approach doesn't work out in more obscure
scenarios, like when they are out of red grapes but do have green grapes, but it
tends to work out often enough. Proponents of these list-making tools will also
advocate for doing full tests of the grocery list before they send the toddler
off to the store. This means setting up a fake grocery store with funny money, a
fake card, plastic food, the whole nine yards. This can get expensive and can
become a logistical issue (where are you going to store all that plastic fruit
in a way that lets you set up and tear down the grocery store mock so
quickly?).

Another school of thought is that the process of writing the grocery list should
be done in a way that prevents ambiguity at the grocery store. This kind of flow
uses some more advanced concepts like the ability to describe something by its
attributes. For example, this could specify the difference between fruit and
vegetables, and only allow fruit to be put in one place of the cart and only
allow vegetables to be placed in the other. And if the writer of the list tries
to violate this, the list gets rejected and isn't used at all.

There is yet another school of thought that decides that the exact spatial
position of the toddler relative to everything else should be thought of in
advance, along with a process to make sure that nothing is done in an improper
way. This means writing the list can be a lot harder at first, but it's much
less likely to result in the toddler coming back in a weird state. Consider
what happens if two items show up at the same time and the toddler tries to grab
both of them at once due to the instructions in the list! They only have one arm
to grab things with, so it just doesn't work. Proponents of the stricter methods
have reference cells and other mechanisms to ensure that the toddler can only
ever grab one thing at a time.

If we were to match these three ludicrous examples to programming languages, the
first would be Lua, the second would be Go and the third would be something like
Haskell or Rust. Software development is a complicated process because the
problems involved with directing that unthinking automaton to do what you want
are hard. There is a lot going on, much in the same way there is a lot going on
when you send a toddler to do your grocery shopping for you.

A good way to look at the tradeoffs involved is to see things as a balance
between two forces: pragmatism and correctness. Languages that are more
pragmatic are easier to develop in, but are mathematically more likely to run
into problems at runtime. Languages that are more correct take more investment
to write up front, but over time the correctness means that there are fewer
failed assumptions about what is going on. The compiler stops you from doing
things that don't make sense to it. This means that creating a bad state at
runtime ranges from difficult to literally impossible.

Tools like Lua and Go can be (and have been) used to develop stable and viable
software. [itch.io][itchio] is written in Lua running on top of nginx and it
handles financial transactions well enough that it's turned into the guy's
full-time job. Google uses Go everywhere in their stack, and it's been used to
create powerful tools like Kubernetes, Caddy, and Docker. These tools are trusted
implicitly by a generation of developers, even though the language itself has
its flaws. If you are reading this blog in Firefox, statistically there is Rust
involved in the rendering and viewing of this post. Rust is built for ensuring
that code is _as correct as possible_, even if it means eating into development
time to ensure that.

[itchio]: https://itch.io

In Rust, you don't have to memorize rules about how and when it is safe to
update data in structures, because the compiler ensures you _cannot mess it up
by rejecting the code if you could be messing it up_. You don't have to run your
tests with a race detector or figure out how to expose that in production to
trace down that obscure double-write to a non-threadsafe hashmap, because in
safe Rust there is no way to share a hashmap between threads unsafely. There is
only a safe hashmap, and there can only ever be a safe hashmap.

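A minimal sketch of what that looks like in practice, using only the standard
library. The helper name `count_in_parallel` and the `"total"` key are
hypothetical, made up for this example. Note that `HashMap` itself is not
internally synchronized; what the compiler enforces is that several threads can
only mutate it when it sits behind something like `Arc<Mutex<...>>`:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawn `workers` threads that each bump a shared
/// counter in a map `per_worker` times, then return the final count.
fn count_in_parallel(workers: usize, per_worker: u64) -> u64 {
    // The map is shared through Arc (shared ownership) and Mutex (exclusive
    // access). Without the Mutex, the mutation below would not compile.
    let hits: Arc<Mutex<HashMap<&'static str, u64>>> =
        Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let hits = Arc::clone(&hits);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // Lock, update, and release at the end of the iteration.
                    let mut map = hits.lock().unwrap();
                    *map.entry("total").or_insert(0) += 1;
                }
            })
        })
        .collect();

    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = hits.lock().unwrap().get("total").copied().unwrap_or(0);
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000)); // prints 4000
}
```

Taking the `Mutex` away and mutating the map directly from the spawned threads
turns into a compile error, not a data race you find later.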

As an absurd example, consider the following two snippets of code, one in Go and
one in Rust. Both of them put integers into a standard library list and then
print them all out:

```go
package main

import (
	"container/list"
	"log"
)

func main() {
	l := list.New() // () -> *list.List
	for i := 0; i < 5; i++ {
		l.PushBack(i) // interface{} -> *list.Element
	}

	for e := l.Front(); e != nil; e = e.Next() {
		log.Printf("%T: %v", e.Value, e.Value)
	}
}
```

```rust
fn main() {
    let mut vec = Vec::<i64>::new(); // () -> Vec<i64>

    for i in 0..5 {
        vec.push(i); // (&mut Vec<i64>, i64) -> ()
    }

    for i in vec.iter() {
        println!("{}", i);
    }
}
```

The Go version uses `interface{}` as the data element because Go, as of this
writing, [literally cannot describe types as parameters to functions][gonerics].
The Rust version took me a bit longer to write, but there is _no_ ambiguity as
to what the vector holds. The Go version can also hold multiple types of data in
the same list, like this:

[gonerics]: https://golang.org/doc/faq#generics

```go
l := list.New()
l.PushBack(42)
l.PushBack("hotdogs")
l.PushBack(420.69)
```

All of which is valid because in Go, an `interface{}` matches _every kind of
value possible_. An integer is an `interface{}`. A floating-point number is an
`interface{}`. A string is an `interface{}`. A bool is an `interface{}`. Any
custom type you create is an `interface{}`. Normally, this would be very
restrictive and make it difficult to do things like JSON parsing. However, the
Go runtime lets you hack around this with [reflection][wtfisreflection].

[wtfisreflection]: https://golangbot.com/reflection/

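For contrast, a Rust `Vec<i64>` will only ever hold `i64` values. If a list
genuinely needs to hold mixed values, one common approach is to spell out the
allowed kinds in an enum. This is a minimal sketch of that idea, where `Item`,
`describe` and `mixed_items` are hypothetical names made up for this example:

```rust
/// Hypothetical type: the only kinds of value this list is allowed to hold.
#[derive(Debug, PartialEq)]
enum Item {
    Int(i64),
    Text(String),
    Float(f64),
}

/// Describe an item. The match has to cover every variant, or the compiler
/// rejects it.
fn describe(item: &Item) -> String {
    match item {
        Item::Int(n) => format!("int: {}", n),
        Item::Text(s) => format!("text: {}", s),
        Item::Float(x) => format!("float: {}", x),
    }
}

/// The same three values as the Go list, but with their kinds written down.
fn mixed_items() -> Vec<Item> {
    vec![
        Item::Int(42),
        Item::Text("hotdogs".to_string()),
        Item::Float(420.69),
    ]
}

fn main() {
    // let numbers: Vec<i64> = vec![42, "hotdogs"]; // rejected at compile time
    for item in mixed_items() {
        println!("{}", describe(&item));
    }
}
```

The list can still hold three different kinds of thing, but the set of kinds is
closed and known to the compiler, so nothing has to be inspected by reflection
at runtime.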

Reflection allows the standard library to handle things like JSON parsing with
functions [that look like this](https://godoc.org/encoding/json#Unmarshal):

```go
func Unmarshal(data []byte, v interface{}) error
```

There's even a set of complicated rules you need to memorize about how to trick
the JSON parser into massaging your data into place. This lets you do things
like this, where struct tags map the JSON keys onto fields:

```go
type Rilkef struct {
	Foo        string `json:"foo"`
	CallToArms string `json:"call_to_arms"`
}
```

This allows the programmer a lot of flexibility while developing and compiling
the code. It's very easy for the compiler to say "oh, hey, that could be
anything, and you gave it some kind of anything, sounds legit to me", but then
the job of ensuring the sanity of the inputs is shunted to _runtime_ rather than
stopped before the code gets deployed. This means you need to test the code in
order to see how it behaves, making sure that _the standard library is doing its
job correctly_. This kind of stuff does not happen in Rust.

The Rust version of this JSON example uses the [serde][serde] and
[serde_json][serdejson] libraries:

[serde]: https://serde.rs
[serdejson]: https://serde.rs/json.html

```rust
// Needs the serde crate with its "derive" feature enabled.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Rilkef {
    pub foo: String,
    pub call_to_arms: String,
}
```

And the logic for handling the correct rules for serialization and
deserialization is generated at _compile time_ by serde's derive macros and
checked by the compiler itself. Serde also allows you to support more than just
JSON, so this same type can be reused for Dhall, YAML or whatever you could
imagine.

## tl;dr

Rust allows for more correctness at the cost of developer efficiency. This is a
tradeoff, but I think it may actually be worth it. Code that is more correct is
more robust and less prone to failure than code that is less correct. This leads
to software that is less likely to crash at 3 am and wake you up due to a
preventable developer error.

After working in Go for more than half a decade, I'm starting to think that it
is probably a better idea to give up some developer velocity and force
developers to write software that is more correct. Go works if you are careful
about how you handle it. However, it amounts to a giant list of rules that you
just have to know (like maps not being threadsafe), and a lot of those rules are
learned in battle rather than during the development process.

This came out as more of a rant than I had thought it would, but overall I hope
my point isn't lost.

### Things You Might Complain About

Yes, I know slices exist in Go. I wanted to prove a point about how the overuse
of `interface{}` in some relatively core things (like generic lists) can cause
headaches in terms of correctness. Go will reject you trying to append a string
to an integer slice, but you cannot create a type that functions identically to
an integer slice.

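By comparison, here is a sketch of a user-defined container in Rust that is
generic over its element type. `TypedList` and `sum_first` are hypothetical
names for illustration, not anything from the standard library:

```rust
/// Hypothetical wrapper around Vec<T>; the element type is a parameter.
struct TypedList<T> {
    items: Vec<T>,
}

impl<T> TypedList<T> {
    fn new() -> Self {
        TypedList { items: Vec::new() }
    }

    fn push(&mut self, item: T) {
        self.items.push(item);
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    fn iter(&self) -> std::slice::Iter<'_, T> {
        self.items.iter()
    }
}

/// Push 0..n into a TypedList<i64> and add the elements back up.
fn sum_first(n: i64) -> i64 {
    let mut list: TypedList<i64> = TypedList::new();
    for i in 0..n {
        list.push(i);
    }
    // list.push("hotdogs"); // would not compile: expected `i64`, found `&str`
    list.iter().sum()
}

fn main() {
    // The same type works for strings without any interface{}-style escape hatch.
    let mut names: TypedList<String> = TypedList::new();
    names.push("hotdogs".to_string());
    println!("{} name(s), sum = {}", names.len(), sum_first(5));
}
```

The wrapper is checked exactly as strictly as the built-in `Vec`, which is the
part a hand-rolled list type in Go had to give up at the time of writing.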

Go does have a race detector that will point out a lot of sins in concurrent
programs, but that is again at _runtime_, not at _compile time_.

---

Many thanks to Tene, Sr. Oracle, A. Wilfox, Byte-slice, SiIvagunner and anyone
who watched the stream where I wrote this blog post. If I got things wrong in
this, please [reach out to me](/contact) to let me know what I messed up. This
is a composite of a few Twitter threads and a conversation I had on IRC.

Thanks for reading, be well.