diff --git a/blog/gonads-2022-04-24.markdown b/blog/gonads-2022-04-24.markdown new file mode 100644 index 0000000..b60a266 --- /dev/null +++ b/blog/gonads-2022-04-24.markdown @@ -0,0 +1,550 @@ +--- +title: Crimes with Go Generics +date: 2022-04-24 +tags: + - cursed + - golang + - generics +vod: + twitch: https://www.twitch.tv/videos/1465727432 + youtube: https://youtu.be/UiJtaKYQnzg +--- + +Go 1.18 added [generics](https://go.dev/doc/tutorial/generics) to the +language. This allows you to have your types take types as parameters +so that you can create composite types (types out of types). This lets +you get a lot of expressivity and clarity about how you use Go. + +However, if you are looking for good ideas on how to use Go generics, +this is not the post for you. This is full of bad ideas. This post is +full of ways that you should not use Go generics in production. Do not +copy the examples in this post into production. By reading this post +you agree to not copy the examples in this post into production. + +I have put my code for this article [on my git +server](https://tulpa.dev/internal/gonads). This repo has been +intentionally designed to be difficult to use in production by me +taking the following steps: + +1. I have created it under a Gitea organization named `internal`. This + will make it impossible for you to import the package unless you + are using it from a repo on my Gitea server. Signups are disabled + on that Gitea server. See + [here](https://go.dev/doc/go1.4#internalpackages) for more + information about the internal package rule. +1. The package documentation contains a magic comment that will make + Staticcheck and other linters complain that you are using this + package even though it is deprecated. + +What is that package +name? + +It's a reference to +Haskell's monads, but adapted to Go as a pun. + +A gonad is just a gonoid in the +category of endgofunctors. What's there to be confused +about? + +\*sigh\* + +## `Queue[T]` + +To start things out, let's show off a problem in computer science that +is normally difficult. Let's make a MPMS (multiple producer, multiple +subscriber) queue. + +First we are going to need a struct to wrap everything around. It will +look like this: + +```go +type Queue[T any] struct { + data chan T +} +``` + +This creates a type named `Queue` that takes a type argument `T`. This +`T` can be absolutely anything, but the only requirement is that the +data is a Go type. + +You can create a little constructor for `Queue` instances with a +function like this: + +```go +func NewQueue[T any](size int) Queue[T] { + return Queue[T]{ + data: make(chan T, size), + } +} +``` + +Now let's make some methods on the `Queue` struct that will let us +push to the queue and pop from the queue. They could look like this: + +```go +func (q Queue[T]) Push(val T) { + q.data <- val +} + +func (q Queue[T]) Pop() T { + return <-q.data +} +``` + +These methods will let you put data at the end of the queue and then +pull it out from the beginning. You can use them like this: + +```go +q := NewQueue[string](5) +q.Push("hi there") +str := q.Pop() +if str != "hi there" { + panic("string is wrong") +} +``` + +This is good, but the main problem comes from trying to pop from an +empty queue. It'll stay there forever doing nothing. We can use the +`select` statement to allow us to write a nonblocking version of the +`Pop` function: + +```go +func (q Queue[T]) TryPop() (T, bool) { + select { + case val := <-q.data: + return val, true + default: + return nil, false + } +} +``` + +However when we try to compile this, we get an error: + +``` +cannot use nil as T value in return statement +``` + +In that code, `T` can be _anything_, including values that may not be +able to be `nil`. We can work around this by taking advantage of the +`var` statement, which makes a new variable and initializes it to the +zero value of that type: + +```go +func Zero[T any]() T { + var zero T + return zero +} +``` + +When we run the `Zero` function like +[this](https://go.dev/play/p/Z5tRs1-aKBU): + +```go +log.Printf("%q", Zero[string]()) +log.Printf("%v", Zero[int]()) +``` + +We get output that looks like this: + +``` +2009/11/10 23:00:00 "" +2009/11/10 23:00:00 0 +``` + +So we can adapt the `default` branch of `TryPop` to this: + +```go +func (q Queue[T]) TryPop() (T, bool) { + select { + case val := <-q.data: + return val, true + default: + var zero T + return zero, false + } +} +``` + +And finally write a test for good measure: + +```go +func TestQueue(t *testing.T) { + q := NewQueue[int](5) + for i := range make([]struct{}, 5) { + q.Push(i) + } + + for range make([]struct{}, 5) { + t.Log(q.Pop()) + } +} +``` + +## `Option[T]` + +In Go, people use pointer values for a number of reasons: + +1. A pointer value may be `nil`, so this can signal that the value may + not exist. +1. A pointer value only stores the offset in memory, so passing around + the value causes Go to only copy the pointer instead of copying the + value being passed around. +1. A pointer value being passed to a function lets you mutate values + in the value being passed. Otherwise Go will copy the value and you + can mutate it all you want, but the changes you made will not + persist past that function call. You can sort of consider this to + be "immutable", but it's not as strict as something like passing + `&mut T` to functions in Rust. + +This `Option[T]` type will help us model the first kind of constraint: +a value that may not exist. We can define it like this: + +```go +type Option[T any] struct { + val *T +} +``` + +Then you can define a couple methods to use this container: + +```go +var ErrOptionIsNone = errors.New("gonads: Option[T] has no value") + +func (o Option[T]) Take() (T, error) { + if o.IsNone() { + var zero T + return zero, ErrOptionIsNone + } + + return *o.val, nil +} + +func (o *Option[T]) Set(val T) { + o.val = &val +} + +func (o *Option[T]) Clear() { + o.val = nil +} +``` + +Some other functions that will be useful will be an `IsSome` function +to tell if the `Option` contains a value. We can use this to also +implement an `IsNone` function that will let you tell if that `Option` +_does not_ contain a value. They will look like this: + +```go +func (o Option[T]) IsSome() bool { + return o.val != nil +} + +func (o Option[T]) IsNone() bool { + return !o.IsSome() +} +``` + +We can say that if an Option does not have something in it, it has +nothing in it. This will let us use `IsSome` to implement `IsNone`. + +Finally we can add all this up to a `Yank` function, which is similar +to +[`Option::unwrap()`](https://doc.rust-lang.org/rust-by-example/error/option_unwrap.html) +in Rust: + +```go +func (o Option[T]) Yank() T { + if o.IsNone() { + panic("gonads: Yank on None Option") + } + + return *o.val +} +``` + +This will all be verified in a Go test: + +```go +func TestOption(t *testing.T) { + o := NewOption[string]() + val, err := o.Take() + if err == nil { + t.Fatalf("[unexpected] wanted no value out of Option[T], got: %v", val) + } + + o.Set("hello friendos") + _, err = o.Take() + if err != nil { + t.Fatalf("[unexpected] wanted no value out of Option[T], got: %v", err) + } + + o.Clear() + if o.IsSome() { + t.Fatal("Option should have none, but has some") + } +} +``` + +I think that +Option[T] will be the most useful outside of this post. +It will need some work and generalization, but this may be something +that the Go team will have to make instead of some random +person. + +## `Thunk[T]` + +In computer science we usually deal with values and computations. +Usually we deal with one or the other. Sometimes computations can be +treated as values, but this is very rare. It's even more rare to take +a partially completed computation and use it as a value. + +A thunk is a partially evaluated computation that is stored as a +value. For an idea of what I'm talking about, let's consider this +JavaScript function: + +```javascript +const add = (x, y) => x + y; +console.log(add(2, 2)); // 4 +``` + +This creates a function called `add` that takes two arguments and +returns one argument. This is great in many cases, but it makes it +difficult for us to bind only one argument to the function and leave +the other as a variable input. What if computing the left hand side of +`add` is expensive and only needed once? + +Instead we can write `add` like this: + +```javascript +const add = (x) => (y) => x + y; +console.log(add(2)(2)); // 4 +``` + +This also allows us to make partially evaluated forms of `add` like +`addTwo`: + +```javascript +const addTwo = add(2); +console.log(addTwo(3)); // 5 +``` + +This can also be used with functions that do not take arguments, so +you can pass around a value that isn't computed yet and then only +actually compute it when needed: + +```javascript +const hypotenuse = (x, y) => Math.sqrt(x * x + y * y); +const thunk = () => hypot(3, 4); +``` + +You can then pass this thunk to functions _without having to evaluate +it_ until it is needed: + +```javascript +dominateWorld(thunk); // thunk is passed as an unevaluated function +``` + +We can implement this in Go by using a type like the following: + +```go +type Thunk[T any] struct { + doer func() T +} +``` + +And then force the thunk to evaluate with a function such as `Force`: + +```go +func (t Thunk[T]) Force() T { + return t.doer() +} +``` + +This works, however we can also go one step further than we did with +the JavaScript example. We can take advantage of the `Thunk[T]` +container to cache the result of the `doer` function so that calling +it multiple times will only actually it once and return the same +result. + +Keep in mind that this will +only work for _pure functions_, or functions that don't modify the +outside world. This isn't just global variables either, but any +function that modifies any state anywhere, including network and +filesystem IO. + +This would make `Thunk[T]` be implemented like this: + +```go +type Thunk[T any] struct { + doer func() T // action being thunked + o *Option[T] // cache for complete thunk data +} + +func (t *Thunk[T]) Force() T { + if t.o.IsSome() { + return t.o.Yank() + } + + t.o.Set(t.doer()) + return t.o.Yank() +} + +func NewThunk[T any](doer func() T) *Thunk[T] { + return &Thunk[T]{ + doer: doer, + o: NewOption[T](), + } +} +``` + +- [ ] Fibonacci example + - [ ] Why is this slow? I have no idea + - [ ] Numa\ this is the power of gonads! + +Now, for an overcomplicated example you can use this to implement the +Fibonacci function. We can start out by writing a naiive Fibonacci +function like this: + +```go +func Fib(n int) int { + if n <= 1 { + return n + } + + return Fib(n-1) + Fib(n-2) +} +``` + +We can turn this into a Go test in order to see how long it takes for +it to work: + +```go +func TestRecurFib(t *testing.T) { + t.Log(Fib(40)) +} +``` + +Then when we run `go test`: + +```console +$ go test -run RecurFib +=== RUN TestRecurFib + thunk_test.go:15: 102334155 +--- PASS: TestRecurFib (0.36s) +``` + +However, we can make this a lot more complicated with the power of the +`Thunk[T]` type: + +```go +func TestThunkFib(t *testing.T) { + cache := make([]*Thunk[int], 41) + + var fib func(int) int + fib = func(n int) int { + if cache[n].o.IsSome() { + return *cache[n].o.val + } + return fib(n-1) + fib(n-2) + } + + for i := range cache { + i := i + cache[i] = NewThunk(func() int { return fib(i) }) + } + cache[0].o.Set(0) + cache[1].o.Set(1) + + t.Log(cache[40].Force()) +} +``` + +And then run the test: + +``` +=== RUN TestThunkFib + thunk_test.go:36: 102334155 +--- PASS: TestThunkFib (0.60s) +``` + +Why is this so much slower? This +should be caching the intermediate values. Maybe something like this +would be faster? This should complete near instantly, +right? + +```go +func TestMemoizedFib(t *testing.T) { + mem := map[int]int{ + 0: 0, + 1: 1, + } + + var fib func(int) int + fib = func(n int) int { + if result, ok := mem[n]; ok { + return result + } + + result := fib(n-1) + fib(n-2) + mem[n] = result + return result + } + + t.Log(fib(40)) +} +``` + +```console +$ go test -run Memoized +=== RUN TestMemoizedFib + thunk_test.go:35: 102334155 +--- PASS: TestMemoizedFib (0.00s) +``` + +I'm not sure +either. + +If you change the `fib` function to this, it works, but it also steps +around the `Thunk[T]` type: + +```go +fib = func(n int) int { + if cache[n].o.IsSome() { + return *cache[n].o.val + } + + result := fib(n-1) + fib(n-2) + cache[n].o.Set(result) + return result +} +``` + +This completes instantly: + +``` +=== RUN TestThunkFib + thunk_test.go:59: 102334155 +--- PASS: TestThunkFib (0.00s) +``` + +To be clear, this isn't the fault of Go generics. I'm almost certain +that my terrible code is causing this to be much slower. + +This is the power of gonads: +making easy code complicated, harder to reason about and slower than +the naiive approach! Why see this as terrible code when it creates an +amazing opportunity for cloud providers to suggest that people use +gonads' `Thunk[T]` so that they use more CPU and then have to pay cloud +providers more money for CPU! Think about the children! + +--- + +I'm glad that Go has added generics to the language. It's certainly +going to make a lot of things a lot easier and more expressive. I'm +worried that the process of learning how to use generics in Go is +going to create a lot of churn and toil as people get up to speed on +when and where they should be used. These should be used in specific +cases, not as a bread and butter tool. + +I hope this was an interesting look into how you can use generics in +Go, but again please don't use these examples in production.