diff --git a/blog/olin-1-why-09-1-2018.markdown b/blog/olin-1-why-09-1-2018.markdown new file mode 100644 index 0000000..8e5a601 --- /dev/null +++ b/blog/olin-1-why-09-1-2018.markdown @@ -0,0 +1,267 @@ +--- +title: "Olin: 1: Why" +date: 2018-09-01 +--- + +# [Olin][olin]: 1: Why + +[Olin][olin] is an attempt at defining a radically new primitive to make it +easier to reason about, deploy and operate event-driven services that are +independent of the OS or CPU of the computer they are running on. It will have +components that take care of the message queue offsetting, retry logic, +parallelism and most other concerns except for your application's state layer. + +Olin is designed to work top on two basic concepts: types and handlers. Types +are some bit of statically defined data that has a meaning to humans. An example +type could be the following: + +``` +package example; + +message UserLoginEvent { + string user_id = 1; + string user_ip_address = 2; + string device = 3; + int64 timestamp_utc_unix = 4; +} +``` + +and when matching data is written to the queue for the event type `example.UserLoginEvent`, +all of the handlers registered to that data type will run with serialized protocol +buffer bytes as its standard input. If the handlers return a nonzero exit status, +they are retried up to three times, exponentially backing off. +Handlers need to deal with the fact they can be run out of order. + +## Background + +Very frequently, I end up needing to write applications that basically end up +waiting forever to make sure things get put in the right place and then the +right code runs as a response. I then have to make sure these things get put +in the right places and then that the right versions of things are running for +each of the relevant services. This doesn't scale very well, not to mention is +hard to secure. This leads to a lot of duplicate infrastructure over time and +as things grow. Not to mention adding in tracing, metrics and log aggreation. + +I would like to change this. + +I would like to make a perscriptive environment kinda like [Google Cloud Functions][gcf] +or [AWS Lambda][lambda] backed by a durable message queue and with handlers +compiled to webassembly to ensure forward compatibility. As such, the ABI +involved will be versioned, documented and tested. Multiple ABI's will eventually +need to be maintained in parallel, so it might be good to get used to that early +on. + +You should not have to write ANY code but the bare minimum needed in order to +perform your buisiness logic. You don't need to care about distributed tracing. +You don't need to care about logging. + +I want this project to last decades. I want the binary modules any user of Olin +would upload today to be still working, untouched, in 5 years, assuming its +dependencies outside of the module still work. + +Since this requires a stable ABI in the long run, I would like to propose the +following _unstable_ ABI as a particularly minimal starting point to work out +the ideas at play, and see how little of a surface area we can expose while +still allowing for useful programs to be created and run. + +## Dagger + +> The dagger of light that renders your self-importance a decisive death + +Dagger is the first ABI that will be used for interfacing with the outside world. +This will be mostly for an initial spike out of the basic ideas to see what it's +like while the rest of the plan is being stabilized and implemented. +The core idea is that everything is a file, to the point that the file descriptor +and file handle array are the only real bits of persistent state for the process. +HTTP sessions, logging writers, TCP sockets, operating system files, cryptographic +random readers, everything is done via filesystem system calls. + +Consider this the first draft of Dagger, everything here is subject to change. +This is going to be the experimental phase. + +### Base Environment + +When a dagger process is opened, the following files are open: + +- 0: standard input: the semantic "input" of the program. +- 1: standard output: the standard output of the program. +- 2: standard error: error output for the program. + +Any memory address above byte 4096 is free for implementing applications to use. +Memory addresses below byte 4096 are reserved for future internal-only use. + +### File Handlers + +In the open call (defined later), a file URL is specified instead of a file name. +This allows for Dagger to natively offer programs using it quick access to common +services like HTTP, logging or pretty much anything else. + +I'm playing with the following handlers currently: + +- http and https (Write request as http/1.1 request and sync(), Read response as http/1.1 response and close()) `http://ponyapi.apps.xeserv.us/newest` + +I'd like to add the following handlers in the future: + +- file - filesystem files on the host OS (dangerous!) `file:///etc/hostname` +- tcp - TCP connections `tcp://10.0.0.39:1337` +- tcp+tls - TCP connections with TLS `tcp+tls://10.0.0.39:31337` +- meta - metadata about the runtime or the event `meta://host/hostname`, `meta://event/created_at` +- project - writers of other event types for this project (more on this, again, in future posts) `project://example.UserLoginEvent` +- rand - cryptographically secure random data good for use in crypto keys `rand://` +- time - unix timestamp in a little-endian encoded int64 on every read() - `time://utc` + +### Handler Function + +Each Dagger module can only handle one data type. This is intentional. This +forces users to make a separate handler for each type of data they want to +handle. The handler function reads its input from standard input and then +returns `0` if whatever it needs to do "worked" (for some definition of success). +Each ABI, unfortunately, will have to have its own "main" semantics. For Dagger, +these semantics are used: + +- The entrypoint is exposed func `handle` that takes no arguments and returns an int32. +- The input message packet is on standard input implicitly. +- Returning 0 from func `handle` will mark the event as a success, returning anything else will mark it as a failure and trigger an automatic retry. + +In clang in C mode, you could define the entrypoint for a handler module like this: + +```c +// handle_nothing.c + +__attribute__ ((visibility ("default"))) +int handle() { + // read all of standard input to memory and handle it + return 0; // success +} +``` + +### System Calls + +A [system call][syscall] is how computer programs interface with the outside +world. When a dagger program makes a system call, the amount of time the program +spends waiting for that system call is collected and recorded based on what +underlying resource took care of the call. This means, in theory, users of olin +could alert on HTTP requests from one service to another taking longer amounts +of time very trivially. + +Dagger uses the following system calls: + +- open +- close +- read +- write +- sync + +Each of the system calls will be documented with their C and WebAssembly Text format +type/import definitions and a short bit of prose explaining them. A future +blogpost will outline the implementation of Dagger's system calls and why the +choices made in its design were made. + +#### open + +```c +extern int open(const char *furl, int flags); +(func $open (import "dagger" "open") (param i32 i32) (result i32)) +``` + +This opens a file with the given file URL and flags. The flags are only relevant +for some backend schemes. Most of the time, the flags argument can be set to `0`. + +#### close + +```c +extern int close(int fd); +(func $close (import "dagger" "close") (param i32) (result i32)) +``` + +Close closes a file and returns if it failed or not. If this call returns nonzero, +you don't know what state the world is in. Panic. + +#### read + +```c +extern int read(int fd, void *buf, int nbyte); +(func $read (import "dagger" "read") (param i32 i32 i32) (result i32)) +``` + +Read attempts to read up to count bytes from file descriptor fd into the buffer +starting at buf. + +#### write + +```c +extern int write(int fd, void *buf, int nbyte); +(func $write (import "dagger" "write") (param i32 i32 i32) (result i32)) +``` + +Write writes up to count bytes from the buffer starting at buf to the file +referred to by the file descriptor fd. + +#### sync + +```c +extern int sync(int fd); +(func $sync (import "dagger" "sync") (param i32) (result i32)) +``` + +This is for some backends to forcibly make async operations into sync operations. +With the HTTP backend, for example, calling sync actually kicks off the +dependent HTTP request. + +## Go ABI + +Olin also includes support for running webassembly modules created by [Go 1.11's webassembly support](https://golang.org/wiki/WebAssembly). +It uses [the `wasmgo` ABI][wasmgo] package in order to do things. Right now +this is incredibly basic, but should be extendable to more things in the future. + +As an example: + +```go +// +build js,wasm ignore +// hello_world.go + +package main + +func main() { + println("Hello, world!") +} +``` + +when compiled like this: + +```console +$ GOARCH=wasm GOOS=js go1.11 build -o hello_world.wasm hello_world.go +``` + +produces the following output when run with the testing shim: + +``` +=== RUN TestWasmGo/github.com/Xe/olin/internal/abi/wasmgo.testHelloWorld +Hello, world! +--- PASS: TestWasmGo (1.66s) + --- PASS: TestWasmGo/github.com/Xe/olin/internal/abi/wasmgo.testHelloWorld (1.66s) +``` + +Currently Go binaries cannot interface with the Dagger ABI. There is [an issue](https://github.com/Xe/olin/issues/5) +open to track the solution to this. + +Future posts will include more detail about using Go on top of Olin, including +how support for Go's compiled webassembly modules was added to Olin. + +## Project Meta + +To follow the project, check it on GitHub [here][olin]. To talk about it on Slack, +join the [Go community Slack][goslack] and join `#olin`. + +Thank you for reading this post, I hope it wasn't too technical too fast, but +there is a lot of base context required with this kind of technology. I will +attempt to make things more detailed and clear in future posts as I come up with +ways to explain this easier. Please consider this the 10,000 mile overview of +a very long-term project that radically redesigns how software should be written. + +[gcf]: https://cloud.google.com/functions/ +[lambda]: https://aws.amazon.com/lambda/ +[syscall]: https://en.wikipedia.org/wiki/System_call +[olin]: https://github.com/Xe/olin +[goslack]: https://invite.slack.golangbridge.org +[wasmgo]: https://github.com/Xe/olin/tree/master/internal/abi/wasmgo