forked from cadey/xesite
blog/lokahi: fix errors i thought i fixed
parent ce6f0462f8
commit 74ad1ae964
@@ -6,21 +6,13 @@ github_issue: https://github.com/Xe/lokahi/issues/15
 
 # Introducing Lokahi
 
-This week at Heroku, there was a hackweek. I decided to tackle a few problems at
-once and this is the result. The two big things I wanted to tackle were building
-a scalable HTTP health checking service and unlocking the "flow" state of
-consciousness to make developing, understanding and improving this project a lot
-easier.
-
-## lokahi
-
 Lokahi is a http service uptime checking and notification service. Currently
 lokahi does very little. Given a URL and a webhook URL, lokahi runs checks every
 minute on that URL and ensures it's up. If the URL goes down or the health
 workers have trouble getting to the URL, the service is flagged as down and a
 webhook is sent out.
 
-### Stack
+## Stack
 
 | What | Role |
 | :-------- | :------------ |
@@ -31,13 +23,13 @@ webhook is sent out.
 | Nats | Message queue |
 | Cobra | CLI |
 
-### Components
+## Components
 
 Interrelation graph:
 
 ![interrelation graph of lokahi components, see /static/img/lokahi.dot for the graphviz]("/static/img/lokahi.png")
 
-#### lokahictl
+### lokahictl
 
 The command line interface, currently outputs everything in JSON. It currently
 has a few options:
@@ -69,7 +61,7 @@ Use "lokahictl [command] --help" for more information about a command.
 
 Each of these subcommands has help and most of them have additional flags.
 
-#### lokahid
+### lokahid
 
 This is the main API server. It exposes twirp services defined in [`xe.github.lokahi`](https://github.com/Xe/lokahi/blob/master/rpc/lokahi/lokahi.proto)
 and [`xe.github.lokahi.admin`](https://github.com/Xe/lokahi/blob/master/rpc/lokahiadmin/lokahiadmin.proto).
@@ -93,19 +85,19 @@ PORT=9001
 Every minute, lokahid will scan for every check that is set to run minutely and
 run them. Running checks any time but minutely is currently unsupported.
 
-#### healthworker
+### healthworker
 
 healthworker listens on nats queue `check.run` and returns health information
 about that service.
 
-#### webhookworker
+### webhookworker
 
 webhookworker listens on nats queue `webhook.egress` and sends webhooks based on
 the input it's given.
 
-### Challenges Faced During Development
+## Challenges Faced During Development
 
-#### ORM Issues
+### ORM Issues
 
 Initially, I implemented this using [gorm](https://github.com/jinzhu/gorm) and
 started to run into a lot of problems when using it in anything but small
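The context lines in the hunk above describe a three-stage flow: lokahid scans for minutely checks, healthworker probes each URL, and webhookworker sends out notifications. The following is a rough Go sketch of that flow, not lokahi's actual code. The `Check` struct, the function names, the rule that any non-2xx response counts as down, and the choice to notify on every state change are all assumptions made for illustration. A plain Go channel stands in for the NATS queues `check.run` and `webhook.egress`.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Check is a hypothetical record; lokahi's real schema lives in its protobuf files.
type Check struct {
	ID         string
	URL        string
	WebhookURL string
	Every      int    // seconds between runs; the post says only 60 is supported
	State      string // last known state, "up" or "down"
}

// dueMinutely returns the checks that are set to run every minute.
func dueMinutely(checks []Check) []Check {
	var due []Check
	for _, c := range checks {
		if c.Every == 60 {
			due = append(due, c)
		}
	}
	return due
}

// classify maps a request outcome onto a state. Treating every non-2xx
// status as "down" is an assumption, not lokahi's documented rule.
func classify(statusCode int, err error) string {
	if err != nil || statusCode < 200 || statusCode >= 300 {
		return "down"
	}
	return "up"
}

// httpProbe returns a probe function backed by a real HTTP client with a timeout.
func httpProbe(timeout time.Duration) func(url string) string {
	client := &http.Client{Timeout: timeout}
	return func(url string) string {
		resp, err := client.Get(url)
		if err != nil {
			return classify(0, err)
		}
		defer resp.Body.Close()
		return classify(resp.StatusCode, nil)
	}
}

// runOnce probes every due check and forwards the ones whose state changed
// to egress, which stands in for the webhook.egress queue.
func runOnce(probe func(url string) string, checks []Check, egress chan<- Check) {
	for _, c := range dueMinutely(checks) {
		if next := probe(c.URL); next != c.State {
			c.State = next
			egress <- c
		}
	}
}

func main() {
	checks := []Check{
		{ID: "duke", URL: "http://duke.example/", WebhookURL: "http://samplehook.example/", Every: 60, State: "up"},
	}
	egress := make(chan Check, len(checks))
	// A stub probe keeps the example offline; use httpProbe(10*time.Second) for real requests.
	runOnce(func(string) string { return "down" }, checks, egress)
	close(egress)
	for c := range egress {
		fmt.Printf("would notify %s: check %s is now %s\n", c.WebhookURL, c.ID, c.State)
	}
}
```

Passing the probe in as a function keeps the scheduling logic testable without network access. A real deployment would call `runOnce` from a one-minute `time.Ticker` loop.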
@@ -117,7 +109,7 @@ I rewrote this to use [`database/sql`](https://godoc.org/database/sql) and
 [`sqlx`](https://godoc.org/github.com/jmoiron/sqlx) and all of the tests passed
 the first time I tried to run this, no joke.
 
-#### Scaling to 50,000 Checks
+### Scaling to 50,000 Checks
 
 This one was actually a lot harder than I thought it would be, and not for the
 reasons I thought it would be. One of the main things that I discovered when
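The scaling section touched by these hunks states a target of 50,000 HTTP checks in a minute, which works out to roughly 834 checks per second sustained. The diff does not show how lokahi reaches that figure, so the Go sketch below only illustrates one generic technique for such a workload: a fixed-size worker pool fed from a channel. The function names `ratePerSecond` and `runPool` and the worker count of 64 are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// ratePerSecond is the sustained throughput needed to finish total checks
// within windowSeconds, rounded up to a whole number.
func ratePerSecond(total, windowSeconds int) int {
	return (total + windowSeconds - 1) / windowSeconds
}

// runPool runs fn for job IDs 0..total-1 on a fixed number of workers and
// returns how many jobs completed.
func runPool(total, workers int, fn func(job int)) int {
	jobs := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				fn(j)
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}

	// Feed jobs, then close the channel so the workers exit their loops.
	for j := 0; j < total; j++ {
		jobs <- j
	}
	close(jobs)
	wg.Wait()
	return done
}

func main() {
	fmt.Println("checks per second needed:", ratePerSecond(50000, 60))
	// The no-op job stands in for one HTTP health check.
	fmt.Println("jobs completed:", runPool(50000, 64, func(int) {}))
}
```

Capping the number of workers bounds open sockets and memory. The right cap depends on check latency and is something to measure, not something this sketch can prescribe.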
@@ -134,7 +126,7 @@ This service can handle 50,000 HTTP checks in a minute. The only part that gets
 backed up currently is webhook egress, but that is likely fixable with further
 optimization on the HTTP checking and webhook egress paths.
 
-### Basic Usage
+## Basic Usage
 
 To set up an instance of lokahi on a machine with [Docker Compose](https://docs.docker.com/compose/)
 installed, create a docker compose manifest with the following in it:
@@ -222,7 +214,7 @@ services:
 
 Start this with `docker-compose up -d`.
 
-#### Configuration
+### Configuration
 
 Open `~/.lokahictl.hcl` and enter in the following:
 
@@ -232,7 +224,7 @@ server = "http://AzureDiamond:hunter2@127.0.0.1:24253"
 
 Save this and then lokahictl is now configured to work with the local copy of lokahi.
 
-#### Creating a check
+### Creating a check
 
 To create a check against duke reporting to samplehook:
 
@@ -260,7 +252,7 @@ $ docker-compose -f samplehook
 playbook url: https://github.com/Xe/lokahi/wiki/duke-of-york-Playbook
 ```
 
-### Webhooks
+## Webhooks
 
 Webhooks get a HTTP POST of a protobuf-encoded [`xe.github.lokahi.CheckStatus`](https://github.com/Xe/lokahi/blob/13bc98ff0665ab13044f08d51ed2141ca0c38647/rpc/lokahi/lokahi.proto#L83)
 with the following additional HTTP headers:
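The Webhooks context above says receivers get an HTTP POST whose body is a protobuf-encoded `CheckStatus`. A receiver might start from something like the Go sketch below. It checks the method and reads a size-capped body, but does not decode the protobuf, because that needs code generated from lokahi's `.proto` file. The 1 MiB cap, the `/webhook` path mentioned in the comments, and the names `validate` and `handleWebhook` are assumptions. The additional headers are not shown in this diff, so the sketch does not inspect any.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
)

// maxBody caps how much of a webhook body is read; 1 MiB is an arbitrary choice.
const maxBody = 1 << 20

// validate rejects requests that cannot be a lokahi webhook as described in
// the post: the method must be POST and the body must not be empty.
func validate(method string, body []byte) error {
	if method != http.MethodPost {
		return errors.New("webhooks must be sent with POST")
	}
	if len(body) == 0 {
		return errors.New("empty webhook body")
	}
	return nil
}

// handleWebhook reads the body and validates it. Decoding the bytes into a
// CheckStatus message is left out of this sketch.
func handleWebhook(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(io.LimitReader(r.Body, maxBody))
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	if err := validate(r.Method, body); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	log.Printf("received %d bytes of protobuf-encoded CheckStatus", len(body))
	w.WriteHeader(http.StatusNoContent)
}

func main() {
	// To serve for real, register the handler with
	// http.HandleFunc("/webhook", handleWebhook) and call http.ListenAndServe.
	// Here only the validation step is exercised so the example exits at once.
	fmt.Println(validate(http.MethodPost, []byte{0x0a, 0x04}))
	fmt.Println(validate(http.MethodGet, nil))
}
```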
@@ -280,7 +272,7 @@ receivers.
 JSON webhook support is not currently implemented, but is being tracked at
 [this github issue](https://github.com/Xe/lokahi/issues/4).
 
-### Call for Contributions
+## Call for Contributions
 
 Lokahi is pretty great as it is, but to be even better lokahi needs a bunch
 of work, experience reports and people willing to contribute to the project.