# iconia: A Service Gateway
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in RFC 2119.
## Abstract
Operating TCP services in production can be troublesome in practice. At small
scale it is easy to have a single server terminating a single service. As a
deployment grows, however, this becomes more and more difficult because the
complexity of the setup increases.

We need a better option.

Iconia is a service gateway that integrates into [Caddy][caddy] to allow for a
better experience operating complicated applications at scale. The ultimate
goal of Iconia is to remove the need for a direct TCP line of sight from the
load balancer to the backend services. Instead, the Iconia agent connects to
the load balancer and forwards traffic to the backend, much like sidecar
proxies do in Envoy-based service meshes such as Istio.
### Name Origin
The name of Iconia is a reference to the [Iconian gateway][gateway] from the
Star Trek franchise. It is a gateway that allows for instant travel to distant
points in the galaxy in ways that bypass shields and other forms of security.
The name is used here because Iconia is a gateway to distant services that
bypasses NAT and other ways that ports get blocked.
## Components
Iconia is made up of several components:
- [Caddy][caddy] as the ingress/TLS terminating component
- The Iconia agent running alongside the application being exposed to the world
- The application being exposed via Iconia
### Caddy Plugin
The Caddy plugin for Iconia MUST perform the following operations:
- Gather metrics about the performance of each backend:
  - Roundtrip time
  - Time to first byte
  - Number of connections
  - Time it takes to do healthchecks
  - A healthcheck metric defined by the backend application
- Gather metrics about the gateway as a whole:
  - Number of connected backends
  - Number of authentication failures
- Have some method to intelligently select backends based on the following
  criteria:
  - Load of the backend service instance in question
  - Client remote IP address and port number
  - Backend health status (i.e. it MUST NOT route to backends that are marked
    as unhealthy)
- Expose a protocol or method for backend services to connect to the
  concentrator
  - Evaluate [smux][smux], [SSH][ssh] and [QUIC][quic]
- Ensure that only authorized agents are allowed to register as backends
- Expose an API for controlling Iconia operations
  - List active backends
  - See details about a given backend by connection ID
  - Kill an arbitrary backend by connection ID
- Log messages to the standard Caddy logging sink
- Route requests and responses to and from the discovered backend
  efficiently
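
The selection criteria above could be combined in many ways; this document does not prescribe one. The following Go sketch shows one possible approach under stated assumptions: the `Backend` struct, its fields, and `SelectBackend` are hypothetical names, load is assumed to be a number where lower is better, and the client address is only used to break ties between equally loaded backends.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Backend is a hypothetical record the gateway keeps per connected agent.
type Backend struct {
	ConnID  string  // connection ID assigned by the gateway
	Healthy bool    // result of the most recent healthcheck
	Load    float64 // site-defined load metric; assumed: lower means less loaded
}

// ErrNoHealthyBackend is returned when every backend is marked unhealthy.
var ErrNoHealthyBackend = errors.New("no healthy backend available")

// SelectBackend never returns an unhealthy backend. Among healthy backends it
// keeps those with the lowest load and uses a hash of the client address to
// pick between them, so a given client tends to land on the same backend.
func SelectBackend(backends []Backend, clientAddr string) (Backend, error) {
	var candidates []Backend
	for _, b := range backends {
		if !b.Healthy {
			continue
		}
		switch {
		case len(candidates) == 0 || b.Load < candidates[0].Load:
			candidates = []Backend{b}
		case b.Load == candidates[0].Load:
			candidates = append(candidates, b)
		}
	}
	if len(candidates) == 0 {
		return Backend{}, ErrNoHealthyBackend
	}
	h := fnv.New32a()
	h.Write([]byte(clientAddr))
	return candidates[int(h.Sum32()%uint32(len(candidates)))], nil
}

func main() {
	backends := []Backend{
		{ConnID: "a", Healthy: true, Load: 0.7},
		{ConnID: "b", Healthy: false, Load: 0.1},
		{ConnID: "c", Healthy: true, Load: 0.2},
	}
	b, err := SelectBackend(backends, "203.0.113.7:51234")
	fmt.Println(b.ConnID, err)
}
```

With the sample data this prints `c <nil>`: backend `b` has the lowest load but is skipped because it is unhealthy.
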

The Caddy plugin for Iconia MUST allow backends for multiple hosts to connect
via the same TCP/UDP port.
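
One way to serve several hosts from a single listening port is to have each agent declare the host it serves when it registers, and to key the backend table on that host. The Go sketch below illustrates the idea only; the `Registry` type, its methods, and the idea that the host is announced at registration time are assumptions, not part of this specification.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical table mapping a hostname to the connection IDs
// of the agents currently serving it. All agents share one listening port.
type Registry struct {
	mu     sync.RWMutex
	byHost map[string][]string
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{byHost: make(map[string][]string)}
}

// Register records that the agent with connID serves host.
func (r *Registry) Register(host, connID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byHost[host] = append(r.byHost[host], connID)
}

// Unregister removes connID from host, e.g. when a backend is killed by ID.
func (r *Registry) Unregister(host, connID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	ids := r.byHost[host]
	for i, id := range ids {
		if id == connID {
			r.byHost[host] = append(ids[:i], ids[i+1:]...)
			break
		}
	}
	if len(r.byHost[host]) == 0 {
		delete(r.byHost, host)
	}
}

// Backends returns a copy of the connection IDs registered for host.
func (r *Registry) Backends(host string) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return append([]string(nil), r.byHost[host]...)
}

func main() {
	reg := NewRegistry()
	reg.Register("blog.example.com", "conn-1")
	reg.Register("blog.example.com", "conn-2")
	reg.Register("api.example.com", "conn-3")
	fmt.Println(reg.Backends("blog.example.com"))
	fmt.Println(reg.Backends("api.example.com"))
}
```

The same table could back the "list active backends" and "kill by connection ID" operations of the control API.
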
### Iconia Agent
The Iconia agent MUST perform the following operations:
- Discover/gather configuration information from the environment and filesystem
- Connect to the gateway server in a durable and fault-tolerant manner
- Authenticate to the gateway server
- Listen for incoming TCP sessions from the gateway server and route them to the
backend service
- Utilize the [PROXY protocol][proxyprotocol] to ensure that the backend service
has accurate information about client IP addresses
### Backend Service
Backend services for Iconia MUST have the following properties:
- Understand the [PROXY protocol][proxyprotocol] so that the backend service
  has accurate information about client IP addresses
- Expose a healthcheck route:
  - On the HTTP host `iconia-healthcheck`
  - With the path `/health`
  - That MAY return `200` if everything is healthy with the body `OK`
  - And also MAY return `500` if everything is NOT healthy with the body
    containing a site-defined error message explaining what the issue is
  - This healthcheck MAY include the response header `X-Iconia-Load` to send
    the gateway a site-defined load metric to help the gateway choose the
    backend with the least load
- Accept connections over TCP on a known port
## Caveats
Routing traffic through Iconia adds an extra hop, which will add a slight
amount of latency to standard HTTP operations. Latency-sensitive applications
should probably avoid this tool in favor of traditional exposure methods.

The Go HTTP/2 stack does not currently support connection hijacking, which is
needed for the most efficient routing possible.

[caddy]: https://caddyserver.com
[gateway]: https://memory-alpha.fandom.com/wiki/Iconian_gateway
[smux]: https://github.com/xtaci/smux
[ssh]: https://godoc.org/golang.org/x/crypto/ssh
[quic]: https://godoc.org/github.com/lucas-clemente/quic-go
[proxyprotocol]: https://github.com/joyent/haproxy-1.5/blob/master/doc/proxy-protocol.txt