talks: add systemd: the good parts

Signed-off-by: Christine Dodrill <me@christine.website>
2021-05-15 13:43:00 -04:00
1 changed files with 258 additions and 0 deletions
--- a/talks/systemd-the-good-parts-2021-05-16.markdown
+++ b/talks/systemd-the-good-parts-2021-05-16.markdown
@ -0,0 +1,258 @@
+---
+title: "systemd: The Good Parts"
+date: 2021-05-16
+slides_link: https://docs.google.com/presentation/d/1a0XaGu87xUcpQQVLkrnXKoKrdpN1ObiPrG9aGYVMw7k/edit?usp=sharing
+---
+
+# systemd: The Good Parts
+
+[Video](https://youtu.be/TJdKXq197Qk)
+
+<center><iframe width="560" height="315" src="https://www.youtube.com/embed/TJdKXq197Qk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
+
+The slides link will be at the end of the post.
+
+Hello, I'm Xe and today I'm going to do a talk about systemd. More specifically
+the good parts of systemd. This talk is going to go fast because there's a lot
+of material to cover and the notes are going to be on my website. I have been an
+Alpine user for almost a decade and it's one of my favorite linux distributions.
+
+The best things in life come with disclaimers and here are the disclaimers for this talk:
+- This talk may contain opinions. These opinions are my own and not necessarily
+  the opinions of my employer.
+- This talk is not evangelism. This talk is intended to show how green the grass
+  is on the other side and how Alpine can benefit from these basic ideas.
+- This talk also contains images of cartoon marine animals.
+
+[What is systemd?](conversation://Mara/hmm)
+
+When doing a talk about a thing I find it helps to start with a good definition
+of what that thing is. Given this talk is about systemd let's start with what
+systemd is.
+
+<center>
+
+![A map of systemd components](https://www.linux.com/images/stories/41373/Systemd-components.png)
+
+</center>
+
+systemd is a set of building blocks that you can use to make a linux system.
+This diagram covers most of the parts of systemd. There is everything from
+service management to log management to boot time analysis, network
+configuration, and user logins; but we're only going to cover a tiny fraction of
+this diagram. At a high level systemd provides a common set of tools that you
+can build a linux system with; kind of like lego bricks. It does just manage
+services but it does more than just service management.
+
+Something else that's useful to ask is "why does systemd exist?" Well, looking
+back at that diagram, computers are actually fairly complicated. There's a lot
+going on over here. There's log management, there's disk management, there's
+service sequencing, network configuration, containers user sessions and most
+importantly all of these things need to happen in order or bad things can
+happen. I mentioned that systemd is more than just a service service manager
+because it has optional components that manage things like dns resolution,
+network devices, user sessions, and user level services among other things.
+
+One of the big differences between systemd and other things like OpenRC is that
+systemd is a very declarative environment. In declarative environments you
+specify what you want and the system will figure out what it needs to do to get
+there. In an imperative environment you specify all of the steps you need to do
+to get there. It's the difference between writing a sql statement and a for
+loop.
+
+So, pretend that this somewhat realistic scenario is happening to you: it's 4:00
+am you just got a panicked call from someone at your company that the website is
+down. You log into a server and you want to see if the website is actually down
+or if it's just dns. You probably want to know the answers to these basic
+questions:
+
+- Does the service manager think your service is running?
+- How much ram is it using?
+- Does it have any child processes?
+- Has it reported it is healthy?
+- How much traffic has it used?
+- What are the last few log lines?
+- If you need to reboot the server right now for some reason, will that service
+  come back up on reboot?
+  
+![](https://cdn.christine.website/file/christine-static/blog/Screen+Shot+2021-05-11+at+23.02.15.png)
+
+systemd includes a tool called systemctl that allows you to query the status of
+services as well as start and stop them; but for right now we're going to look
+at the systemctl status subcommand. Here is the output for the systemctl status
+command for the service powering christine.website. So let's go down the list:
+
+- Is the service running? If you look at the red box right there you can see
+  that it says say the service has been running for nine hours.
+- How much ram is it using? If you look at the red box there it says it's using
+  about 200 megs of ram.
+- How many child processes are there if you look at the red box it'll show you
+  all of the processes in the service's cgroup. In this case we'll see that
+  there's just one process.
+- How much network traffic has it been using? If we look here in the red box you
+  can see it's had about a megabyte of traffic in and somewhat less than a
+  megabyte of traffic out. My website serves everything over a unix socket and
+  those numbers aren't reflected here but it's actually much higher.
+- At the bottom we can see the last few log lines. These are just random
+  requests that people make to my blog.
+  
+[Where did it get those logs from?](conversation://Mara/hmm)
+
+If you haven't seen all of this in action before you might be wondering
+something like "Wait, where did it get those logs from?"
+
+I mentioned systemd does more than just start services. systemd has a common log
+sink called the journal. Logs from the kernel, network devices, services, and
+even some other system sources that you may not think are important
+automatically get put into the journal. It's similar to Windows event logs or
+the console app in macOS except it's implicit instead of explicit (Windows and
+macOS make you use some weird logging calls to make sure that log lines actually
+get in there, but systemd will capture the standard output, standard error and
+syslog for every service managed by systemd). Something neat about the journal
+is that it lets you tail the logs for the entire system with one command:
+`journalctl -f`. Here's that command running on a server of mine:
+
+![journalctl output](https://cdn.christine.website/file/christine-static/blog/Screen+Shot+2021-05-15+at+11.04.17.png)
+
+There's a lot more to the journal involving structured logging, automatically
+streaming the logs to places, and advanced filtering based off of different
+units, services, or other arbitrary fields; however that is out of scope for
+this talk. The important part is that it has support for that in case you
+actually need it.
+
+Now this is all great, and you might be think asking yourself "well, yeah, this
+stuff is cool; but how does Alpine fit into this? Alpine can't run systemd
+because systemd is glibc specific." However we're not talking about systemd
+directly, we're talking about the philosophies involved and the truth is that
+this kind of experience is what people already have elsewhere. By not having
+something competitive Alpine is less and less attractive for newer production
+deployments.
+
+Now there's at least four classes of benefits for systemd and I'm going to break
+them down into the following groups:
+- developers
+- packagers
+- system administrators
+- users
+ 
+In general people that are developing services that run on systemd get the following benefits:
+
+- Predictability. systemd configuration files are declarative rather than
+  imperative. You declare units instead of imperatively building up init
+  scripts. Options are declared and enforced by the service manager. This makes
+  it a lot easier to review changes for correctness. 
+- Portability. when setting up a service with systemd there's only one syntax to
+  learn across 15 plus different distributions. This means that you don't have
+  to maintain a giant pile of hacks to make the program just start consistently
+  across different distributions and you can only care about the systemd unit
+  that will make everything happen for you. Before systemd was widespread every
+  distribution had their own unique special snowflake configuration for init
+  systems and it really just wasn't that nice to deal with. Ubuntu had different
+  opinions from Debian, Debian and opensuse had different opinions, and centos
+  was way out in the weeds and it just became hard to do this consistently
+  across distributions. Something declarative like systemd makes doing it across
+  distributions a lot easier by comparison.
+- One of the other big things that it has is a api for controlling things with
+  dbus. Now, say what you will about dbus but dbus does have some very rich
+  introspection capabilities, as well as giving you the ability to integrate
+  with system services at a level that more closely resembles what you get on
+  windows or macOS (or even something like sel4 with microkernel message
+  passing). You don't have to shell out to commands and pray the output format
+  didn't change. You don't have to do some weird calls to unix sockets. It uses
+  standard apis and allows you to integrate things more tightly with the system.
+  Gnome for example uses systemd to trigger suspend and shutdown, as well as
+  having a way a little gui to query the systemd journal. Server software can
+  subscribe to units being started for auditing purposes and such.
+
+Packagers or people that are putting software into packages get the following benefits:
+
+- It is a lot easier to write a systemd unit than it is to write an OpenRC
+  script. systemd units are very bland and boring, they look like ini files. It is
+  going to be pretty obvious that it just does what it does and there's nothing
+  special going on. And because of this declarative syntax it makes human error
+  a lot more obvious and it is a lot easier for other humans to review.
+- Now, don't get me wrong, shell scripts for service definitions have gotten
+  us a very long way and are likely to stay around for a very long time (I
+  actually use shell scripts with most of my systemd services to do weird things
+  with environment variables for configuration). However, shell scripting is a
+  very, very subtle art and it is very easy to mess up and do things that are
+  very unpredictable if you are not extremely careful. The declarative syntax of
+  systemd removes the ability for you to mess up formatting shell scripts; or at
+  the very least it isolates the flaws of the shell script to the exact service
+  running and not things like the user that the service is running under.
+
+system administrators of systemd systems also get the following benefits:
+
+- systemctl status and a lot of other parts of systemctl let you see what the
+  system or an individual service is doing without having to wonder if it's
+  actually working or not. In general the lazy the lazy thing is the thing that
+  you want to optimize for because people are distracted. There is a lot going
+  on sometimes and if you optimize it so that the easiest thing to do is the
+  correct thing then it is a lot easier to deal with when you have a distracted
+  operator. systemd is set up so that it's hard to do the wrong thing. It is
+  hard to have logs go anywhere about the system journal. It is hard to write a
+  unit that doesn't tell you if the service is actually running or not. And it
+  makes it so that the path of least resistance will do most of what you want. 
+- Sometimes system administrators have opinions that are different than the
+  opinions of the packager. Sometimes you need to change environment variables
+  for http proxies or something and sometimes you believe the packager has
+  different opinions than you do about how something should be run. In OpenRC
+  you'd have to make a copy of the init script, make your changes, and then hope
+  those changes don't get blown away when the package updates. systemd has a
+  first-class mechanism for doing this called drop-in units that allow you to
+  customize parts of a systemd service so that you can override exactly what you
+  need to (and only that) and systemd will turn the all of those into one big
+  logical unit and actually go off and run that. This has been very useful in
+  practice.
+- Another thing that is kind of endemic to sysvinit and OpenRC systems is the
+  fact that unless you are careful and configure it right cron job output will
+  just go to nowhere and there is not really an easy way to figure out if a cron
+  job actually ran and if it errored or if it did exactly what you wanted. If I
+  recall there was actually an entire small startup that was formed around just
+  alerting for cron jobs that were not doing what they should be doing. systemd
+  changes this systemd because all of the logs are in the journal. If you set up
+  a systemd timer (which is the systemd land equivalent to a cron job) all of
+  the output for the service associated with that timer gets put into the
+  journal and you can see exactly what went wrong so you can go off and fix it.
+  This has saved me so much time and headache trying to do this stuff manually.
+- Another thing that you can do is you can group services together with targets
+  which are kind of like named runlevels. Targets let you specify the difference
+  between the system booting the network stack is configured and all of the
+  services needed for your app are running. You can get a list of dependencies
+  from systemd for any service and you can also use that to help you plan
+  incident response, so it is more difficult to have hidden dependencies.
+
+As far as users go:
+- systemd is not limited to just managing system level services systemd can also
+  manage user services with systemd user mode. I use this on my Linux system in
+  order to have a couple services running in the background querying for weather
+  or a couple other api calls to put them into my status bar on my tiling window
+  manager (sway). I have another one that runs emacs in server mode so that I can
+  have one giant emacs session that will automatically start on login. I can put
+  hundreds and hundreds of buffers in there and not have to worry about it. I can
+  spawn new emacs frames instantly, it's really beautiful.
+- You can also query all of the system journal logs as a normal user and you
+  don't have to sudo up and go into the logs folder. So if you just want to take
+  a quick look at something, you don't have to type in your password or hit a
+  yubikey press or whatever you have configured.
+
+[I hope Alpine ends up with something similar!](conversation://Mara/happy)
+
+I really hope alpine comes up with something similar to systemd. Alpine can
+really benefit from a tightly integrated service manager that does at least some
+of the things that systemd does. Declarative really is better than imperative
+because declarative is easier for distracted operators.
+
+People get distracted. It happens, and when distracted people do things it can
+sometimes have bad consequences. So if we make the tools powerful, but
+implicitly correct, then it will just be a lot better overall and users will
+have a lot less worry involved.
+
+On that note we are very close to hitting time so here's my shout outs to people
+who either help make this talk happen or I think are cool.
+
+if you have any questions please feel free to ping me on twitter, in the irc
+room, or on the compact page on my website. I enjoy these kinds of questions and
+I openly welcome you to ask them.
+
+Thank you, have a good day.