---
title: My Magical Adventure With cloud-init
date: 2021-06-04
---

# My Magical Adventure With cloud-init

> "If I had a world of my own, everything would be nonsense. Nothing would be
> what it is, because everything would be what it isn't. And contrary wise, what
> is, it wouldn't be. And what it wouldn't be, it would. You see?"

- The Mad Hatter, Alice's Adventures in Wonderland

The modern cloud is a magical experience. You take a template, give it some SSH keys and maybe some user-data and then you have a server running somewhere. This is all powered by a tool called [cloud-init](https://cloud-init.io/). cloud-init is most useful in actual datacenters with proper metadata services, but what if you aren't in a datacenter with a metadata service?

Recently I wanted to test a [script](https://github.com/tailscale/tailscale/blob/main/scripts/installer.sh) a coworker wrote that allows users to automatically install Tailscale on every distro and version Tailscale supports. I wanted to avoid having to install each version of every distribution manually, so I started looking for options.

[This may seem like overkill (and at some level it probably is), however as a side effect of going through this song and dance you can spin up a bunch of VMs pretty easily.
](conversation://Mara/hacker)

cloud-init has a feature called the [NoCloud](https://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html) data source. To use it, you need to write two yaml files, put them into a specially named ISO file and then mount it to the virtual machine. cloud-init will then pick up your configuration data and apply it.

[Wait...really? What.](conversation://Mara/hmm)

[Yes, really.](conversation://Cadey/coffee)

Let's make an [Amazon Linux 2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/amazon-linux-2-virtual-machine.html) virtual machine as an example. Amazon offers their Linux distribution for download so you can run it on-premises (I don't really know why you'd want to do this outside of testing stuff on Amazon Linux). In this blog we use KVM, so keep that in mind when you set things up yourself.

First you need to make a `meta-data` file. This will contain the VM's hostname and the "instance ID" (this makes sense in cloud contexts, but you can use whatever you want):

```yaml
local-hostname: mayhem
instance-id: 31337
```

[You can configure networking settings here, but our VM is going to get an address over DHCP so you don't really need to care about that in this case.](conversation://Mara/hacker)

Next you need to make a `user-data` file. This will actually configure your VM:

```yaml
#cloud-config
#vim:syntax=yaml

cloud_config_modules:
 - runcmd

cloud_final_modules:
 - [users-groups, always]
 - [scripts-user, once-per-instance]

users:
  - name: xe
    groups: [ wheel ]
    sudo: [ "ALL=(ALL) NOPASSWD:ALL" ]
    shell: /bin/bash
    ssh-authorized-keys:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPYr9hiLtDHgd6lZDgQMkJzvYeAXmePOrgFaWHAjJvNU cadey@ontos

write_files:
  - path: /etc/cloud/cloud.cfg.d/80_disable_network_after_firstboot.cfg
    content: |
      # Disable network configuration after first boot
      network:
        config: disabled
```

Please make sure to change the username and swap out the SSH key as needed, unless you want to get locked out of your VM.
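
Since a forgotten key swap means a locked-out VM, a small pre-flight check before building the seed image can help. This is only a rough sketch: `check_user_data` and its patterns are hypothetical names made up for this example, and the checks are plain string heuristics, not a real YAML or cloud-config validator.

```python
import re

# Key comment from the example user-data in this post. Finding it in your own
# file suggests the example key was never swapped out.
EXAMPLE_KEY_COMMENT = "cadey@ontos"

# Loose heuristic for an OpenSSH public key written as a YAML list item.
SSH_KEY_LINE = re.compile(
    r"^\s*-\s*(?:ssh-(?:ed25519|rsa)|ecdsa-sha2-\S+)\s+\S+", re.MULTILINE
)


def check_user_data(text):
    """Return a list of human-readable problems found in a user-data file."""
    problems = []
    lines = text.splitlines()
    # cloud-config user-data is expected to start with this header line.
    if not lines or lines[0].strip() != "#cloud-config":
        problems.append("first line must be '#cloud-config'")
    if not SSH_KEY_LINE.search(text):
        problems.append("no SSH public key found")
    if EXAMPLE_KEY_COMMENT in text:
        problems.append("still contains the example SSH key from the post")
    return problems
```

You could call `check_user_data(open("user-data").read())` and refuse to build the seed image when the returned list is not empty.
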
For more information about what you can do from cloud-init, see the list of modules [here](http://cloudinit.readthedocs.io/en/latest/topics/modules.html).

Now that you have the two yaml files you can make the seed image with this command (Linux):

```console
$ genisoimage -output seed.iso \
    -volid cidata \
    -joliet \
    -rock \
    user-data meta-data
```

[In NixOS you may need to run it inside nix-shell: `nix-shell -p cdrkit`.](conversation://Mara/hacker)

Or this command (macOS):

```console
$ hdiutil makehybrid \
    -o seed.iso \
    -hfs \
    -joliet \
    -iso \
    -default-volume-name cidata \
    user-data meta-data
```

Now you can download the KVM image from that [Amazon Linux User Guide page from earlier](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/amazon-linux-2-virtual-machine.html) and then put it somewhere safe. This image will be written into a [ZFS zvol](https://pthree.org/2012/12/21/zfs-administration-part-xiv-zvols/). To find out how big the zvol needs to be, you can use `qemu-img info`:

```console
$ qemu-img info amzn2-kvm-2.0.20210427.0-x86_64.xfs.gpt.qcow2
image: amzn2-kvm-2.0.20210427.0-x86_64.xfs.gpt.qcow2
file format: qcow2
virtual size: 25 GiB (26843545600 bytes)
disk size: 410 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false
```

The virtual disk image is 25 gigabytes, so you can create the zvol with a command like this:

```console
$ sudo zfs create -V 25G rpool/safe/vms/mayhem
```

Then you use `qemu-img convert` to copy the image into the zvol:

```console
$ sudo qemu-img convert \
    -O raw \
    amzn2-kvm-2.0.20210427.0-x86_64.xfs.gpt.qcow2 \
    /dev/zvol/rpool/safe/vms/mayhem
```

If you don't use ZFS you can make a layered disk using `qemu-img create`:

```console
$ qemu-img create \
    -f qcow2 \
    -F qcow2 \
    -o backing_file=amzn2-kvm-2.0.20210427.0-x86_64.xfs.gpt.qcow2 \
    mayhem.qcow2
```

Open up virt-manager and then create a new virtual machine.
Make sure you select "Manual install".
![The first step of the "create a new virtual machine" wizard in virt-manager with "manual install" selected](https://cdn.christine.website/file/christine-static/blog/20210604_06h43m27s_grim.png)
virt-manager will then ask you what OS the virtual machine is running so it can load some known working defaults. It doesn't have an option for Amazon Linux, but it's kinda sorta like CentOS 7, so enter CentOS 7 here.
![The second step of the "create a new virtual machine" wizard in virt-manager with "CentOS 7" selected as the OS the virtual machine will be running](https://cdn.christine.website/file/christine-static/blog/20210604_06h45m35s_grim.png)
The default amounts of RAM and CPU are fine, but you can choose other options if you have more restrictive hardware requirements.
![The third step of the "create a new virtual machine" wizard in virt-manager with 1024 MB of ram and 2 virtual CPU cores selected](https://cdn.christine.website/file/christine-static/blog/20210604_06h50m09s_grim.png)
Now you need to select the storage path for the VM. virt-manager will helpfully offer to create a new virtual disk for you. You already made the disk with the above steps, so enter in `/dev/zvol/rpool/safe/vms/mayhem` (or the path to your custom layered qcow2 from the above `qemu-img create` command) as the disk location.
![The fourth step of the "create a new virtual machine" wizard in virt-manager with `/dev/zvol/rpool/safe/vms/mayhem` selected as the path to the disk](https://cdn.christine.website/file/christine-static/blog/20210604_06h53m58s_grim.png)
Finally, name the VM and then choose "Customize configuration before install" so you can mount the seed data.
![The last step of the "create a new virtual machine" wizard in virt-manager, setting the virtual machine name to "mayhem" and indicating that you want to customize configuration before installation](https://cdn.christine.website/file/christine-static/blog/20210604_06h56m54s_grim.png)
Click on the "Add Hardware" button in the lower left corner of the configuration window.
![](https://cdn.christine.website/file/christine-static/blog/20210604_06h58m53s_grim.png)
Make a new CDROM storage device that points to your seed image:
![](https://cdn.christine.website/file/christine-static/blog/20210604_07h01m24s_grim.png)
And then click "Begin Installation". The virtual machine will be created and its graphical console will open. Click on the info tab and then the NIC device. The VM's IP address will be listed:
![](https://cdn.christine.website/file/christine-static/blog/20210604_07h05m28s_grim.png)

Now SSH into the VM:

```console
$ ssh xe@192.168.122.122
The authenticity of host '192.168.122.122 (192.168.122.122)' can't be established.
ED25519 key fingerprint is SHA256:TP7dWLkHOixx5tr78qn0yvDQKttH0yWz6IBvbadEqcs.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.122.122' (ED25519) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
8 package(s) needed for security, out of 17 available
Run "sudo yum update" to apply all updates.
[xe@mayhem ~]$
```

And voila! A new virtual machine that you can do whatever you want with, just like you would any other server.

[Do you really need to make an ISO file for this? Can't I just use HTTP like the AWS metadata service?](conversation://Mara/hmm)

Yes and no. You can have the configuration loaded over HTTP/S, but you won't be able to make `http://169.254.169.254` work like the AWS metadata service without special network configuration and a fair bit of effort. Either way, you are going to have to edit the virtual machine's XML though.

[XML? Why is XML involved?](conversation://Mara/wat)

virt-manager is a frontend to [libvirt](https://libvirt.org/index.html). libvirt uses XML to describe virtual machines. [Here](https://gist.github.com/Xe/f870ebb2d9dce0929a35a4ba347cbda3) is the XML used to describe the VM you made earlier. This looks like a lot (because frankly it is a lot, computers are complicated), but it is much more manageable than the equivalent qemu flags.

[What do the qemu flags look like?](conversation://Mara/hmm)

[Like this](https://gist.githubusercontent.com/Xe/2eba35ec6cbd54becf9fca02f6d69f0b/raw/89d68424c0ae26333d798bd9bd6a224dfec844d7/qemu%2520flags.txt). It is kind of a mess that I would rather have something made by people smarter than me take care of.

To enable cloud-init to load over HTTP, you are going to have to add the qemu XML namespace to mayhem's configuration. At the top you should see a line that looks like this:

```xml
<domain type='kvm'>
```

Replace it with one that looks like this:

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
```

This will allow you to set the cloud-init seed location information using a [SMBIOS value](https://en.wikipedia.org/wiki/System_Management_BIOS). To enable this, add the following to the _bottom_ of your XML file, just before the closing `</domain>`:

```xml
<qemu:commandline>
  <qemu:arg value='-smbios'/>
  <qemu:arg value='type=1,serial=ds=nocloud-net;s=http://HOST_IP:8000/mayhem/'/>
</qemu:commandline>
```

Replace `HOST_IP` with the address of the machine serving the seed files, and make sure the data is actually being served on that address. Here's a nix-shell python one-liner HTTP server:

```console
$ nix-shell -p python3 --run 'python -m http.server 8000'
```

Then you will need to either load the base image back into the zvol or recreate the qcow2 file to reset the VM back to its default state. Reboot the VM and wait for it to connect to your "metadata server":

```console
192.168.122.122 - - [04/Jun/2021 11:41:10] "GET /mayhem/meta-data HTTP/1.1" 200 -
192.168.122.122 - - [04/Jun/2021 11:41:10] "GET /mayhem/user-data HTTP/1.1" 200 -
```

Then you can SSH into it like normal:

```console
$ ssh xe@192.168.122.122
The authenticity of host '192.168.122.122 (192.168.122.122)' can't be established.
ED25519 key fingerprint is SHA256:eJRjDsvnVrXfntVtNVN6N+JdakaA+dvGKWWQP5OFkeA.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.122.122' (ED25519) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
8 package(s) needed for security, out of 17 available
Run "sudo yum update" to apply all updates.
[xe@mayhem ~]$
```

[Can I choose other distros for this?](conversation://Mara/hmm)

Yep! Most distributions offer cloud-init enabled images. They may be hard to find, but they do exist.
Here are some links that will help you with common distros:

- [Arch Linux](https://mirror.pkgbuild.com/images/) (use the `cloudimg` ones)
- [CentOS 7](https://cloud.centos.org/centos/7/images/) (use the `GenericCloud` one)
- [CentOS 8](https://cloud.centos.org/centos/8-stream/x86_64/images/) (use the `GenericCloud` one)
- [Debian 9](http://cloud.debian.org/images/cloud/OpenStack/9.13.22-20210531/) (use the `openstack` one)
- [Debian 10](http://cloud.debian.org/images/cloud/buster/20210329-591/) (use the `generic` one)
- [Debian 11](http://cloud.debian.org/images/cloud/bullseye/daily/) (use the `generic` one)
- [Fedora 34](https://alt.fedoraproject.org/cloud/) (use the Openstack image)
- [OpenSUSE Leap 15.2](https://download.opensuse.org/repositories/Cloud:/Images:/Leap_15.2/images/) (use the `OpenStack` image)
- [OpenSUSE Leap 15.3](https://get.opensuse.org/leap/) (use the JeOS one labeled `OpenStack-Cloud`)
- [OpenSUSE Tumbleweed](https://download.opensuse.org/tumbleweed/appliances/) (use the JeOS one labeled `Openstack-Cloud`)
- [Ubuntu](https://cloud-images.ubuntu.com/) (use the `server-cloudimg` image for your version of choice)

In general, look for images that are compatible with OpenStack. OpenStack uses cloud-init to configure virtual machines and the NoCloud data source you're using ships by default. It usually works out, except for cases like OpenSUSE Leap 15.1. With Leap 15.1 you have to [pretend to be OpenStack a bit more](https://github.com/tailscale/tailscale/blob/aa6abc98f30df67a0d86698b77932d4d9cc45ac0/tstest/integration/vms/opensuse_leap_15_1_test.go) for some reason.

[What if I need to template the userdata file?](conversation://Mara/hmm)

[You really should avoid doing this if possible. Templating yaml is a delicate process fraught with danger. The error conditions in things like Kubernetes are that it does the wrong thing and you need to replace the service. The error condition with this is that you lose access to your server.](conversation://Cadey/facepalm)

[Let's say that Facts and Circumstances™ made me have to template it.](conversation://Mara/happy)

When you are templating yaml, you have to be really careful. It is very easy to incur [the wrath of Norway and Ontario](https://hitchdev.com/strictyaml/why/implicit-typing-removed/) by accident with yaml. Here are some rules of thumb (unfortunately gained from experience) to keep in mind:

- yaml has implicit typing, quote everything to be safe.
- ensure that every value you pass in is yaml-safe
- ensure that the indentation matches for every value

Something very important is to test the templating on a virtual machine image that you have a back door into. Otherwise you will be locked out. You can generally hack around it by adding `init=/bin/sh` to your kernel command line and changing your password from there.

When you mess it up you will need to get into the VM somehow and do one of a few things:

1. Run `cloud-init collect-logs` to generate a log tarball that you can export to your host machine and dig into from there
2. Look through the system journal for any errors
3. Look in `/var/log` for files that begin with `cloud-init` and page through them

If all else fails, start googling. If you are running commands against a VM with the `runcmd` feature of cloud-init, I'd suggest going through the steps on a manually installed virtual machine image at least once so you can be sure the steps work. I have lost 4 hours of time to this. Also keep in mind that in the context that `runcmd` runs from, there is no standard input hooked up. You will need to pass `-y` everywhere.

If you want a simple Alpine Linux image to test with, look [here](https://github.com/Xe/alpine-image) for the Alpine Linux images I test with. You can download this image from [here](https://xena.greedo.xeserv.us/pkg/alpine/img/alpine-edge-2021-05-18-cloud-init-within.qcow2) if you trust that I wouldn't put malware in that image and don't want to make your own.

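
If you do end up templating, one way to follow the rules of thumb above is to not paste strings into yaml at all and serialize a data structure instead. The sketch below leans on an assumption worth testing on your target image: that cloud-init's YAML parser accepts JSON-style flow syntax after the `#cloud-config` header. The function names `render_user_data` and `render_meta_data` are made up for this example.

```python
import json


def render_user_data(username, ssh_keys, groups):
    """Build a cloud-config document by serializing data instead of
    substituting strings into a yaml template."""
    config = {
        "users": [
            {
                "name": username,
                "groups": groups,
                "sudo": ["ALL=(ALL) NOPASSWD:ALL"],
                "shell": "/bin/bash",
                "ssh-authorized-keys": ssh_keys,
            }
        ]
    }
    # json.dumps double-quotes every string, so values like "no" or "on"
    # stay strings, and indentation carries no meaning in the output.
    return "#cloud-config\n" + json.dumps(config) + "\n"


def render_meta_data(hostname, instance_id):
    """Build the meta-data document the same way."""
    return json.dumps({"local-hostname": hostname, "instance-id": instance_id}) + "\n"
```

The result is uglier to read than hand-written yaml, but every value is quoted and escaped by the serializer, which takes care of the implicit typing and indentation problems in one go. Test it against a VM you have a back door into all the same.
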

---

In the future I plan to use cloud-init _extensively_ within my [new homelab cluster](https://twitter.com/theprincessxena/status/1400592778309115905). I have plans to make a custom VM management service I'm calling [waifud](https://github.com/Xe/waifud). I will write more on that once I have written the software. I currently have a minimum viable prototype of this tool called `mkvm` that I'm using today without any issues. I also will be writing up how I built the cluster and installed NixOS on all the systems in a future article.

cloud-init is an incredible achievement. It has its warts, but the fact that it is used in so many places makes configuring virtual machines so much easier. It [even works on Windows](https://cloudbase.it/cloudbase-init/)! As much as I complain about it in this post, life would be so much worse without it. It allows me to use the magic of the cloud in my local virtual machines so I can get better use out of my hardware.