No-ops Linux part 1: Automation, security and essentials
In Running containers on no-ops linux in 2025 I wrote about moving my hobby projects to a European cloud provider. I did an initial, manual setup in Hetzner, which I've now automated. This weekend, I tested the setup. It takes me a few minutes now to get everything moved to a new host, and most of that time goes to DNS. I've got a reproducible setup: I can quickly provision a machine locally or in any cloud that offers Ubuntu 24.04. Reproducible infrastructure is ✨liberating✨
The goal is to document how to make a fire-and-forget Linux server that will mostly take care of its own ops, with an acceptable uptime and security level for my hobby projects. Once we're done, the server will:
- Update its packages, and reboot during nighttime, when required by updates.
- Run a caddy proxy on the host system, ensuring everything is served over TLS with certificates issued by letsencrypt.
- Run several persistent rootless containers using podman, with a setup that brings them up on boot.
- Run a nightly job that updates some data and performs a rolling restart of some containers. The job runs in a container too. The data lives in S3-compatible object storage.
- Take care of automatically updating the containers.
- Send us 📨email notifications if the containers aren't working right, via some monitoring we'll also set up.
This covers a lot of ground, and some of it is quite detailed. If you'd like to skip this incredibly long and technical read, you could go and investigate the source code over at github instead. It has been my goal to make something that other people can easily adapt to their own use. This description of how I made it might be a helpful guide to see which parts you should change or tinker with.
This post got so long that I broke it into three parts:
- This post covers local development of linux server configuration and essentials.
- The next post covers installation of podman and caddy. It concludes by deploying a very simple stateless webapp in a container.
- The final post covers a more challenging deployment with jobs and rolling restarts, and discusses the strengths and weaknesses of this approach to hosting.
Configuring Linux machines
For the longest time, I had hoped to do this with cloud-init. Almost every cloud provider lets you provide a user-data script when you order a machine, and by the time you get access to the machine, it's already configured (or very nearly done).
With user-data you can make user accounts, add public keys to them, install software and do all sorts of useful things, but you shouldn't put secrets in it. I played with the idea of just setting up a new VM with cloud-init every time I needed a config change and replacing the old one. I like that idea a lot. I still want to do it. But I didn't find any good way of injecting the secrets that I require without manual intervention or reaching for a tool other than cloud-init.
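To make that concrete, a user-data script is typically a cloud-config document along these lines. This is only an illustrative sketch, and note that there's no sensible place for a secret in it; the user name and key below are placeholders, not part of my setup:
#cloud-config
# Hypothetical example: create a sudo-capable user and install a few packages.
users:
  - name: admin
    groups: sudo
    shell: /bin/bash
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...placeholder-public-key...
package_update: true
packages:
  - fail2ban
  - ufw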
For that reason I decided to settle for Ansible to configure the VMs, using ansible-vault to handle my secrets. There seemed little point in using cloud-init after that.
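If ansible-vault is new to you, the workflow is roughly this; the file layout here is illustrative rather than the exact structure of my repository:
# Create an encrypted variable file, and edit it again later
ansible-vault create group_vars/all/vault.yml
ansible-vault edit group_vars/all/vault.yml

# Run a playbook that uses the vaulted variables, supplying the vault
# password interactively or from a file kept outside the repository
ansible-playbook -i inventory.ini initialize.yml --ask-vault-pass
ansible-playbook -i inventory.ini initialize.yml --vault-password-file ~/.vault-pass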
Both cloud-init and Ansible run counter to my goal of using as little YAML as possible, so that was deeply unsatisfying. But it works great! The last 8–12 years have given me a huge tolerance for YAML anyway. Surely, I'll survive? 😵
Developing Ansible code
I had already decided to use Ubuntu 24.04 with unattended upgrades and unattended reboots. I wanted to develop locally, only ordering a server in a cloud when I was fairly confident I had a secure setup. The best way I know how to do that is to use Vagrant. This tool can manage virtual machines locally, applying tons of different configuration management tools to configure them. Crucially, it supports ansible. After installing VirtualBox I found a suitable Vagrant box and ran:
vagrant init bento/ubuntu-24.04
This produces a Vagrantfile in the current directory. I edited mine to look like this:
Vagrant.configure("2") do |config|
config.vm.box = "bento/ubuntu-24.04"
config.vm.provision "ansible" do |ansible|
ansible.playbook = "initialize.yml"
end
end
I put the following in initialize.yml, which is an ansible playbook:
---
- name: Initialize Ubuntu host
hosts: all
become: true
tasks:
- name: Upgrade apt packages
apt:
update_cache: yes
upgrade: yes
- name: Ensure base packages
apt:
name:
- vim
- curl
- git
- unattended-upgrades
- btop
- iotop
- nethogs
- emacs-nox
- strace
- gnupg
- lsb-release
- systemd-container
- nmap
- fail2ban
- ufw
state:
present
Here's a breakdown of the packages I put there:
- unattended-upgrades ensures that packages are upgraded without me doing anything on the machine. This represents a leap of faith that package upgrades won't break anything. We will be taking some precautions to make sure we discover breakage, later.
- strace, nmap, nethogs, btop and iotop are tools that I can use to investigate the state of affairs on the machine. When I ssh to a machine to debug it, I usually assume that I have these available. curl is also invaluable to check proxy configuration.
- vim, git and emacs-nox are for interacting with source code or configuration on the machine. I don't anticipate using these, but I'd rather not have to install them manually later. I put both vim and emacs there because I was feeling a little impulsive and wild. No sane person would use vim, of course. 🤪 I do sometimes, but please don't tell any of my friends.
- fail2ban and ufw are security measures I want to use. fail2ban can be used to ban IPs that try to mess with the machine, like bots that attempt to brute-force login to ssh. ufw is short for uncomplicated firewall, and that's what it is. It's a stateful firewall that's straightforward to use.
- We need systemd-container to obtain machinectl, which is useful for letting ansible control user-specific systemd settings without dealing with D-Bus ourselves. There's a small sketch of what that looks like right after this list.
- lsb-release and gnupg are useful if we want to add more apt sources, to install non-standard packages.
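Here's the kind of task machinectl enables, roughly speaking: managing a user-level systemd unit for another account from ansible, by way of the machinectl become plugin in the community.general collection. This is a hypothetical sketch with made-up unit and user names, just to show the shape of the thing; it becomes relevant once rootless containers run under user-level systemd.
- name: Enable a user-level service for the app user
  systemd:
    name: example-app.service    # hypothetical user unit
    scope: user
    enabled: yes
  become: true
  become_user: app
  become_method: community.general.machinectl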
After this, we can bring up a virtual machine with:
vagrant up
It'll work for a while, then tell us the machine is configured. If we want to start over, we can use:
vagrant destroy -f; vagrant up
To reapply the playbook, we can use:
vagrant provision
To access an ssh-session, we can use:
vagrant ssh # logs in with the vagrant@ user, who has passwordless sudo
Now we have a nice and short feedback loop to iterate on the ansible configuration! No cloud required. Yet.
Ansible best practices
There are some best practices we can choose to follow. Usually, that's a good idea. Since this is my hobby project, I will sometimes ignore some of them and undoubtedly regret it later. I simply haven't worked with Ansible for almost 10 years, so I'm probably very outdated.
But there's one best practice we'll try to follow closely: Keep the playbook lean, put most of the things into "roles." An ansible role is like a module. It can accept parameters and do many things. This gives us a way to assemble different playbooks out of different roles later, without doing so much work. So let's introduce some more structure:
mkdir -p roles/{caddy,base-install,app-user,podman}/{tasks,files,templates,handlers}
find roles
roles
roles/podman
roles/podman/tasks
roles/podman/files
roles/podman/templates
roles/podman/handlers
roles/base-install
roles/base-install/tasks
roles/base-install/files
roles/base-install/templates
roles/base-install/handlers
roles/app-user
roles/app-user/tasks
roles/app-user/files
roles/app-user/templates
roles/app-user/handlers
roles/caddy
roles/caddy/tasks
roles/caddy/files
roles/caddy/templates
roles/caddy/handlers
Why would you make so many folders??
An ansible role is a directory that can consist of many parts. The most important ones are:
- tasks/main.yml defines (ideally idempotent) tasks that the role must run
- templates contains jinja2 templates that can be used to render configuration in tasks
- files can contain stuff that we want the tasks to copy
- handlers/main.yml defines handlers that can react to tasks that have run; for example to reload some service if the configuration changed (there's a sketch of this interplay right after the list)
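To make the shape of a role concrete, here's a hypothetical one (none of these file or service names exist in this setup) where a task renders a template and notifies a handler, so the service only restarts when the file actually changed:
# roles/example/tasks/main.yml
- name: Render example configuration from a template
  template:
    src: example.conf.j2       # lives in roles/example/templates/
    dest: /etc/example.conf
  notify: Restart example      # must match the handler name exactly

# roles/example/handlers/main.yml
- name: Restart example
  service:
    name: example
    state: restarted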
Let's build out the roles one at a time. First, we'll edit initialize.yml so that it contains only this:
---
- name: Initialize Ubuntu host
hosts: all
become: true
roles:
- base-install
We're moving the base packages to the base-install role, in roles/base-install/tasks/main.yml:
---
- name: Upgrade apt packages
apt:
update_cache: yes
upgrade: yes
- name: Ensure base packages
apt:
name:
- vim
- curl
- git
- unattended-upgrades
- btop
- iotop
- nethogs
- emacs-nox
- strace
- gnupg
- lsb-release
- systemd-container
- nmap
- fail2ban
- ufw
state:
present
We can use vagrant provision to check it. There shouldn't be any changes.
Securing base-install
I want the base-install role to take care of a few more things:
- Hardening sshd, so that it's more challenging to break into the machine that way
- Creating an admin user so that we can disable ssh-login for root
- Configuring fail2ban and ufw
These are things I'd like to set up on every machine, even if I end up with more machines eventually. Let's get to work.
Handlers
First, we'll make some handlers that we're going to want to notify. We can put these into roles/base-install/handlers/main.yml:
---
- name: Reload unattended upgrades
service:
name: unattended-upgrades
state: restarted
- name: Reload sshd
service:
name: ssh
state: reloaded
- name: Reload systemd
systemd:
daemon_reload: yes
- name: Restart fail2ban
ansible.builtin.service:
name: fail2ban
state: restarted
unattended-upgrades
We will want to add more tasks to roles/base-install/tasks/main.yml. Let's start small by setting the hostname and timezone, and by enabling unattended upgrades. We'll add these tasks:
- name: Set hostname to inventory hostname
hostname:
name: "{{ inventory_hostname }}"
- name: Set timezone to Europe/Oslo
community.general.timezone:
name: Europe/Oslo
- name: Enable unattended upgrades
service:
name: unattended-upgrades
enabled: yes
notify: Reload unattended upgrades
- name: Configure unattended-upgrades
copy:
dest: /etc/apt/apt.conf.d/99unattended-upgrades-custom
content: |
Unattended-Upgrade::Automatic-Reboot "true";
Unattended-Upgrade::Automatic-Reboot-WithUsers "true";
Unattended-Upgrade::Automatic-Reboot-Time "02:00";
Unattended-Upgrade::SyslogEnable "true";
notify: Reload unattended upgrades
We'll use vagrant provision to test that this does the expected thing. At this point, we could let the machine run for a few days and verify that it reboots when there's a new security update, but we're not going to do that. There's still more work to do.
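That said, we don't have to wait completely blind. A couple of quick checks from the host against the vagrant box show that the configuration was picked up and that a dry run would do something; this doesn't prove the nightly reboot works, only that the wiring is in place:
vagrant ssh -c "apt-config dump | grep -i 'unattended-upgrade::automatic-reboot'"
vagrant ssh -c "sudo unattended-upgrade --dry-run --debug"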
Securing sshd
Let's proceed to secure sshd a little more by adding new tasks to roles/base-install/tasks/main.yml again. I would like to block ssh-login with the root account, but before we can do that, we need to have an alternate account. Vagrant has a dedicated vagrant user it can use, but we won't be so lucky when we're running ansible standalone against a cloud server, so we need to take care of this right away. Let's set it up:
- name: Create admin
user:
name: admin
state: present
shell: /bin/bash
createhome: yes
groups: sudo
append: yes
- name: Allow passwordless sudo for admin
copy:
dest: /etc/sudoers.d/admin
content: "admin ALL=(ALL) NOPASSWD:ALL"
mode: "0440"
- name: Configure authorized_keys for admin
with_items: "{{ authorized_keys }}"
authorized_key:
user: admin
key: "{{ item }}"
state: present
Running vagrant provision now fails:
fatal: [default]: FAILED! => {"msg": "'authorized_keys' is undefined"}
That's because we didn't pass in any authorized_keys variable, but we use one in this role. Back to initialize.yml at the root:
---
- name: Initialize Ubuntu host
hosts: all
become: true
roles:
- name: base-install
vars:
authorized_keys:
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIE9K1p8B8FRCWJ0Ax4obDu+UsLzGgXDIdTYkCZ8FF54b
💡This changed the type of the list element in roles to an object, instead of a string. That's because we need to pass vars to it now.
This is how we can pass a variable into a role. This isn't the best place to put this key, but we'll find a better place later. vagrant provision now creates the admin user. You'll probably want to use a different public key unless you want me to have access to your vagrant box. We can check ssh access by running:
# 2222 is the default port vagrant uses to communicate with the box
ssh -p 2222 admin@localhost whoami
admin
Fantastic. It's time to configure ssh. Let's add this to roles/base-install/tasks/main.yml:
- name: Require key for ssh-login
copy:
dest: /etc/ssh/sshd_config.d/harden.conf
content: |
PermitEmptyPasswords no
LoginGraceTime 30s
PasswordAuthentication no
MaxAuthTries 3
MaxSessions 3
PermitRootLogin no
notify: Reload sshd
We put this configuration into /etc/ssh/sshd_config.d/harden.conf, because the standard sshd_config automatically loads all snippets it finds in there. That way, we don't have to worry about package upgrades overwriting our configuration. This disables authenticating with passwords, and blocks attempts to log directly into the root account.
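Before logging out, it's worth confirming that sshd actually accepted the snippet. sshd can validate its own configuration and print the effective values it ended up with, so something like this (run against the box) should do:
vagrant ssh -c "sudo sshd -t"    # no output and exit code 0 means the config is valid
vagrant ssh -c "sudo sshd -T | grep -Ei 'permitrootlogin|passwordauthentication|maxauthtries'"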
fail2ban and ufw
It's time to get fail2ban up! This package scans the server logs and blocks IPs that attempt to break in. This reduces the amount of noisy logs on the server, so it's a good idea to set it up even if you're not worried about break-in attempts actually succeeding.
Here's what we'll add to roles/base-install/tasks/main.yml:
- name: Create Fail2Ban jail.local configuration
copy:
dest: /etc/fail2ban/jail.local
content: |
[DEFAULT]
bantime = 1h
# An IP is banned if it has generated "maxretry" during the last "findtime"
findtime = 10m
maxretry = 5
[sshd]
enabled = true
port = ssh
maxretry = 3
bantime = 24h
owner: root
group: root
mode: '0644'
notify: Restart fail2ban
- name: Enable fail2ban
service:
name: fail2ban
enabled: yes
Once again, we can run vagrant provision to verify that everything looks okay. This fail2ban config will block IPs that attempt to break into sshd for 24 hours. At this point, we can lock ourselves out of our server very easily if we use the wrong username or ssh-key. So let's not do that.
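To see that the jail is actually running (and, later on, which IPs it has banned), we can ask the client that ships with fail2ban:
vagrant ssh -c "sudo fail2ban-client status sshd"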
Let's get uncomplicated firewall up with some sane defaults by adding even more YAML to roles/base-install/tasks/main.yml:
- name: Set UFW default policies
community.general.ufw:
default: deny
direction: incoming
- name: Set UFW default outgoing policy
community.general.ufw:
default: allow
direction: outgoing
- name: Allow SSH on port 22/tcp (standard port)
community.general.ufw:
rule: allow
port: '22'
proto: tcp
comment: 'Allow SSH access'
- name: Allow HTTP on port 80/tcp (standard port)
community.general.ufw:
rule: allow
port: '80'
proto: tcp
comment: 'Allow HTTP access'
- name: Allow HTTPS on port 443/tcp (standard port)
community.general.ufw:
rule: allow
port: '443'
proto: tcp
comment: 'Allow HTTPS access'
- name: Enable UFW
community.general.ufw:
state: enabled
This blocks all incoming ports other than ssh, http and https. Since ufw is a stateful firewall, it'll still accept the reply traffic for outgoing connections like NTP and DNS lookups, so it shouldn't interfere with anything.
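And to confirm the firewall ended up the way we intended:
vagrant ssh -c "sudo ufw status verbose"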
Wrapping up base-install
At this point I would feel comfortable deploying this onto a server in the cloud. It's very unlikely that anyone could succeed in breaking in, especially because there's almost no attack surface. We can't use this server for much other than reading manpages! In the next part, we'll make it a little more capable by setting up podman, a proxy and a simple container deployment.