Terraforming my Droplet

No more manual config

I want to run my droplet, with my blog and all my games and silly one-off web apps, without ever touching it manually. This would allow me to destroy and rebuild it on a whim, using config as code, CI/CD, etc.

I recently messed around with user access and locked myself out of my droplet. Around the same time, my draft blog stopped updating properly. The update failure is likely a file ownership problem. The cron jobs are owned by root, but the scripts are now located elsewhere and are owned by my user. Also, one of my silly web apps, 13thagegen, is now throwing a 502 Bad Gateway, probably because I messed with the permissions on the nginx dirs.

I’m going to use terraform, ansible, and bash to get this sorted. Then hopefully I’ll never have to touch the droplet again, and the droplet itself won’t matter; I’d just spin up a new one and update the DNS.

Terraform basics

Terraform wraps many web resource provisioning APIs with a common interface that is driven by JSON or HCL (HashiCorp Configuration Language). It’s config as code, which means aspects of the web resource are described in code, and terraform takes care of interacting with the target platform and setting everything up. This way the intended state of the system is always written down, and new systems can be created with the exact same characteristics.

  • Official DigitalOcean Terraform Docs
  • applying a terraform config using a DigitalOcean token and ssh key

    terraform apply -var "do_token=${DO_TOKEN}" -var "pvt_key=$HOME/.ssh/id_rsa_do" -var "ssh_key_md5=${PUB_KEY_MD5}"
    
  • destroying an existing resource (key may not be necessary)

    terraform destroy -var "do_token=${DO_TOKEN}" -var "pvt_key=$HOME/.ssh/id_rsa_do" -var "ssh_key_md5=${PUB_KEY_MD5}"
    

Ansible basics

Ansible’s main feature is its capacity for idempotence. Idempotence just means something can be executed an unlimited number of times, and the end result will be the same.
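
To make idempotence concrete outside of ansible, here is a minimal bash sketch. The function name ensure_line is made up for this illustration; it is not something ansible provides. It appends a line to a file only if that exact line is not already there, so running it once or fifty times leaves the file in the same state.

```shell
#!/usr/bin/env bash
# ensure_line is a hypothetical helper used only to illustrate idempotence.
# It adds a line to a file unless that exact line is already present.
ensure_line() {
    local line="$1" file="$2"
    touch "$file"
    # -x matches whole lines only, -F treats the pattern as a literal string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage: both calls leave exactly one copy of the line in the file.
# ensure_line "export EDITOR=vim" "$HOME/.bashrc"
# ensure_line "export EDITOR=vim" "$HOME/.bashrc"
```

Ansible modules aim for the same behavior: describe the end state, and the module only changes something when the current state differs.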

Another key feature is jinja templating. Ansible variables can be referenced in any of ansible’s inventory, config, and task files by using jinja templating. This means only one nginx template is needed for each type of app, because the parts that change (ports, paths, subdomains) can just be swapped out for a variable. Adding a new app of an existing type is as simple as adding an entry to the content dictionary.

Running scripts

All ansible executions are handled with bash wrappers to guarantee environment stability and repeatability.

Modules used

  • apt
  • authorized_key
  • cron
  • copy
  • file
  • git
  • service
  • shell
  • template
  • uri
  • user

Building and connecting to the droplet

Terraform config

The terraform config describes what the fresh droplet should look like.

The following configuration uses a DO provider, droplet resource, file provisioner, output, DO domain resource, and DO record resource.

web.tf

# https://www.terraform.io/docs/providers/do/index.html

# Set the variable value in *.tfvars file
# or using -var="do_token=..." CLI option
variable "do_token" {}
variable "pvt_key" {}
variable "ssh_key_md5" {}

# Configure the DigitalOcean Provider
provider "digitalocean" {
  token = "${var.do_token}"
}

resource "digitalocean_droplet" "web" {
  name = "uninterested-turkey"
  region = "nyc3"
  size = "s-1vcpu-1gb"
  image = "ubuntu-18-04-x64"
  ssh_keys = ["${var.ssh_key_md5}"]

  provisioner "file" {
    source      = "install"
    destination = "/root/install/"

    connection {
      user     = "root"
      host     = "${self.ipv4_address}"
      type     = "ssh"
      private_key = "${file(var.pvt_key)}"
    }
  }
}

output "instance_ip" {
  value = "${digitalocean_droplet.web.ipv4_address}"
}

# Create a new domain record
resource "digitalocean_domain" "default" {
  name = "matthewodle.com"
  ip_address = "${digitalocean_droplet.web.ipv4_address}"
}

resource "digitalocean_record" "CNAME-subdomain" {
  domain = "${digitalocean_domain.default.name}"
  type = "CNAME"
  name = "*"
  value = "@"
}

Ubuntu was chosen solely because the original manually-configured droplet used Ubuntu.

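There are three input variables: do_token, pvt_key, and ssh_key_md5. The pvt_key variable is a path; file() reads the key at that path so the provisioner’s ssh connection can authenticate with it:

      private_key = "${file(var.pvt_key)}"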

The ssh_keys value is just the MD5 of the public key that matches the pvt_key. The MD5 can be determined with: ssh-keygen -E md5 -lf /path/to/your/ssh/key. The result is the same whether targeting the public or private key with this command. If you’re using an ssh key that is not the default name (~/.ssh/id_rsa), you may also need to add the key identity to the ssh-agent: ssh-add ~/.ssh/id_rsa_do.
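
As a rough sketch, the fingerprint can be pulled out of the ssh-keygen output and stored in the PUB_KEY_MD5 variable the scripts expect. The helper name md5_fingerprint is made up here, and I'm assuming the value wanted is the colon-separated fingerprint without the leading "MD5:" label.

```shell
#!/usr/bin/env bash
# md5_fingerprint is a hypothetical helper.
# It reads one line of `ssh-keygen -E md5 -l` output on stdin, which looks like
#   <bits> MD5:<fingerprint> <comment> (<type>)
# and prints only the fingerprint, with the "MD5:" label stripped.
md5_fingerprint() {
    awk '{ sub(/^MD5:/, "", $2); print $2 }'
}

# Usage (key path taken from the commands earlier in this post):
# export PUB_KEY_MD5=$(ssh-keygen -E md5 -lf "$HOME/.ssh/id_rsa_do" | md5_fingerprint)
```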

Terraform exposes several resource attributes during execution, and these can be referenced in the terraform config. Here I’ve used the ipv4 address attribute, host = "${self.ipv4_address}", to immediately connect to the droplet and copy over the script we’ll need to run before ansible will be able to connect to the droplet.

The file provisioner is used to copy a local script to the droplet. The command contained in this script is required for ansible to function (it installs python). This file copy could also easily be handled by an scp command, but terraform is a more ‘config as code’ approach, as it’s declarative rather than scripted. Alternatively, instead of running the script later over ssh, terraform could run it directly at droplet creation with a remote-exec provisioner.

Terraform’s output block can capture any terraform variables and export them for use by other scripts. This automatically creates a terraform_output file which contains a JSON representation of the exported variables. Later, this file is read to automatically determine the IP address and make it available to the rest of the scripts.
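
The later scripts call a get_ip.sh helper that isn't shown in this post. One possible shape for it, purely as a sketch: read whatever terraform printed (or the saved JSON) on stdin and keep the first thing that looks like an IPv4 address. The function name extract_ipv4 is an assumption for illustration, not the real script.

```shell
#!/usr/bin/env bash
# extract_ipv4 is a hypothetical helper; the real get_ip.sh is not reproduced here.
# It prints the first IPv4-looking token found on stdin, so it works on plain
# text, quoted strings, or JSON alike.
extract_ipv4() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1
}

# Usage, assuming the output is named instance_ip as in web.tf:
# terraform output instance_ip | extract_ipv4
```

Matching on the address pattern rather than on the JSON structure keeps the sketch short, at the cost of not validating that each octet is in range.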

Finally, digitalocean_domain points the domain at the new droplet’s IP, and digitalocean_record adds a wildcard CNAME, so my domain and all of its subdomains automatically get applied to the new droplet. TTL is defaulted to 1800s (30 minutes) and there does not appear to be an option to change this value. Since this is a personal droplet, the downtime is perfectly fine.

Ansible plays

The rest of the heavy lifting is done by ansible.

Ansible does several things for my droplet:

  • sets up a non-root user account, including necessary keys and passwords to connect to the droplet, and to clone from gitlab
  • installs system packages with apt and uri
    • hugo for compiling static blog files
    • nginx for serving apps and web pages
    • pipenv for running flask apps
  • copies content and sets up update scripts and cron jobs for that content

Here’s a play that verifies ansible can be used to connect to the droplet, ensuring the proper keys and packages are in place to run user configuration, utility setup, and application deploy plays.

---
- hosts: all
  remote_user: "{{ ssh_user }}"
  tasks:
    - name: testing ansible
      debug:
        msg: "hello from ansible"

The commands

The various commands are packaged into shell scripts, and those scripts are invoked through a wrapper script to ensure the droplet is set up the same way every time.

The organizational strategy is:

  • terraform - destroy the droplet
  • terraform - create the droplet
  • ssh - test the connection to ensure access is correctly configured
  • ssh - install base packages, in this case just python (thanks for not including it, Ubuntu); this is the last time ssh is used by itself, because system changes beyond initial creation and connection are all handled by ansible
  • ansible - verify the ansible connection (anything beyond this point is 100% ansible)
  • ansible - set up user access
  • ansible - deploy applications and content

recreate.sh

# exit when any command fails
set -e

source load_env.sh                   # loads and validates the required env vars

source run_terraform.sh destroy      # destroys the droplet
source run_terraform.sh apply        # creates the droplet
source test_connection.sh            # tests the ssh connection
source install.sh                    # installs initial system packages
source test_ansible.sh               # tests the ansible connection
source configure_access.sh           # sets up a non-root user on the droplet
source deploy.sh                     # deploys content (with cron update scripts) to the droplet
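
load_env.sh isn't shown here, but the "validates the required env vars" half could look something like the sketch below. The function name require_env is made up for this example; the variable names in the usage comment are the ones the scripts in this post reference.

```shell
#!/usr/bin/env bash
# require_env is a hypothetical helper.
# It returns non-zero, and names each culprit on stderr, if any of the listed
# variables is unset or empty.
require_env() {
    local name value missing=0
    for name in "$@"; do
        # look up the variable whose name is stored in $name
        # (the names are hard-coded by the caller, so eval is safe here)
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required env var: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
# require_env DO_TOKEN PUB_KEY_MD5 ROOT_USER REMOTE_USER REMOTE_PASS || exit 1
```

Checking every variable before returning means a single run reports everything that is missing, instead of failing one variable at a time.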

run_terraform.sh

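# validate the requested action; ${1:-} keeps the test from breaking when no argument is given
action="${1:-}"
if [ "$action" != "plan" ] && [ "$action" != "apply" ] && [ "$action" != "destroy" ]; then
    echo "expected one of: ['plan', 'apply', 'destroy'], got '$action'"
    exit 1
fi

source load_env.sh

key_path="$HOME/.ssh/id_rsa_do"

if [ ! -f "$key_path" ]; then
    echo "ssh key does not exist: $key_path"
    exit 1
fi

# plan does not accept -auto-approve; only apply and destroy prompt for confirmation
approve=""
if [ "$action" != "plan" ]; then
    approve="-auto-approve"
fi

# $approve is intentionally unquoted so it disappears when empty
terraform "$action" $approve -var "do_token=${DO_TOKEN}" -var "pvt_key=$key_path" -var "ssh_key_md5=${PUB_KEY_MD5}"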

test_connection.sh

source load_env.sh

IP_ADDR=$(source get_ip.sh)

echo testing connection to $IP_ADDR

ssh $ROOT_USER@$IP_ADDR -oStrictHostKeyChecking=no ls -al install

install.sh

# only include dependencies needed to run ansible
# everything else should be handled by ansible itself

# run on client workstation

IP_ADDR=$(source get_ip.sh)
echo connecting to $IP_ADDR

install_script=install/python-install.sh
target_user=root

echo installing dependencies:
ssh $target_user@$IP_ADDR -oStrictHostKeyChecking=no cat $install_script
ssh $target_user@$IP_ADDR -oStrictHostKeyChecking=no chmod u+x $install_script
ssh $target_user@$IP_ADDR -oStrictHostKeyChecking=no source $install_script

test_ansible.sh

echo testing ansible connection

source load_env.sh

ansible-playbook ansible/hello-world.yml -i $(source get_ip.sh), -e ssh_user=$ROOT_USER

Adding a non-root user

Generally, a remote host should be accessed with a non-root user, so that a typo or a leaked key can’t do root-level damage without an explicit sudo. This ansible play will add the user, with a password that is read from an environment variable and hashed.

configure_access.sh

echo configuring users

source load_env.sh

ansible-playbook ansible/configure_access.yml -i $(source get_ip.sh), -e ssh_user=$ROOT_USER

The password will need to be stored locally; the task uses an environment variable, but there are other approaches. The shell is set to bash because the default with Ubuntu is sh and nobody wants that on purpose. The UID can be anything over 1000.

    - name: add remote user
      user:
        name: "{{ target_username }}"
        password: "{{ lookup('env', 'REMOTE_PASS') | password_hash('sha512') }}"
        shell: /bin/bash
        group: admin
        uid: 1079

configure_access.yml

---
- hosts: all
  remote_user: "{{ ssh_user }}"
  vars:
    gitlab_private_key: id_rsa_gitlab
    do_public_key: id_rsa_do.pub
  tasks:
    - name: testing ansible
      debug:
        msg: "hello from ansible"

    - name: get current user
      set_fact:
        current_user: "{{ lookup('env', 'USER') }}"
    - name: set local private gitlab rsa key path
      set_fact:
        source_gitlab_key_path: "/home/{{ current_user }}/.ssh/{{ gitlab_private_key }}"

    - name: set target username
      set_fact:
        target_username: "{{ lookup('env', 'REMOTE_USER') }}"
    - name: set remote user home path
      set_fact:
        remote_user_home: "/home/{{ target_username }}"
    - name: set remote private gitlab rsa key path
      set_fact:
        dest_private_gitlab_key_path: "{{ remote_user_home }}/.ssh"
    - name: set local public do rsa key path
      set_fact:
        source_public_do_key_path: "/home/{{ current_user }}/.ssh/{{ do_public_key }}"

    - name: add remote user
      user:
        name: "{{ target_username }}"
        password: "{{ lookup('env', 'REMOTE_PASS') | password_hash('sha512') }}"
        shell: /bin/bash
        group: admin
        uid: 1079

    - name: ensure remote .ssh directory exists
      file:
        dest: "{{ remote_user_home }}/.ssh"
        mode: 0700
        owner: "{{ target_username }}"
        state: directory

    - name: add public do key to remote user's authorized keys
      authorized_key:
        user: "{{ target_username }}"
        state: present
        key: "{{ lookup('file', source_public_do_key_path) }}"

    - name: add private gitlab key to remote user's ssh dir
      copy:
        src: "{{ source_gitlab_key_path }}"
        dest: "{{ remote_user_home }}/.ssh/id_rsa"
        owner: "{{ target_username }}"
        mode: 0600
    - name: add public gitlab key to remote user's ssh dir
      copy:
        src: "{{ source_gitlab_key_path }}.pub"
        dest: "{{ remote_user_home }}/.ssh/id_rsa.pub"
        owner: "{{ target_username }}"
        mode: 0600

This should allow passwordless connection via ssh, elevation to sudo using the provided REMOTE_PASS, and cloning from gitlab.

Serving content

The necessary components to be able to serve content are nginx, pipenv, hugo, and the content files.

deploy.sh invokes the deploy.yml ansible play, which is set up as a role. Roles make it easy to manage variables and templates, since ansible assumes anything inside the roles/$target_role directory belongs to that role, and will look for directories named defaults, templates, and tasks (among others). Ansible will load main.yml from within directories like defaults and tasks if found; templates are looked up by name.
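
For reference, here is a small sketch that lays out the skeleton of a role using ansible's standard defaults, tasks, and templates directories. The helper name make_role is made up; the role name deploy matches the roles/deploy/defaults/main.yml path used later in this post.

```shell
#!/usr/bin/env bash
# make_role is a hypothetical helper that creates an empty role skeleton
# under ./roles/<name>.
make_role() {
    local role_dir="roles/$1" sub
    for sub in defaults tasks templates; do
        mkdir -p "$role_dir/$sub"
    done
    # defaults/main.yml holds the role's default variables,
    # tasks/main.yml is the entry point for the role's tasks
    touch "$role_dir/defaults/main.yml" "$role_dir/tasks/main.yml"
}

# Usage:
# make_role deploy
# resulting layout:
#   roles/deploy/defaults/main.yml
#   roles/deploy/tasks/main.yml
#   roles/deploy/templates/
```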

Configuring nginx

Nginx config is set up by ansible.

Installation is done with apt. The default site is removed (file: state=absent) and the default nginx.conf is updated to a custom configuration (template). Then the service is restarted. If the restart fails, it means there’s a problem with the config, and the play will stop.

nginx.yml

- name: install nginx
  apt:
    name: nginx
    state: latest
  become: yes

- name: remove default from sites-enabled
  file:
    path: "{{ nginx_sites_enabled }}/default"
    state: absent
  become: yes

- name: update nginx.conf
  template:
    src: nginx.conf.j2
    dest: "{{ nginx_home }}/nginx.conf"
  become: yes

- name: restart nginx
  service:
    name: nginx
    state: restarted
  become: yes

Installing Hugo

Hugo is a markdown to static file compiler. It allows me to write blog posts in markdown, then at deploy time compiles those files into shiny HTML/CSS.

hugo.yml

- name: install hugo
  apt:
    name: hugo
    state: latest
  become: yes

The content play deploys the files nginx will serve and sets up sites-available and sites-enabled.

There are 4 main types of sites served:

  • (many) static html files served from subdomains
  • (many) flask applications
  • (unique) landing page (no subdomain)
  • (unique) a catch-all that redirects to the landing page

Installing Pipenv

I’m a bit ashamed of this play. It downloads a script and runs it (scary) to install pipenv. I’ll address this later.

Pipenv is used to run python applications. I’ve used it to run my flask apps. The key feature of pipenv is that it uses actual shell sessions with updated paths instead of faking the paths in the current session the way virtualenv does. Whether this is better I still don’t know. I do know I’m annoyed every time deactivate removes the (venv is active) indicator but non-obviously leaves the pipenv shell session open. That session functions just fine, until the next time pipenv shell is run and fails with a cryptic error.

Still on the fence about pipenv.

pipenv.yml

# needed by pipenv
- name: install python3-distutils
  apt:
    name: python3-distutils
    state: latest
  become: yes

# gross
- name: download get-pipenv.py
  uri:
    url: https://raw.githubusercontent.com/kennethreitz/pipenv/master/get-pipenv.py
    dest: /tmp/get-pipenv.py
  become: yes

# gross
- name: execute get-pipenv.py
  shell:
    cmd: python /tmp/get-pipenv.py
  become: yes

Updating apps and blog content automatically

The first pass at automating my droplet was to get my blog content to update automatically when content was pushed to the blog repo. A bash script with some directory syncing trickery and some trap commands to handle rollbacks was used. It was converted and enhanced from a sample on the web. A no-op check was added so that when there were no git repo changes, unnecessary file operations would be avoided. There’s also a hack-job of a logging system that dumps any messages to a maint.log file, which truncates itself when it gets too long.
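
The real script isn't reproduced here, but two of the ideas above, the no-op check and the self-truncating log, can be sketched in a few lines of bash. The function names log_msg and needs_update and the 500-line default are assumptions made for this illustration; the trap-based rollback is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not the author's actual update script.

# Append a timestamped message to a log file, then trim the file so it keeps
# only the newest $3 lines (500 if not given).
log_msg() {
    local file="$1" msg="$2" max="${3:-500}" tmp
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$msg" >> "$file"
    # grep -c '' counts lines without the padding some wc implementations add
    if [ "$(grep -c '' "$file")" -gt "$max" ]; then
        tmp="$(mktemp)"
        tail -n "$max" "$file" > "$tmp" && mv "$tmp" "$file"
    fi
}

# Succeed only when the local and remote commit hashes differ.
needs_update() {
    [ "$1" != "$2" ]
}

# Usage inside a cloned repo (git commands shown for context only):
# git fetch --quiet
# if needs_update "$(git rev-parse HEAD)" "$(git rev-parse '@{u}')"; then
#     log_msg maint.log "changes found, updating"
# else
#     log_msg maint.log "no changes, skipping"
# fi
```

Comparing commit hashes up front is what makes the cron job cheap: most five-minute runs do one fetch and exit.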

It would be possible to handle the same content management approach using ansible, but since I don’t have a deploy server, I’d have to run ansible on the droplet itself. This would be fine, but bash works just as well. The script was adapted for all types of content. This script is called by scripts that have custom logic for the various types of content (hugo compile, pipenv install, etc.). Here’s the script.

To deploy the update script, ansible uses templating. A cron job is added (also ansible) to run it on a timer.

- name: add common update script
  template:
    src: "update_scripts/common.sh.j2"
    dest: "{{ automation_path }}/common.sh"
    mode: u=rwx,g=r,o=r

- name: add update scripts
  template:
    src: "{{ item.value.update_script }}"
    dest: "{{ automation_path }}/{{ item.key }}.sh"
    mode: u=rwx,g=r,o=r
  loop: "{{ lookup('dict', content) }}"
  when: item.value.update_script is defined

- name: add crontab entries
  cron:
    name: "update {{ item.key }}"
    minute: "*/5"
    job: "{{ automation_path }}/{{ item.key }}.sh"
  loop: "{{ lookup('dict', content) }}"
  when: item.value.update_script is defined

Finally, ansible drops the app configs for nginx (more templates) into sites-available, symlinks them to sites-enabled, and restarts nginx.

- name: add conf files to sites-available
  template:
    src: "{{ item.value.nginx_template | default('nginx_server_subdomain.j2') }}"
    dest: "{{ nginx_sites_available }}/{{ item.key }}"
  become: yes
  loop: "{{ lookup('dict', content) }}"

- name: symlink conf files to sites-enabled
  file:
    src: "{{ nginx_sites_available }}/{{ item.key }}"
    dest: "{{ nginx_sites_enabled }}/{{ item.key }}"
    state: link
  become: yes
  loop: "{{ lookup('dict', content) }}"

- name: restart nginx
  service:
    name: nginx
    state: restarted
  become: yes

Adding new content

It’s really easy to add new content with this setup.

For example, a new flask app just requires this to be added to roles/deploy/defaults/main.yml in the content dictionary:

  baseflask:
    test_port: 82
    app_run_port: 8002
    subdomain: testflask
    repo: git@gitlab.com:modle13/base-flask.git
    update_script: update_scripts/flask_app.sh.j2
    nginx_template: nginx_server_flask_app.j2

The only things that vary between flask apps are the name, ports, subdomain, and repo. Each one uses the exact same templates for updates and nginx as the other flask apps.

Here’s the template for a flask app:

server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:{{ item.value.app_run_port }};
    }

    server_name {{ item.value.subdomain }}.{{ domain }} www.{{ item.value.subdomain }}.{{ domain }};
}
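
To see what this template turns into for the baseflask entry, here is a rough stand-in that does the substitution with sed. This is not how ansible renders templates (jinja does that for real), and the function name render_flask_conf and the example.com domain are placeholders for illustration.

```shell
#!/usr/bin/env bash
# render_flask_conf is a hypothetical helper that mimics the jinja rendering
# with plain sed substitutions, only to show the expanded result.
# Arguments: app_run_port, subdomain, domain
render_flask_conf() {
    local port="$1" subdomain="$2" domain="$3"
    sed -e "s/{{ item.value.app_run_port }}/$port/g" \
        -e "s/{{ item.value.subdomain }}/$subdomain/g" \
        -e "s/{{ domain }}/$domain/g" <<'EOF'
server {
    listen 80;

    location / {
        proxy_pass http://127.0.0.1:{{ item.value.app_run_port }};
    }

    server_name {{ item.value.subdomain }}.{{ domain }} www.{{ item.value.subdomain }}.{{ domain }};
}
EOF
}

# Usage with the values from the baseflask entry and a placeholder domain:
# render_flask_conf 8002 testflask example.com
```

With those arguments the proxy_pass line points at port 8002 and server_name becomes testflask.example.com plus its www variant.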

The server_name entries could be converted into a list in the config, which would allow the other three types (static, landing, and default pages) to be combined into one template.

Static content is added the same way. Most of my static content is HTML5 games and hugo blogs. The only things that vary between entries are the name, port, subdomain, repo, and update script. Their config looks like this:

  centipede:
    test_port: 9200
    subdomain: centipede
    update_script: update_scripts/html5canvas.sh.j2
    repo: git@gitlab.com:taciturn-pachyderm/centipede.git
  blog:
    test_port: 9077
    subdomain: blog
    repo: git@gitlab.com:modle13/blog-dot-matthewodle-dot-com-hugo.git
    update_script: update_scripts/blog.sh.j2

This is the static content nginx template:

server {
	listen 80;

	root {{ content_path }}/{{ item.key }}/public_html;

	index index.html index.htm;

	server_name {{ item.value.subdomain }}.{{ domain }} www.{{ item.value.subdomain }}.{{ domain }};
}