Security is of utmost importance in any system or platform that we build. We build and deploy our systems. Mission accomplished, high-five! Then comes day-2 operations. We need to keep our systems afloat, and we need to make sure they do not derail into insecure, neglected messes.
When we use a public cloud provider, we might opt in to their platform service offerings. I am talking about what is known as platform-as-a-service (PaaS). We might create serverless functions with AWS Lambda or Azure Functions. We might use an application platform such as AWS Elastic Beanstalk or Azure App Service. When we do this, we implicitly trust the cloud provider to take care of many of the security aspects of the underlying platform. However, we are still responsible for the security of the application code that we write.
Although these platform services are useful in the right context, they are not always the right choice. Architecture decisions have a lot to do with context. There are still many use cases where traditional virtual machines are a valid choice. Now I am instead talking about infrastructure-as-a-service (IaaS). If you host virtual machines in a public cloud, you have a lot more responsibility when it comes to security compared to when you deploy to a PaaS. Read any of the major cloud providers' shared responsibility models to find out more about this.
Background#
In this post I will use Ansible, Terraform, and GitHub Actions to set up patch management for virtual machines in AWS.
- Terraform will be used to create a few virtual machines with different flavors of Linux
- Ansible will be used to update packages on each virtual machine
- GitHub Actions will be used to periodically trigger Ansible
Ansible#
Ansible is a tool that could be placed somewhere between infrastructure-as-code and configuration-as-code. You give Ansible a list of your host machines, with connection details (hostname, username/password, SSH key, etc.), and instructions for what to do in the form of Ansible playbooks. Ansible then connects to your host machines and performs the tasks that are specified in the playbooks. Ansible is an agentless tool, so you do not need to install Ansible on your host machines; you only need it on the single machine from which the playbooks are executed.
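As a tiny sketch of what this looks like in practice (the group name and hostname below are made up for illustration), an inventory file plus an ad-hoc connectivity check could be as simple as:
# inventory (illustrative only)
[webservers]
web1.example.com ansible_user=ubuntu

$ ansible webservers -i inventory -m ping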
Terraform#
Terraform is a tool for infrastructure-as-code. Terraform can work with anything that exposes an API. Can you configure your lawnmower using an API? Then you could use Terraform to do just that. More commonly you would use Terraform to set up resources in public clouds, such as AWS. Terraform uses providers, which are abstractions on top of a given API you want to work with. AWS has a Terraform provider that allows you to easily create resources in AWS. Apart from creating resources in AWS, I will use Terraform to create a local file as well.
GitHub Actions#
GitHub Actions is a platform for building workflows consisting of jobs that perform certain tasks. Commonly you would build workflows for continuous integration and continuous delivery (CI/CD) of your source code. A workflow can be triggered by certain events, for instance when you push code to your main branch. In this post we will use a schedule trigger that runs at times we specify.
Let’s get technical#
To achieve what I want to achieve I have to go through a number of steps. There are a few tools you must install and configure to follow along with this guide: the aws CLI, terraform, the GitHub CLI gh, and git. Details on how to install these tools are not provided here. I am also assuming you use Linux or macOS.
Setting up a new GitHub repository#
I begin by creating a new directory called linux-patch-management, initializing a new git repository, and adding a README.md file:
$ mkdir linux-patch-management && cd linux-patch-management
$ git init
$ echo "# Linux Patch Management" > README.md
$ git add . && git commit -m "Initial commit"
Next I use the GitHub CLI to initialize a new GitHub repository from my local repository:
$ gh repo create linux-patch-management \
--private \
--remote origin \
--description "Linux patch management demo" \
--source . \
--push
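If you want to double-check that everything was wired up correctly, git remote -v lists the new remote, and gh repo view --web opens the repository in a browser:
$ git remote -v
$ gh repo view --web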
Creating an SSH key-pair#
To allow Ansible to connect to my virtual machines I will create an SSH key-pair and add the public key to each virtual machine I create. I use ssh-keygen to generate a new key-pair:
$ mkdir -p terraform/cert
$ ssh-keygen -b 4096 -t rsa -f ./terraform/cert/terraform_ec2_key -q -N ""
To avoid accidentally committing the private key to git I add a .gitignore file in the root of my repository:
# .gitignore
terraform/cert
Setting up infrastructure in AWS with Terraform#
Now it is time to create a few virtual machines. I have already created a directory named terraform in the root of my repository. In this directory I will add a few files to define all the resources I need. I start with main.tf:
# main.tf
terraform {
  required_version = "~> 1.3.1"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "4.45.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "2.2.3"
    }
  }
}
provider "aws" {
region = "eu-north-1"
}
Here I have added a terraform block where I specify which version of Terraform I require, as well as which providers I will be working with. I include both the aws provider and the local provider. I will use the local provider to create a file containing the endpoints for my virtual machines.
Next I create ubuntu.tf where I define my virtual machines running Ubuntu:
# ubuntu.tf
data "aws_ami" "ubuntu" {
most_recent = true
filter {
name = "name"
values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
owners = ["099720109477"] # Canonical
}
resource "aws_instance" "ubuntu" {
count = 3
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
key_name = aws_key_pair.instance_key.key_name
security_groups = ["${aws_security_group.sg.name}"]
tags = {
Name = "Ubuntu Server ${count.index + 1}"
}
}
I use Ubuntu 18.04 as the image, just to keep things more interesting. I create three instances (count = 3) and I refer to a security group resource and an instance key resource, both of which I will define soon (see below).
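If you are curious which AMI the data source will actually resolve to, you can run a roughly equivalent query with the AWS CLI (the filter values mirror the ones in ubuntu.tf):
$ aws ec2 describe-images \
    --owners 099720109477 \
    --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" \
    --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' \
    --output text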
With my Ubuntu machines out of the way I can set up a few RHEL instances in the same way in rhel.tf:
# rhel.tf
data "aws_ami" "rhel" {
most_recent = true
filter {
name = "name"
values = ["RHEL-8.5*"]
}
filter {
name = "architecture"
values = ["x86_64"]
}
filter {
name = "root-device-type"
values = ["ebs"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
owners = ["309956199498"] # Red Hat
}
resource "aws_instance" "rhel" {
count = 3
ami = data.aws_ami.rhel.id
instance_type = "t3.micro"
key_name = aws_key_pair.instance_key.key_name
security_groups = ["${aws_security_group.sg.name}"]
tags = {
Name = "RHEL Server ${count.index + 1}"
}
root_block_device {
volume_size = 20
volume_type = "gp2"
delete_on_termination = true
encrypted = true
}
ebs_block_device {
device_name = "/dev/xvda"
volume_size = 10
volume_type = "gp2"
delete_on_termination = true
encrypted = true
}
}
The RHEL instances are created more or less in the same way as the Ubuntu instances, with the small difference that I add some additional configuration for their block devices.
The next file I create is common.tf:
# common.tf
resource "aws_security_group" "sg" {
name = "webserver_access"
description = "allow ssh"
ingress {
cidr_blocks = ["0.0.0.0/0"]
protocol = "tcp"
from_port = 22
to_port = 22
}
egress {
cidr_blocks = ["0.0.0.0/0"]
protocol = "-1"
from_port = 0
to_port = 0
}
}
resource "local_file" "hosts" {
filename = "${path.module}/hosts"
content = <<EOF
[ubuntu]
${aws_instance.ubuntu[0].public_dns} ansible_user=ubuntu
${aws_instance.ubuntu[1].public_dns} ansible_user=ubuntu
${aws_instance.ubuntu[2].public_dns} ansible_user=ubuntu
[rhel]
${aws_instance.rhel[0].public_dns} ansible_user=ec2-user
${aws_instance.rhel[1].public_dns} ansible_user=ec2-user
${aws_instance.rhel[2].public_dns} ansible_user=ec2-user
EOF
}
resource "aws_key_pair" "instance_key" {
key_name = "terraform_ec2_key"
public_key = file("cert/terraform_ec2_key.pub")
}
Deploying Terraform#
To save some time I deploy my Terraform resources from my local computer, keeping the Terraform state local as well:
$ terraform init
$ terraform validate
$ terraform plan -out "plan.tfplan"
$ terraform apply plan.tfplan
After a short while my virtual machines are created. If I look in my terraform directory I see that I now have a new file called hosts:
[ubuntu]
ubuntu.hostname.1 ansible_user=ubuntu
ubuntu.hostname.2 ansible_user=ubuntu
ubuntu.hostname.3 ansible_user=ubuntu
[rhel]
rhel.hostname.1 ansible_user=ec2-user
rhel.hostname.2 ansible_user=ec2-user
rhel.hostname.3 ansible_user=ec2-user
This is the file that tells Ansible what machines to connect to.
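At this point I also like to confirm that all six instances are actually running. One way to do that (assuming your default region is set to the same eu-north-1 used in the provider block) is to ask the AWS CLI:
$ aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].[InstanceId,Tags[?Key==`Name`]|[0].Value]' \
    --output table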
When I run Terraform a number of files are created in the terraform directory that I do not want to add to git, so I add a .gitignore file in this directory with the following content:
.terraform
*.tfplan
terraform.tfstate
terraform.tfstate.backup
Creating the Ansible Playbook#
Now it is time to create the Ansible playbook that will do the patching on my virtual machines. I begin by creating a directory for Ansible in the root of my repository and a few other directories inside of the ansible directory:
$ mkdir ansible
$ mkdir ansible/inventories
$ mkdir -p ansible/roles/patch-ubuntu/tasks
$ mkdir -p ansible/roles/patch-rhel/tasks
I copy the output file named hosts from Terraform to my ansible/inventories directory:
$ cp terraform/hosts ansible/inventories
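Before automating anything it is worth verifying that Ansible can actually reach the machines. From the ansible directory, an ad-hoc ping against the whole inventory should return pong from all six hosts (you may be asked to confirm the host keys the first time, since the ansible.cfg created further down does not exist yet):
$ cd ansible
$ ansible all \
    -i inventories/hosts \
    --key-file ../terraform/cert/terraform_ec2_key \
    -m ping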
Ubuntu#
I create a role (an oddly named concept in Ansible) containing a number of tasks to perform patching on my Ubuntu machines. I store this file in ansible/roles/patch-ubuntu/tasks/main.yaml:
---
- name: update package list
  become: true
  apt:
    update_cache: yes
- name: upgrade packages
  become: true
  apt:
    upgrade: full
I have added two tasks: the first updates the list of available packages, followed by a full upgrade of all packages. This is the bare minimum required to do what I want, but additional steps may be required (e.g. if you want to store the output from the upgrade procedure in a file).
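As an example of such an additional step, here is a sketch (not part of the role above) that registers the upgrade result and checks whether Ubuntu has flagged that a reboot is required:
# sketch only: register the upgrade result and check the reboot marker file
- name: upgrade packages
  become: true
  apt:
    upgrade: full
  register: apt_upgrade_result
- name: check whether a reboot is required
  stat:
    path: /var/run/reboot-required
  register: reboot_required_file
- name: report the result
  debug:
    msg: "Reboot required: {{ reboot_required_file.stat.exists }}"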
Red Hat Enterprise Linux (RHEL)#
I do the same thing for RHEL as I did for Ubuntu, which leads me to the file ansible/roles/patch-rhel/tasks/main.yaml:
---
- name: update package list
  become: true
  dnf:
    update_cache: yes
- name: upgrade packages
  become: true
  dnf:
    name: "*"
    state: latest
The steps are more or less identical to those for Ubuntu, with the main difference that I use the dnf package manager for RHEL instead of apt as on Ubuntu.
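On RHEL the closest equivalent to Ubuntu's reboot-required marker is the needs-restarting utility. Here is a hedged sketch of such a check (it assumes the image ships the yum-utils/dnf-utils package that provides needs-restarting):
# sketch only: needs-restarting -r exits with 1 when a reboot is needed
- name: check whether a reboot is required
  become: true
  command: needs-restarting -r
  register: needs_restarting
  changed_when: false
  failed_when: needs_restarting.rc not in [0, 1]
- name: report the result
  debug:
    msg: "Reboot required: {{ needs_restarting.rc == 1 }}"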
Putting it all together#
Inside of the ansible directory I add a file named ansible.cfg:
[defaults]
host_key_checking = False
This config file disables host key checking, which would otherwise prompt for manual confirmation the first time we connect to each machine. That is not especially friendly in automation scenarios.
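If you prefer not to keep this setting in a file, the same behavior can be toggled with an environment variable when invoking Ansible:
$ export ANSIBLE_HOST_KEY_CHECKING=False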
I add one last file in my ansible directory, the file that ties it all together. I call it patch.yaml:
---
- name: check package updates for ubuntu hosts
  gather_facts: false
  hosts: ubuntu
  roles:
    - patch-ubuntu
- name: check package updates for rhel hosts
  gather_facts: false
  hosts: rhel
  roles:
    - patch-rhel
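Before handing this over to GitHub Actions I find it reassuring to run the playbook once from my own machine to confirm that everything works end to end (the key path assumes you are standing in the ansible directory):
$ ansible-playbook patch.yaml \
    -i inventories/hosts \
    --key-file ../terraform/cert/terraform_ec2_key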
Now is an excellent time to commit all my changes to GitHub:
$ git add . && git commit -m "Add terraform and ansible" && git push
Setting up GitHub Actions to periodically run Ansible#
To allow Ansible to connect to my virtual machines I need to be able to access my private key from GitHub Actions. I encode the key as base64 and add the result as a GitHub Actions secret using the GitHub CLI:
$ base64 -i terraform/cert/terraform_ec2_key > b64cert
$ gh secret set EC2_PRIVATE_KEY < b64cert
$ rm b64cert
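The secret can be verified with gh secret list, which shows the secret's name and when it was last updated (but never its value):
$ gh secret list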
Now it is time to add the last piece of the puzzle. I create the required directory structure for GitHub Actions:
$ mkdir -p .github/workflows
In the .github/workflows directory I create a file named patch-management.yaml with the following content:
on:
  workflow_dispatch:
  schedule:
    - cron: "0 0 * * *"

jobs:
  patch-management:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ansible
    steps:
      - name: Check out git repository
        uses: actions/checkout@v3
      - name: Install Ansible
        run: python3 -m pip install --user ansible
      - name: Store private key as a file
        run: echo "${{ secrets.EC2_PRIVATE_KEY }}" | base64 --decode > ec2_private_key
      - name: Restrict permissions on private key file
        run: chmod 400 ec2_private_key
      - name: Run Ansible playbook
        run: |
          ansible-playbook patch.yaml \
            -i inventories/hosts \
            --key-file "ec2_private_key"
In the schedule trigger I specify that I want the workflow to run once per day at 00:00 (UTC). I also add a manual workflow_dispatch trigger. In the job named patch-management I add the required steps to check out the source code, install Ansible, fetch the private key from GitHub secrets and store it as a local file, and finally run the Ansible playbook.
The first time this workflow runs it will take some time; subsequent runs will be quicker, depending on how many updates are available.
Conclusions#
If you follow this exact guide to set up patch management for your Linux virtual machines, you will lag at most 24 hours behind a new package release before it is installed. Of course I have glossed over details that could be interesting for you, and I will go through all the details in my coming 24-part Linux series of posts! (No I won't)