
HashiTalks Secure 2024 - Unlocking Privileged Access Management: HCP Boundary with Terraform


Before you get started on this post, read more on Boundary in my previous post:

A tour of HCP Boundary for just-in-time access for on-call engineers

Introduction

[HashiCorp Boundary is] an identity-aware proxy aimed at simplifying and securing least-privileged access to cloud infrastructure [1]

This blog post is the written version of my talk at HashiTalks Secure 2024. If you are completely unfamiliar with Boundary, this post can work as inspiration for learning more, but it will be easier to follow if you have some experience with Boundary already.

The full source code for my demo is available at the accompanying repository.

mattias-fjellstrom/hashitalks-secure-2024


What will we do?

First I will go through how to set up resources in Boundary using Terraform, including scopes, auth methods, groups, users, and accounts. Once the basics are in place we will move on to configuring Boundary workers. Workers make up a major part of Boundary, and they are what ultimately allow us to access our resources. Once the workers are in place I will go through what is required to set up access to resources running in AWS. In Boundary this includes setting up host catalogs, host sets, hosts, targets, aliases, credential stores, and credential libraries.

Once all of that is set up I will look at two different use-cases for Boundary: self-service access management through IssueOps on GitHub, and providing on-call engineers access to targets during incidents.

The demo architecture I use is illustrated in the following image:

architecture

On the left I have the HashiCorp Cloud Platform (HCP) with my Boundary cluster and my Vault cluster inside a HashiCorp Virtual Network (HVN). On the right I have my AWS environment with a Virtual Private Cloud (VPC) with two subnets, one private and one public. In each subnet I have a Boundary worker running as an EC2 instance. In the private subnet I have two resources I want to access using Boundary: an EC2 instance and an Aurora PostgreSQL cluster. A private peering connection between the HVN and the AWS VPC is in place to allow private communication with my Vault cluster.

Setting up Boundary

Below follows a brief look at many of the resources that make up a successful Boundary deployment. The official Boundary documentation is a good source of information for deep-diving into the details, and the accompanying source code that I have provided shows the full picture.

HCP Boundary cluster

To begin we need a Boundary cluster. I am using the managed HCP Boundary version:

resource "hcp_boundary_cluster" "this" {
  cluster_id = "hashitalks-secure-2024"
  username   = "admin"
  password   = var.hcp_boundary_admin_password
  tier       = "Plus"

  maintenance_window_config {
    day          = "TUESDAY"
    start        = 2
    end          = 12
    upgrade_type = "SCHEDULED"
  }
}

I am using the Plus tier of HCP Boundary, which allows me to use premium features such as session recordings. I provide my cluster with a username and password for the initial admin user. Apart from setting up HCP Boundary there are a few more things to create at this stage: an AWS VPC, an HCP Vault cluster in a HashiCorp Virtual Network (HVN), and peering between the AWS VPC and the HCP HVN. The details of these resources are left out of this post, but you can see the full setup in the GitHub repository.
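As a reference, the HVN and Vault cluster are defined along these lines (a minimal sketch where the names, region, CIDR block, and tier are placeholders; the HVN-to-VPC peering is left out here as well):

resource "hcp_hvn" "this" {
  hvn_id         = "hashitalks-secure-2024-hvn"
  cloud_provider = "aws"
  region         = "eu-west-1"
  cidr_block     = "172.25.16.0/20"
}

resource "hcp_vault_cluster" "this" {
  cluster_id      = "hashitalks-secure-2024-vault"
  hvn_id          = hcp_hvn.this.hvn_id
  tier            = "dev"
  public_endpoint = false
}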

When you create an HCP Boundary cluster it comes with a few things to get you started:

  • The global scope.
  • A password auth method at the global scope.
  • Two managed Boundary workers.

It is important to know that Terraform can only use the password auth method to interact with Boundary.
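Since this is the auth method Terraform must use, the Boundary provider configuration can look something like the following sketch, using the initial admin credentials from above (depending on your provider version you may also want to set auth_method_id explicitly):

provider "boundary" {
  addr                   = hcp_boundary_cluster.this.cluster_url
  auth_method_login_name = "admin"
  auth_method_password   = var.hcp_boundary_admin_password
}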

Scopes

Scopes in Boundary are similar to directories in a file system. The root of the scope hierarchy is the global scope. Sub-scopes of the global scope are called organizations, and sub-scopes of organizations are called projects. Projects cannot contain further sub-scopes.

You will want to set up your own scopes to organize resources as best fits your needs. One example is to create a single organization with a number of projects based on what type of targets they will contain:

# The main organization
resource "boundary_scope" "organization" {
  scope_id = "global"
  name     = "hashitalks-secure-2024-organization"
}

# GCP project
resource "boundary_scope" "gcp" {
  scope_id = boundary_scope.organization.id
  name     = "gcp-resources"
}

# Azure project
resource "boundary_scope" "azure" {
  scope_id = boundary_scope.organization.id
  name     = "azure-resources"
}

A different approach is to create multiple organizations, perhaps one for each business unit in your organization, and then a project for each team in the business unit:

resource "boundary_scope" "business_unit_1" {
  scope_id = "global"
  name     = "Business Unit X"
}

resource "boundary_scope" "business_unit_1_team_1" {
  scope_id = boundary_scope.business_unit_1.id
  name     = "Team Alpha"
}

resource "boundary_scope" "business_unit_1_team_2" {
  scope_id = boundary_scope.business_unit_1.id
  name     = "Team Beta"
}

resource "boundary_scope" "business_unit_2" {
  scope_id = "global"
  name     = "Business Unit Y"
}

resource "boundary_scope" "business_unit_2_team_1" {
  scope_id = boundary_scope.business_unit_2.id
  name     = "Team Gamma"
}

resource "boundary_scope" "business_unit_2_team_2" {
  scope_id = boundary_scope.business_unit_2.id
  name     = "Team Delta"
}

Scopes are used for permission and resource management. There are no wrong designs, but try to keep the number of scopes at a reasonable level to avoid an overly complex hierarchy.

In the rest of this post I will use a single organization with a single project:

resource "boundary_scope" "organization" {
  scope_id                 = "global"
  name                     = "hashitalks-secure-2024-organization"
  description              = "Organization for HashiTalks Secure 2024 demo"
  auto_create_admin_role   = true
  auto_create_default_role = true
}

resource "boundary_scope" "project" {
  scope_id                 = boundary_scope.organization.id
  name                     = "aws-resources"
  description              = "Project for all demo AWS resources"
  auto_create_admin_role   = true
  auto_create_default_role = true
}

I have set auto_create_admin_role and auto_create_default_role to true for my scopes. This means Boundary creates default roles for the scopes so that I do not have to do it myself. If you require full control you should set these flags to false and create the roles yourself.

Auth methods

The next major part to set up is one or more auth methods.

I mentioned that the password auth method enabled at the global scope comes with the HCP Boundary cluster, but this is seldom the ideal auth method for your human users. I recommend using the password auth method for automation accounts (I will do that later in this post), and setting up an OIDC (or LDAP) auth method for your human users.

I have my user identities in Microsoft Entra ID, so it makes sense to set up an OIDC auth method for that:

resource "boundary_auth_method_oidc" "provider" {
  name                 = "Entra ID"
  scope_id             = boundary_scope.project.scope_id
  is_primary_for_scope = true
  state                = "active-private"

  client_id          = azuread_application.oidc.client_id
  client_secret      = azuread_application_password.oidc.value
  issuer             = "https://sts.windows.net/${data.azuread_client_config.current.tenant_id}/"
  signing_algorithms = ["RS256"]
  api_url_prefix     = data.hcp_boundary_cluster.this.cluster_url
  claims_scopes      = ["groups"]
  prompts            = ["select_account"]
  account_claim_maps = ["oid=sub"]
}

Note that the scope_id references boundary_scope.project.scope_id, which is the parent scope of the project, so the auth method ends up in my organization scope (auth methods can only be created in the global scope or in an organization scope). You could have different (or multiple) auth methods for each organization, if you wish. A lot of the configuration is not really related to Boundary; instead it comes from the identity provider. You can see how I have set up Entra ID to allow this auth method to work in the GitHub repository. One important point to make here is the account_claim_maps = ["oid=sub"] configuration, which makes sure the correct value for the username is mapped between Entra ID and Boundary. I have also added prompts = ["select_account"], which is useful if your users have multiple accounts in the same identity provider (and it was useful for me during my demo because I have multiple test accounts in the same identity provider).

Later on I want to create automation users, so it would be a good idea to have a data source for the password auth method:

data "boundary_auth_method" "password" {
  name = "password"
}

The default password auth method has the name password, so it is easy to reference it when setting up the data source.

Managed groups, users, accounts

A user in Boundary can have one or more accounts associated with it. An account is a set of credentials in a given auth method. Our human users will only have a single account. When you are using an OIDC auth method you do not need to create users and accounts yourself; they are automatically created when the user signs in. You can still set up accounts yourself if you want to customize them in some way:

resource "boundary_account_oidc" "lane_buckwindow" {
  name           = data.azuread_user.lane_buckwindow.mail_nickname
  auth_method_id = boundary_auth_method_oidc.provider.id
  issuer         = boundary_auth_method_oidc.provider.issuer
  subject        = data.azuread_user.lane_buckwindow.object_id
}

resource "boundary_account_oidc" "margarete_gnaw" {
  name           = data.azuread_user.margarete_gnaw.mail_nickname
  auth_method_id = boundary_auth_method_oidc.provider.id
  issuer         = boundary_auth_method_oidc.provider.issuer
  subject        = data.azuread_user.margarete_gnaw.object_id
}

In this case I set up two accounts for two of my users in order to make sure I control the value of the subject field. This might be possible to achieve in other ways as well, but for my situation this works exactly as I intend.

When it comes to users in the password auth method you do need to create both users and accounts. I will create two automation accounts in a later section, so I skip those details for now.

It is highly likely that you have groups in your identity provider, and that you would like to mirror these groups in Boundary. This can be achieved with managed groups:

resource "boundary_managed_group" "all" {
  auth_method_id = boundary_auth_method_oidc.provider.id
  description    = "Group for all users"
  name           = "all-users"
  filter         = "\"${data.azuread_group.all.object_id}\" in \"/token/groups\""
}

resource "boundary_managed_group" "oncall" {
  auth_method_id = boundary_auth_method_oidc.provider.id
  description    = "Group for on-call engineers"
  name           = "on-call-group"
  filter         = "\"${azuread_group.oncall.object_id}\" in \"/token/groups\""
}

Here I have mirrored two groups from my Entra ID tenant, one for all of my users and one for my on-call engineers. The mapping is done in the filter argument.
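A managed group can then be used as a principal on a role. As a hypothetical example (not part of the demo setup), the all-users group could be granted permission to list and read targets in the project scope:

resource "boundary_role" "all_users" {
  name     = "all-users-read"
  scope_id = boundary_scope.project.id
  grant_strings = [
    "ids=*;type=target;actions=list,read",
  ]
  principal_ids = [
    boundary_managed_group.all.id,
  ]
}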

Workers

Of all the Boundary resources a worker is the most complex to create. This is not because the Terraform resource for a worker is complex; in fact, this is how we define a worker resource:

resource "boundary_worker" "this" {
  scope_id                    = "global"
  name                        = var.worker_name
  worker_generated_auth_token = ""
}

The complexity comes from the fact that a worker runs on a virtual machine or as a container, which means we must create the infrastructure for the workers ourselves. How this is done will differ depending on where your workers will run. For this demonstration I will run my workers as virtual machines (EC2 instances) on AWS. To successfully do this I will need:

  • An EC2 instance with Boundary installed and configured using a Boundary worker configuration file
  • A security group to allow the required traffic to and from the Boundary worker

I also need to plan where to deploy my workers. If you have a complex network architecture you might need many workers to segment how traffic is allowed to flow. In general, a Boundary worker must have outbound (egress) access to another Boundary worker or to the Boundary control plane. In practice the worker might also need outbound access to services such as AWS CloudWatch (for logs and metrics). Of course the worker must also be allowed to connect to the targets you want to access.

Workers can form a chain where you have ingress workers, egress workers, and any number of workers in-between. An ingress worker is a worker that connects to the Boundary control plane, it is the first worker in the chain. An egress worker is a worker that connects to the target. You will also hear about upstream and downstream workers. Imagine Boundary as a river where the river starts flowing at the Boundary control plane, and flows outwards to the chain of workers. All workers are downstream of the control plane. A worker is upstream of other workers further down the chain. When you configure your workers you must make sure that there is outbound access to an upstream worker (or the control plane).

In the configuration of my worker above I specified the argument worker_generated_auth_token = "". This essentially says that instead of registering the worker with a worker-generated auth token coming from the Boundary worker process, I want to configure the worker process to use a controller-generated activation token. Either way is possible. Doing it this way means I can create the Boundary worker resource first, then configure the actual worker process later. The other way around means I would first set up my worker instance, then provide the worker-generated token back to Terraform before I can create the worker resource. For some use-cases this might be a better fit, but I find controller-led worker registration to be more convenient.
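The controller-generated activation token is available as an attribute on the boundary_worker resource, so it can be rendered into the worker configuration file. A minimal sketch of this, assuming a hypothetical template file and variable names:

locals {
  worker_configuration = templatefile("${path.module}/templates/worker.hcl.tpl", {
    hcp_boundary_cluster_id = var.hcp_boundary_cluster_id
    activation_token        = boundary_worker.this.controller_generated_activation_token
  })
}

The rendered configuration can then be passed to the EC2 instance as user data, which is essentially what the cloudinit-based module in the repository does.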

My workers have certain network requirements:

  • The worker in the public subnet must have:
    • outbound access to the Boundary control plane
    • inbound access from Boundary clients
  • The worker in the private subnet must have:
    • outbound access to the worker in the public subnet
    • outbound access to the targets in the private subnet

These requirements come from how my network architecture looks. The interesting thing is that the worker in the private subnet has no inbound access requirements at all.
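For the worker in the public subnet these requirements translate into a security group along these lines (a sketch where the resource names and the wide-open CIDR ranges are assumptions; in practice you would restrict the client CIDR range):

resource "aws_security_group" "public_worker" {
  name   = "boundary-public-worker"
  vpc_id = aws_vpc.this.id
}

resource "aws_vpc_security_group_ingress_rule" "client_proxy" {
  security_group_id = aws_security_group.public_worker.id
  description       = "Boundary clients connect to the worker proxy listener"
  from_port         = 9202
  to_port           = 9202
  ip_protocol       = "tcp"
  cidr_ipv4         = "0.0.0.0/0"
}

resource "aws_vpc_security_group_egress_rule" "outbound" {
  security_group_id = aws_security_group.public_worker.id
  description       = "Outbound access to the HCP Boundary control plane and other dependencies"
  ip_protocol       = "-1"
  cidr_ipv4         = "0.0.0.0/0"
}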

There are some subtle differences between how the two workers are configured. The worker in the public subnet has the following (shortened) configuration file:

disable_mlock = true

hcp_boundary_cluster_id = "<cluster id>"

listener "tcp" {
  address = "0.0.0.0:9202"
  purpose = "proxy"
}

worker {
  public_addr                           = "<public ip>"
  auth_storage_path                     = "/etc/boundary.d/worker"
  recording_storage_path                = "/tmp/session-recordings"
  controller_generated_activation_token = "<token>"
  
  tags {
    // key-value pairs 
  }
}

Specifically, this worker configures the hcp_boundary_cluster_id argument containing the cluster ID. The private worker instead connects to the public worker, and this is configured using the initial_upstreams argument:

disable_mlock = true

listener "tcp" {
  address = "0.0.0.0:9202"
  purpose = "proxy"
}

worker {
  # use the private ip, even if the name of the argument is public_addr
  public_addr = "<private ip>"

  # connect to the public worker as an upstream worker
  initial_upstreams = ["<ip of the public worker>"]
  
  auth_storage_path                     = "/etc/boundary.d/worker"
  recording_storage_path                = "/tmp/session-recordings"
  controller_generated_activation_token = "<token>"
}

Most of the secret sauce of Boundary workers is specified in the configuration file. How you build the worker instance is up to you, but a good approach for production scenarios is to build an AWS AMI (or similar for other cloud providers) configured for your needs, and then inject a configuration file rendered with the correct settings when the instance launches. In the accompanying GitHub repository I have a module for Boundary workers where I configure the workers using the cloudinit provider.
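The tags on a worker are what the worker filters later in this post match against. As an example of how the tags block could be filled in (the exact tag keys and values are my own convention for this demo), the worker in the private subnet is presumably tagged along these lines:

worker {
  # ...

  tags {
    subnet = ["private"]
    vault  = ["true"]
  }
}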

Host catalogs, host sets, hosts

A host is the IP or URL for the virtual machine, database, Kubernetes cluster, or whatever else it is that you want to access. If you have multiple hosts that are equivalent (think of an autoscaling group for virtual machines) these can be defined as a host set. You can further group your hosts and host sets into host catalogs. An idea could be to have a host catalog for all your AWS resources, or a host catalog for each team in your organization.

A host catalog is created like so:

resource "boundary_host_catalog_static" "ec2" {
  name     = "aws-ec2-static-host-catalog"
  scope_id = boundary_scope.project.id
}

An example host for an EC2 instance:

resource "boundary_host_static" "ec2" {
  name            = "aws-ec2"
  address         = aws_instance.private_target.private_ip
  host_catalog_id = boundary_host_catalog_static.ec2.id
}

Assuming I had multiple identical EC2 instances I could add them to a host set:

resource "boundary_host_set_static" "ec2" {
  name            = "aws-ec2-static-host-set"
  host_catalog_id = boundary_host_catalog_static.ec2.id
  host_ids = [
    boundary_host_static.ec2.id,
    # other hosts ...
  ]
}

In the example above I used a static host catalog, but there are also dynamic host catalogs. The difference is that Boundary can use tags to discover hosts automatically (hence dynamic). This currently only works for AWS and Azure, and only for virtual machines. Configuring dynamic host catalogs requires a bit more work since we must set up permissions so that Boundary can search for instances in your cloud environment.
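A dynamic host catalog for AWS could look something like the following sketch (the IAM credentials, region, and tag filter are assumptions, and credential rotation is disabled to keep the example short):

resource "boundary_host_catalog_plugin" "aws" {
  name        = "aws-dynamic-host-catalog"
  scope_id    = boundary_scope.project.id
  plugin_name = "aws"

  attributes_json = jsonencode({
    region                      = "eu-west-1"
    disable_credential_rotation = true
  })

  secrets_json = jsonencode({
    access_key_id     = var.host_catalog_access_key_id
    secret_access_key = var.host_catalog_secret_access_key
  })
}

resource "boundary_host_set_plugin" "ec2" {
  name            = "aws-ec2-dynamic-host-set"
  host_catalog_id = boundary_host_catalog_plugin.aws.id

  attributes_json = jsonencode({
    filters = ["tag:boundary=true"]
  })
}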

Credential stores and credential libraries

You can use Boundary to store simple credentials for you, or you can use HashiCorp Vault to do the job for you. The latter opens up a lot more possibilities.

A credential store specifies details of where to connect to access secrets. As mentioned this is either the Boundary cluster itself, or a separate Vault cluster. For Vault credential stores you specify an address to the Vault cluster, what Vault namespace to use, a Vault token with required permissions to work with Vault, and possibly other details that are required to connect to Vault.

In your credential stores you can have one or many credential libraries. A credential library produces a set of credentials with the same permissions. If you are using a Boundary credential store you can only specify static credentials in your corresponding credential libraries. If you are using a Vault credential store you can use dynamic credentials, where Vault generates the credentials for you as needed. Just remember that for a given credential library all the credentials have the same permissions, so you can’t use one credential library to generate database secrets with two or more different sets of permissions. If this is required you would instead create multiple credential libraries in your credential store.

There are a few reasons to use multiple credential stores. First of all, if you are connecting to multiple Vault clusters you need a credential store for each. You should also limit the blast radius of a leaked Vault token, so use a credential store for a specific purpose only instead of using a single credential store for a range of different secrets.

As a concrete example, let’s see how we can configure credentials for a Postgres database. First we need to configure resources in Vault, starting with the database secrets engine for Postgres:

resource "vault_database_secrets_mount" "postgres" {
  path = "database"

  postgresql {
    name                 = "postgres"
    username             = "boundary"
    password             = var.aws_rds_master_password
    connection_url       = "postgresql://{{username}}:{{password}}@${aws_rds_cluster.this.endpoint}:5432/appdb?sslmode=disable"
    verify_connection    = false
    max_open_connections = 5

    allowed_roles = [
      "write",
      "read",
    ]
  }
}

I have provided everything required for Vault to be able to connect to the Postgres database. I have specified two allowed roles: write and read. Let’s look at the write role as an example:

resource "vault_database_secret_backend_role" "write" {
  name    = "write"
  backend = vault_database_secrets_mount.postgres.path
  db_name = vault_database_secrets_mount.postgres.postgresql[0].name

  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}' INHERIT;",
    "GRANT CONNECT ON DATABASE appdb TO \"{{name}}\";",
    "REVOKE ALL ON SCHEMA public FROM \"{{name}}\";",
    "GRANT CREATE ON SCHEMA public TO \"{{name}}\";",
    "GRANT ALL ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
    "GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO \"{{name}}\";",
    "ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO \"{{name}}\";",
    "ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO \"{{name}}\";",
  ]

  default_ttl = 180
  max_ttl     = 300
}

This role specifies how new users for this role are created in the database. In essence it allows working with the database named appdb, both reading and writing data.
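The read role is defined in the same way but with more restrictive creation statements, along these lines (a sketch; see the repository for the exact statements):

resource "vault_database_secret_backend_role" "read" {
  name    = "read"
  backend = vault_database_secrets_mount.postgres.path
  db_name = vault_database_secrets_mount.postgres.postgresql[0].name

  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}' INHERIT;",
    "GRANT CONNECT ON DATABASE appdb TO \"{{name}}\";",
    "GRANT USAGE ON SCHEMA public TO \"{{name}}\";",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
  ]

  default_ttl = 180
  max_ttl     = 300
}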

Boundary requires permissions in Vault to work with the database secrets engine. This is provided with a policy:

resource "vault_policy" "database" {
  name   = "database"
  policy = file("policy/postgres-policy.hcl")
}

The content of the policy:

path "database/creds/read" {
  capabilities = ["read"]
}

path "database/creds/write" {
  capabilities = ["read"]
}

Apart from this, Boundary must be able to manage its own Vault token. This is codified in a boundary-controller policy:

path "auth/token/lookup-self" {
  capabilities = ["read"]
}

path "auth/token/renew-self" {
  capabilities = ["update"]
}

path "auth/token/revoke-self" {
  capabilities = ["update"]
}

path "sys/leases/renew" {
  capabilities = ["update"]
}

path "sys/leases/revoke" {
  capabilities = ["update"]
}

path "sys/capabilities-self" {
  capabilities = ["update"]
}
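This boundary-controller policy is presumably created with a vault_policy resource in the same way as the database policy (the file path here is an assumption):

resource "vault_policy" "boundary_controller" {
  name   = "boundary-controller"
  policy = file("policy/boundary-controller-policy.hcl")
}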

The final piece of the puzzle in Vault is to create an initial token for Boundary with these two policies attached:

resource "vault_token" "postgres" {
  display_name = "postgres"
  policies = [
    vault_policy.boundary_controller.name, # manage token
    vault_policy.database.name,            # work with the postgres secrets engine
  ]

  no_default_policy = true
  no_parent         = true
  renewable         = true
  ttl               = "24h"
  period            = "1h"
}

Are we done? Only in Vault. Now we must configure the credential store and credential libraries in Boundary! We start with the credential store; this resource basically configures how Boundary communicates with Vault:

resource "boundary_credential_store_vault" "postgres" {
  name     = "boundary-vault-credential-store-postgres"
  scope_id = boundary_scope.project.id

  address   = data.hcp_vault_cluster.this.vault_private_endpoint_url
  namespace = "admin"
  token     = vault_token.postgres.client_token

  worker_filter = "\"true\" in \"/tags/vault\""
}

The address, namespace, and token arguments are what let Boundary know where Vault is and allow it to communicate with it. The worker_filter argument configures which Boundary workers can communicate with Vault. In this case I only want to use workers with a vault tag set to true. This tag is set on my worker in the private subnet, and the communication happens over the peering connection.

For the Postgres database I have two credential libraries. Each credential library produces credentials with one set of permissions, either read or write. The credential library for the write credentials:

resource "boundary_credential_library_vault" "write" {
  name                = "write"
  credential_store_id = boundary_credential_store_vault.postgres.id
  path                = "database/creds/write"
}

Basically this resource specifies the API interaction with Vault, but for this example I only need to specify the Vault path to work with. The credential library for the read credentials looks very similar:

resource "boundary_credential_library_vault" "read" {
  name                = "read"
  credential_store_id = boundary_credential_store_vault.postgres.id
  path                = "database/creds/read"
}

In essence, think of a credential store like the endpoint and token for Vault, and think of a credential library like a specific API call to Vault.

Targets and aliases

A target in Boundary is the combination of a host, any required credential libraries, and a port to connect to. You can have multiple targets that each end up at the same host, using different credentials to connect to it. A good example of this is a database. You can have multiple targets accessing the same database, on the same port, but each target uses different credentials with different permissions.

Since we have already discussed hosts and credentials we just need to bring it all together into targets. For the postgres database I have two targets, one for reading and one for writing. The write target looks like this:

resource "boundary_target" "write" {
  name     = "aws-aurora-write"
  type     = "tcp"
  scope_id = boundary_scope.project.id

  ingress_worker_filter = "\"public\" in \"/tags/subnet\""
  egress_worker_filter  = "\"private\" in \"/tags/subnet\""

  brokered_credential_source_ids = [
    boundary_credential_library_vault.write.id
  ]
  host_source_ids = [
    boundary_host_set_static.aurora.id,
  ]

  default_port             = 5432
  session_connection_limit = -1
  session_max_seconds      = 3600
}

The credentials we want to add are specified in brokered_credential_source_ids and the hosts are specified in host_source_ids. The port to connect to is specified in default_port. The ingress_worker_filter and egress_worker_filter arguments are both interesting: they let you specify the entry point and the exit point of the network communication between Boundary and the target.

Boundary 0.16 introduced the concept of aliases. An alias allows you to set up a friendly name for a target, so that users do not have to know the ID of the target to connect to it:

resource "boundary_alias_target" "write" {
  name                      = "aws-postgres-write"
  scope_id                  = "global"
  value                     = "aws.postgres.write"
  destination_id            = boundary_target.write.id
  authorize_session_host_id = boundary_host_static.postgres.id
}
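Presumably the alias for the EC2 target follows the same pattern (a sketch; the aws.ec2 value is what shows up in the issue template later in this post):

resource "boundary_alias_target" "ec2" {
  name           = "aws-ec2"
  scope_id       = "global"
  value          = "aws.ec2"
  destination_id = boundary_target.ec2.id
}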

Use-case: self-service access management using IssueOps on GitHub

Now I will move away from configuring resources in Boundary with Terraform.

I want to set up a self-service system for access management where users can request access to a given target in Boundary. IssueOps on GitHub is the idea of triggering GitHub Actions workflows based on changes to issues on GitHub. Note that you could do IssueOps on other platforms than GitHub. This section assumes some familiarity with GitHub Actions, but you should be able to follow most of the content even if this is not the case.

Imagine that we want to trigger a GitHub Actions workflow every time a user creates a new issue in our GitHub repository. We can achieve this with the following trigger:

on:
  issues:
    types: [opened]

To successfully provide access for a specific user to a specific target in Boundary I will need to know a few details. A great approach to do this is to use an issue template customized to gather the information that I need:

name: Boundary access request
description: Request access to a target through Boundary.
title: "[Access Request]: "
labels: ["boundary"]
body:
  - type: dropdown
    id: target
    attributes:
      label: What do you need to access?
      description: Self-service is available for the following targets
      options:
        - aws.ec2
        - aws.postgres.read
        - aws.postgres.write
      default: 0
    validations:
      required: true
  - type: textarea
    id: motivation
    attributes:
      label: Motivation
      description: Why do you need access to this resource?
    validations:
      required: true
  - type: dropdown
    id: time
    attributes:
      label: For how long do you need access?
      description: Select the duration of access you need
      options:
        - 1 hour
        - 3 hours
        - 8 hours
      default: 0
    validations:
      required: true

I provide some metadata for my issue template such as name, description and a placeholder for the issue title. I also specify that issues created using this template should be labeled with a boundary label. In the body of my issue template I collect three pieces of information:

  • I ask for the target the user wants to access. This list is currently hard-coded with three options: aws.ec2, aws.postgres.read, and aws.postgres.write. If the number of targets is large I could generate this list using a custom build step.
  • I ask the user to provide a motivation to why this access is needed. This is good for future audit purposes, even though I will not do anything with this information in my demo.
  • Finally I ask the user to specify for how long the access is required. I have opted for using three predefined options of 1, 3, or 8 hours. You could let the user fill in the required time in a text input instead, but be prepared to deal with parsing this information.

Give this file an appropriate name and place it in the .github/ISSUE_TEMPLATE directory.

Remember the workflow trigger I showed above? This trigger does not care about what kind of issue has been created. It will be triggered for every issue that is opened. To avoid running the workflow steps unnecessarily you can have a condition on the job that checks if the correct label (boundary) is set, or checks for any other information that can help identify the type of issue. I will simply look for the boundary label in my demo:

jobs:
  provide-access:
    if: ${{ contains(github.event.issue.labels.*.name, 'boundary') }}

Do we want users to be able to just ask for access and then get it, no questions asked? Most likely not. In GitHub Actions we can use the concept of environments to put protective rules in place:

jobs:
  provide-access:
    environment: boundary

We can specify rules for the boundary environment. A good rule is to require that someone with an admin role reviews the request and allows or denies it. I create my environment using Terraform:

resource "github_repository_environment" "boundary" {
  environment = "boundary"
  repository  = data.github_repository.this.name

  prevent_self_review = true
  can_admins_bypass   = true

  reviewers {
    users = [
      data.github_user.current.id,
    ]
  }
}

I have added my own account as the required reviewer. In a production scenario you would add a team of approvers instead.

The workflow itself consists of the following high-level steps:

  1. Parse the triggering event for the issue body and extract the relevant information
  2. Install the Boundary CLI
  3. Authenticate to Boundary using the global password auth method
  4. Assign the user to the correct role depending on what target is selected
  5. Post a comment to the issue to signal that the access has been provided

The first step consists of parsing the issue body to fetch the relevant information. In this case we need to obtain the desired duration of time and the target name:

steps:
  - run: |
      regex="([0-9]) hour"
      if [[ "${{ github.event.issue.body }}" =~ $regex ]]
      then
          hours="${BASH_REMATCH[1]}"
          echo "Will give access for ${hours} hour(s)"
      else
          echo "Could not parse the number of hours, defaults to 1 hour"
          hours=1
      fi

      regex="(aws.[a-z0-9]+.[a-z0-9]+)"
      if [[ "${{ github.event.issue.body }}" =~ $regex ]]
      then
          target="${BASH_REMATCH[1]}"
          echo "Will give access to target ${target}"
      else
          echo "Could not parse the target, exits"
          exit 1
      fi

      echo "BOUNDARY_TARGET_ALIAS=$target" >> $GITHUB_ENV
      echo "DELTA_TIME=$hours" >> $GITHUB_ENV      

How you do this is up to you. I opted for using Bash instead of a third-party action. For each piece of information I need I define a regular expression and look through the value of github.event.issue.body. I can be certain that the values will exist as long as the issue is created using my custom issue template. I end the step by adding the target name and the duration as environment variables for later steps. This is done by appending to the file that the $GITHUB_ENV variable points to.

The next step is to install the Boundary CLI. I got the instructions for how to do this from the official documentation:

steps:
  # ...
  - name: Install Boundary
    run: |
      wget -O- https://apt.releases.hashicorp.com/gpg | \
        sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
      echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
        sudo tee /etc/apt/sources.list.d/hashicorp.list
      sudo apt update && sudo apt install boundary      

Next up is authenticating to Boundary:

steps:
  # ...
  - name: Sign-in to Boundary
    run: |
      BOUNDARY_TOKEN=$(boundary authenticate password \
        -format=json \
        -keyring-type=none \
        -login-name=${{ vars.BOUNDARY_USERNAME }} \
        -password=env://BOUNDARY_PASSWORD | jq -r '.item.attributes.token')

      # set the boundary token as an environment variable for later steps
      echo "BOUNDARY_TOKEN=$BOUNDARY_TOKEN" >> $GITHUB_ENV

I am using a variable named BOUNDARY_USERNAME containing the account name of my GitHub user in Boundary. I also have a secret named BOUNDARY_PASSWORD containing that user's password. Note that I specify -keyring-type=none in the authenticate command; otherwise Boundary tries to store the token in a keyring on the machine that runs the command, which is not possible on a GitHub-hosted runner. I publish the resulting token as an environment variable by appending it to $GITHUB_ENV.

Next is the largest step that handles the interaction with Boundary:

- steps:
  # ...
  - name: Configure access
    run: |
      echo "::add-mask::$BOUNDARY_TOKEN"
      
      # find the correct user
      BOUNDARY_ACCOUNT_ID=$(boundary accounts list \
        -token=env://BOUNDARY_TOKEN \
        -format=json \
        -auth-method-id=${{ vars.BOUNDARY_AUTH_METHOD_ID }} \
        -filter '"/item/name" == "${{ github.event.issue.user.login }}"' | \
        jq -r '.items[0].id')
      echo "Account ID: $BOUNDARY_ACCOUNT_ID"
      
      BOUNDARY_USER_ID=$(boundary users list \
        -token=env://BOUNDARY_TOKEN \
        -recursive \
        -format=json \
        -filter="\"/item/primary_account_id\" == \"$BOUNDARY_ACCOUNT_ID\"" | \
        jq -r '.items[0].id')
      echo "User ID: $BOUNDARY_USER_ID"
      
      BOUNDARY_ROLE_ID=$(boundary roles list \
        -token=env://BOUNDARY_TOKEN \
        -recursive \
        -format=json \
        -filter='"/item/name" == "${{ env.BOUNDARY_TARGET_ALIAS }}"' | \
        jq -r '.items[0].id')
      echo "Role ID: $BOUNDARY_ROLE_ID"
      
      # add principal to the role
      boundary roles add-principals \
        -token=env://BOUNDARY_TOKEN \
        -id="$BOUNDARY_ROLE_ID" \
        -principal="$BOUNDARY_USER_ID"      

Note how I start the step with the command echo "::add-mask::$BOUNDARY_TOKEN". This tells GitHub Actions not to output the value of the Boundary token in the logs; without it the token value was indeed printed in the logs. You could also avoid this by moving the authentication into this step. The rest of the step consists of a few Boundary commands summarized below:

  • Find the correct account ID of the user who is asking for access. I look for an account with the same name as the GitHub account provided in the event: github.event.issue.user.login. This will only work if you have set up the accounts in Boundary to match what they are in GitHub. In this case I am using the same identity provider behind both, so it was not too difficult to achieve this.
  • Using the account ID, find the corresponding user ID.
  • Find the correct role ID. I have given the roles the same names as the target aliases, which simplifies this a bit.
  • Finally add the user ID to the list of principals assigned to the role.

The last step consists of posting a comment in the original issue to let the user know that access has been granted:

- steps:
  # ...
  - name: Post comment to issue confirming access
    uses: actions/github-script@v7
    with:
      script: |
        github.rest.issues.createComment({
          issue_number: context.issue.number,
          owner: context.repo.owner,
          repo: context.repo.repo,
          body: 'Access has been granted ✅'
        })        

I am using the actions/github-script@v7 action to do this. This action allows you to easily work with the GitHub API.

I glossed over a few details. How can the workflow authenticate to Boundary? A separate user and account in the password auth method have been set up specifically for GitHub:

resource "boundary_account_password" "github" {
  name           = "github"
  auth_method_id = data.boundary_auth_method.password.id
  login_name     = "github"
  password       = random_password.github.result
}

resource "boundary_user" "github" {
  name     = "github"
  scope_id = "global"
  account_ids = [
    boundary_account_password.github.id
  ]
}
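The login name and password must also be made available to the workflow. A sketch of how this could be done with the GitHub Terraform provider (the resource names are assumptions; the variable and secret names match what the workflow expects):

resource "github_actions_variable" "boundary_username" {
  repository    = data.github_repository.this.name
  variable_name = "BOUNDARY_USERNAME"
  value         = boundary_account_password.github.login_name
}

resource "github_actions_secret" "boundary_password" {
  repository      = data.github_repository.this.name
  secret_name     = "BOUNDARY_PASSWORD"
  plaintext_value = random_password.github.result
}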

I have set up specific roles in Boundary for each of the available targets:

resource "boundary_role" "ec2" {
  name     = "aws.ec2"
  scope_id = boundary_scope.project.id
  grant_strings = [
    "ids=${boundary_target.ec2.id};actions=read,authorize-session",
  ]
  grant_scope_ids = [
    boundary_scope.project.id,
  ]
}

resource "boundary_role" "postgres_read" {
  name     = "aws.postgres.read"
  scope_id = boundary_scope.project.id
  grant_strings = [
    "ids=${boundary_target.read.id};actions=read,authorize-session",
  ]
  grant_scope_ids = [
    boundary_scope.project.id,
  ]
}

resource "boundary_role" "postgres_write" {
  name     = "aws.postgres.write"
  scope_id = boundary_scope.project.id
  grant_strings = [
    "ids=${boundary_target.write.id};actions=read,authorize-session",
  ]
  grant_scope_ids = [
    boundary_scope.project.id,
  ]
}

Note how none of the roles have any principals assigned to them; this is handled by the GitHub workflow. Each role has a single grant allowing the read and authorize-session actions on the specific target.

A visual demo of how this workflow works in practice can be seen in the recorded version of this talk [2].

Use-case: provide access for on-call engineers during incidents

I have written about this before in my post A tour of HCP Boundary for just-in-time access for on-call engineers. The difference here is mainly the network infrastructure, where the targets are located in a private subnet and I am using self-managed workers. Read that post for additional details; below follows a summary of a few important pieces.

The idea is to provide access to targets for on-call engineers during an incident. To achieve this I have set up an AWS Lambda function that is triggered for all state changes of an alarm. When an incident is ongoing, the on-call managed group in Boundary is assigned to a role that is allowed to access targets. When the alarm is no longer active the permissions are removed.
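One way to wire this up (a sketch; the rule and function names are assumptions, and the repository may do this differently) is to let an EventBridge rule match alarm state changes and invoke the Lambda function:

resource "aws_cloudwatch_event_rule" "alarm_state_change" {
  name = "boundary-oncall-alarm-state-change"

  event_pattern = jsonencode({
    source        = ["aws.cloudwatch"]
    "detail-type" = ["CloudWatch Alarm State Change"]
  })
}

resource "aws_cloudwatch_event_target" "lambda" {
  rule = aws_cloudwatch_event_rule.alarm_state_change.name
  arn  = aws_lambda_function.oncall.arn
}

resource "aws_lambda_permission" "eventbridge" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.oncall.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.alarm_state_change.arn
}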

Similarly to how the GitHub example above worked I have set up a Boundary user for my Lambda function:

resource "boundary_account_password" "lambda" {
  name           = "aws-lambda-admin"
  description    = "Account for AWS Lambda for on-call administration"
  auth_method_id = data.boundary_auth_method.password.id
  login_name     = "aws-lambda"
  password       = random_password.lambda.result
}

resource "boundary_user" "lambda" {
  name        = "aws-lambda-admin"
  description = "User for AWS Lambda for on-call administration"
  scope_id    = "global"
  account_ids = [
    boundary_account_password.lambda.id
  ]
}

This user has permissions to work with the on-call role in Boundary:

resource "boundary_role" "lambda" {
  name        = "aws-lambda-admin"
  description = "Role for AWS Lambda to administer the on-call role assignment"
  scope_id    = "global"
  grant_strings = [
    "ids=${boundary_role.oncall.id};type=role;actions=read,list,add-principals,remove-principals",
  ]
  principal_ids = [
    boundary_user.lambda.id,
  ]
}

The on-call role itself allows principals to connect to any target:

resource "boundary_role" "oncall" {
  name        = "on-call"
  description = "Role for on-call engineers"
  scope_id    = "global"
  grant_strings = [
    "ids=*;type=target;actions=read,authorize-session",
  ]
  grant_scope_ids = [
    "this", "descendants"
  ]
}

You could use specific roles for each of your targets, but in this case I want the on-call engineers to have access to everything.

You have a number of options when it comes to using Boundary in automation. In the previous use-case I utilized the Boundary CLI, and that is very powerful and easy to work with. Sometimes it makes more sense to work with the API directly, or through an SDK. In this use-case I will work with the Go SDK for Boundary. As of May 2024 there are no other official SDKs for Boundary available, but I like Go so for me this is perfect.

Once again I refer to the accompanying GitHub repository for the full details, and my previous post covering this topic as well.

Summary

It is quick and easy to get started using HCP Boundary for accessing your infrastructure wherever it is located. In this post I walked through most components of a successful Boundary setup for accessing resources in AWS. The same workflow is applicable wherever your resources are located, except for the details on how to create the target infrastructure itself which will vary from provider to provider.

Once you have your Boundary infrastructure set up you can start creating interesting use-cases around it. In this post I demonstrated two use-cases:

  • Self-service access management using IssueOps on GitHub.
  • Automatically providing access to targets during incidents for on-call engineers.

What did I not cover? Not a lot. I did skip session recordings, storage buckets, and storage policies. These are included in the accompanying GitHub repository so you can see how they are configured there. However, you should know that if you enable session recordings you will not be able to destroy the infrastructure using Terraform; currently there is no support for disabling session recordings and deleting storage buckets once they are set up. You can always delete the HCP Boundary cluster, of course.

If you’ve read this far, kudos to you! Thank you for your interest! If you have any questions you can reach out to me directly on LinkedIn.


  1. From the official documentation: https://developer.hashicorp.com/boundary/docs/overview/what-is-boundary

  2. I will add a link to the talk once it is available.

Mattias Fjellström
Cloud architect consultant and a HashiCorp Ambassador