
Build a Self-Service Access Management Solution with GitHub and Boundary [HashiConf 2024]

This is the accompanying blog post for my talk at HashiConf 2024 in Boston.

HashiConf

Problem Statement
#

I don’t know about you, but when I sit down to do my work I prefer to use as few tools, applications, and portals as possible. Generally, I want as few distractions as possible. Preferably, I only have my code editor and terminal open [1].

However, that ideal situation is seldom a reality. A typical workday involves a lot more than my terminal and my editor.

This problem is multiplied in the context of access management [2].

I argue that there is a problem of cognitive overload in the context of access management. This cognitive overload comes in two forms:

  1. Cognitive overload from portals [3]. To request access to a resource there are a number of gates that you have to pass:
    • The process is poorly documented, meaning you have to dig through multiple wikis and similar documentation portals.
    • You need to communicate with stakeholders through one or more chat applications or other communication tools.
    • You must add a work item to a board to describe what you are working on, to have a work item number to connect your access request to. If you are lucky, the board is part of your source code system to lower the portal count by one, but that is not always the case.
    • You need to submit tickets in one or more ticketing systems, and wait for your ticket to be prioritized and processed.
    • You need to use cloud portals or consoles to find connection details for the resource you want to access.
    • You need to find the credentials needed to access the resource using a secrets manager application or cloud key-vault.
  2. Cognitive overload from network architecture complexity.
    • You need to understand how your organization’s network architecture looks across all the target platforms and on-prem data centers.
    • You often need to know which VPN client to use, or which other network solution (e.g. a bastion host) you need to be connected to.
    • You might need to open up the network to reach the resource. This often requires submitting another ticket (bringing you back to point number 1 above).

In this post I would like to present a solution to the problem outlined above.

The Solution
#

A good solution to the problem outlined in the previous section has four key characteristics.

The solution should:

  • Unify access management for all resources. The process should be the same no matter what type of resource it is, or where the resource is located.
  • Allow for self-service. A developer should be able to request access, and have that access request approved within seconds using automation. This should be true within certain limits that I will come back to later in this post.
  • Abstract away the network architecture. A developer requesting access to a resource should not need to know anything about the network architecture, what ports must be opened, or what VPN solution to connect to.
  • Abstract away secrets management. A developer should not need to dig through secrets management portals to find the correct credentials or keys required to connect to a target. Secrets management should take place in the background with no required intervention from the developer.

My solution is based on GitHub and Boundary, with support from Vault and Terraform. GitHub is the frontend of the solution: it is where access requests take place. Boundary makes up the backend of the solution, providing the secure networking features that are required.

This section introduces my solution in brief. In the following section I will go through the key parts of how to be successful with this solution.

GitHub is a modern developer platform for much more than storing your source code. I use GitHub for my source code, as well as hosting this very blog you are reading on GitHub Pages.

GitHub comes with many features. Of special interest for the solution presented in this blog post are issues and GitHub Actions workflows. You can use issues to drive automation workflows in GitHub Actions and allow for the self-service experience that I want to achieve. This is the practice of IssueOps.


HashiCorp Boundary is a solution for human-to-machine privileged access management.

Boundary can replace your VPN solutions and other less secure and more complicated network solutions. Boundary allows your developers to access what they need, when they need it, without exposing resources to the public internet.


HashiCorp Vault is a comprehensive secrets management system. Vault can generate short-lived credentials when they are needed, and revoke them after they have been used.

Together with Boundary, your developers do not need to know the credentials required to gain access to remote systems.


HashiCorp Terraform is a tool for declarative infrastructure-as-code.

You can build the full solution presented in this blog post using Terraform, allowing it to scale for infrastructure of any size. Terraform can interact with GitHub, Boundary and Vault through their respective Terraform providers.


Let us put ourselves in the shoes of Jane Umbrella, a cloud security expert working at Umbrella Security. Umbrella Security is a solutions provider for all your cloud security needs.

Umbrella

It is Monday morning. Jane arrives at her office and lights a fire in her fireplace. Jane prefers to start her days with a large cup of coffee, and today is no exception. She sits down with her laptop to go through the emails that have arrived during the weekend.

A morning coffee and emails
Jane enjoying a large cup of morning coffee in her office, while going through her emails

She quickly discovers an email from her friend and colleague John Umbrella that grabs her attention:

Possible security vulnerability on our Linux machines

Jane,

I have identified a potential security vulnerability that may affect our Linux machines.

Could you run tests in our environment to see if this is a cause for concern?

Best regards,
John

The platform engineers at Umbrella Security have worked hard to introduce a zero-trust network architecture. Jane has no direct access to any resources where she can run the required tests. Luckily, the platform engineers have set up a self-service access management solution on GitHub that Jane can use.

Jane opens up her favorite browser and goes to GitHub.com. She signs in to Umbrella Security’s GitHub Enterprise Cloud environment.

She navigates to the self-service repository.

Repository landing page
The self-service access management repository on GitHub

Jane opens the Issues tab in the repository and clicks on New issue. She can select between two issue templates:

  1. Cloud resource access request for targets that do not require explicit approvals.
  2. Sensitive resource access request for targets that do require explicit approvals. This includes targets with access to sensitive data (e.g. customer data), and targets located in sensitive network partitions.

Select an issue template
Two issue template options to choose from

Jane selects Cloud resource access request and clicks on Get started. She fills in the details of the issue form to specify what target she wants access to, for how long, and why the access is required.

Fill in the issue form
Jane fills in the details of the issue form

She clicks on Submit new issue. The new issue is automatically labeled as boundary.

New issue on GitHub
The issue has been submitted

Behind the scenes a GitHub Actions workflow is triggered by the creation of an issue labeled as boundary. The workflow is responsible for configuring the requested access in Boundary.

Shortly after the issue has been submitted it is labeled as active and a helpful GitHub Actions bot informs Jane that the access has been granted.

Access has been granted
The requested access has been granted

Jane opens up her terminal and authenticates her local Boundary CLI client:

$ boundary authenticate oidc
Opening returned authentication URL in your browser...

...

Authentication information:
  Account ID:      acctoidc_MIriUh3pIO
  Auth Method ID:  amoidc_TXM2R2EZyT
  Expiration Time: Tue, 16 Oct 2024 21:30:49 CEST
  User ID:         u_HxU1AjAKoi

Once authenticated, she issues a Boundary SSH command [4] to connect to the target:

$ boundary connect ssh azure.linux.eu
Welcome to Ubuntu 22.04.5 LTS (GNU/Linux 6.8.0-1014-azure x86_64)
ubuntu@boundary-target:~$
ubuntu@boundary-target:~$ echo "Hello HashiConf!"
Hello HashiConf!
ubuntu@boundary-target:~$

Jane is signed in to the Azure Linux virtual machine, and she starts performing her tests. Jane does not like to share the secrets of her security testing workflow outside of the Umbrella Security organization, so this story does not convey what she is up to.

As it turns out, thanks to the extensive automation work performed by the platform engineers at Umbrella Security, the vulnerability has already been patched and is no cause for concern.

Jane is a responsible developer. Since she was done early she navigates back to her issue on GitHub and closes it. This triggers a new GitHub Actions workflow that revokes Jane’s access to the target. Shortly thereafter the helpful GitHub Actions bot informs her that her access has been revoked.

Access has been revoked

She takes another sip of her coffee. The coffee is still hot since she did not have to wait for an administrator to approve her access request. Everything was automated, fast, and secure. She did not have to know the details of the target she was accessing, and she did not have to provide any specific credentials to access the target.


The workflow to set up access to a given resource is shown in the following figure.

The workflow for setting up access

  1. Jane creates a new issue on GitHub. The issue contains details of what she is requesting access for, for how long and why the access is required.
  2. A GitHub Actions workflow is triggered by the creation of the issue.
  3. The GitHub Actions workflow interacts with Boundary to set up the requested access.

This is an example of IssueOps: using issues to drive automation workflows. In this case it was the creation of an issue that started a GitHub Actions workflow. You can also trigger automation based on when an issue is labeled or when someone adds a comment to an issue.
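
As a rough illustration of the trigger logic, the sketch below shows how a workflow step could decide whether an issue event is a new access request. It is written in Python under a few assumptions: the payload shape follows GitHub's issues event, which GitHub Actions exposes as a JSON file through the GITHUB_EVENT_PATH environment variable, while the function names and the returned dictionary are hypothetical and only serve the example.

```python
import json
import os


def load_event(path=None):
    """Load the event payload that GitHub Actions writes to disk."""
    path = path or os.environ["GITHUB_EVENT_PATH"]
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def access_request_from_event(event, label="boundary"):
    """Return request details if the event is a newly opened, labeled issue."""
    issue = event.get("issue", {})
    labels = {item.get("name") for item in issue.get("labels", [])}
    if event.get("action") != "opened" or label not in labels:
        return None
    return {
        "number": issue.get("number"),
        "requester": issue.get("user", {}).get("login"),
        "body": issue.get("body") or "",
    }


# A trimmed-down sample payload; real payloads contain many more fields.
sample_event = {
    "action": "opened",
    "issue": {
        "number": 42,
        "user": {"login": "jane"},
        "labels": [{"name": "boundary"}],
        "body": "### Target\n\nazure.linux.eu",
    },
}
request = access_request_from_event(sample_event)
print(request)
```

Events that are not a newly opened issue, or that lack the expected label, yield None, so the rest of the workflow can exit early without touching Boundary.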

Once access has been granted, Jane can access the target via her local Boundary client. The workflow of connecting to the target is shown in the following figure.

Connection workflow

  1. Jane initiates the connection by authenticating to Boundary and issuing a connection command (e.g. boundary connect ssh).
  2. Boundary interacts with Vault to obtain unique credentials for the session.
  3. Boundary injects the credentials into the session.
  4. The session is established and Jane can access the target.

The only time Jane must provide credentials is when she signs in to GitHub and Boundary using her own identity. These are the only credentials [5] that Jane should have to know about.

Key Points For Success
#

There are a few key points to make the solution above a success.

In the following subsections these key points are discussed.

Secure Network Communication
#

How do we ensure that the network communication between the end-user (e.g. Jane) and the target is secure?

This is addressed by HashiCorp Boundary. Through Boundary you can access isolated resources in your cloud environments.

By strategically placing Boundary workers in your network architecture you can connect to arbitrary targets. Chaining multiple Boundary workers together allows you to reach targets in even completely isolated parts of your network infrastructure. You do not have to expose the targets to the internet.

Secure network communication with Boundary

In my solution I am using HCP Boundary, which simplifies getting started. You must still deploy your own Boundary workers as virtual machines or containers. You could also host the Boundary control plane yourself if you want to lock down access to it.

Secrets Management
#

How can we simplify credentials management for our cloud resources with this solution?

HashiCorp Vault and Boundary work great together. You can use Vault as a comprehensive secrets management solution. You can use dynamic short-lived credentials unique to the user who requested them.

Boundary can inject or broker credentials from Vault to the user, without requiring the user to have any direct access to Vault. In the case of SSH certificates you can make the experience completely transparent to the end-user, without exposing the certificate at any point in the process.

You can use the same network of Boundary workers to reach a Vault cluster wherever it is deployed.

Inject or broker secrets using HCP Vault

You can use a self-hosted Vault cluster, or use HCP Vault. In my solution I have used HCP Vault.

Identity
#

How do we know who requested access to what?

This question might at first appear like a non-issue. We know who created a given issue on GitHub, right?

First of all, you should use the same identity provider (IDP) for both Boundary and GitHub.

This solution requires the use of GitHub Enterprise. I will tell you why.

GitHub has two ways to handle identities for your enterprise:

Bring-Your-Own GitHub Accounts
#

Users sign in using their organization IDP, but each user must then connect their organization identity with a personal GitHub account.

The benefit of this approach is that your developers do not need to have separate GitHub accounts for work and for their private endeavors. The drawback is that it will be impossible to reliably connect an identity in your GitHub environment with a corresponding identity in Boundary. It will only reliably work if you have a list mapping an organization identity to a GitHub identity.

This solution does not scale very well.

Enterprise Managed Users
#

Enterprise Managed Users (EMU) allows enterprise administrators to be in control of how your organization users are provisioned to your GitHub Enterprise. You can set up automatic provisioning from your identity provider.

A user signs in using their organization IDP, and they are signed in as a unique user in the GitHub Enterprise.

Enterprise Managed Users on GitHub

This solution scales very well and you can reliably connect identities in your identity provider with identities in your GitHub Enterprise.


For my solution I have gone with the second alternative: Enterprise Managed Users. I am using Entra ID as my identity provider.

Configuring how users are provisioned to GitHub from Entra ID

Setting up EMU is simple once EMU has been enabled on your enterprise. However, you cannot enable it by yourself; you will need assistance from GitHub. If you already have a GitHub Enterprise and would like to use EMU, reach out to your GitHub point of contact.

Interacting With Boundary
#

How should the interaction between GitHub and Boundary work?

There are two main ways you can set up the interaction between GitHub and Boundary:

  1. Use the Boundary CLI and issue CLI commands to set up and revoke the requested access.
  2. Use the HCP Terraform API [6] to set up workspaces in HCP Terraform to declaratively create the requested access in Boundary.

Both approaches are valid, but each has drawbacks that I will come back to in the Clean Up section below.

For my solution I went with the first approach, to interact with Boundary using the Boundary CLI.

To do this, GitHub must have credentials to work with Boundary. I set up a dedicated Boundary user with a username/password. The credentials are provided to GitHub as GitHub Actions secrets. The values are provisioned using Terraform together with everything else I provision for this repository using Terraform.

There are a number of Boundary CLI commands you need to use to successfully add a user to a role in Boundary:

  1. Authenticate to Boundary with boundary authenticate password.
  2. Obtain the OIDC auth method ID using boundary auth-methods list.
  3. Find the correct user in the auth method using boundary accounts list followed by boundary users list.
  4. Get the role ID using boundary roles list.
  5. Finally, add the user principal to the role using boundary roles add-principals.
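
The lookup and grant steps above can be sketched as a small Python wrapper around the CLI. Treat it as an illustration rather than my actual workflow: it assumes the runner has already authenticated (step 1) and knows the OIDC auth method ID (step 2), that list commands return their results under an "items" key when called with -format json, that OIDC accounts expose an e-mail claim under "attributes", and that users reference their account through "primary_account_id". Verify the flags and field names against your Boundary version. The helper names and the e-mail based matching are hypothetical.

```python
import json
import subprocess


def boundary_json(args, run=subprocess.run):
    """Run a Boundary CLI command and return its parsed JSON output."""
    result = run(
        ["boundary", *args, "-format", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def find_item(listing, matches):
    """Return the first listed item for which `matches(item)` is true."""
    for item in listing.get("items") or []:
        if matches(item):
            return item
    raise LookupError("no matching item in Boundary listing")


def grant_access(email, role_name, auth_method_id, scope_id, run=subprocess.run):
    """Resolve the user and the role (steps 3 and 4), then add the principal (step 5)."""
    accounts = boundary_json(
        ["accounts", "list", "-auth-method-id", auth_method_id], run
    )
    # Assumption: OIDC accounts carry the e-mail claim under "attributes".
    account = find_item(
        accounts, lambda a: (a.get("attributes") or {}).get("email") == email
    )

    users = boundary_json(["users", "list", "-scope-id", "global"], run)
    # Assumption: a user references its account through "primary_account_id".
    user = find_item(users, lambda u: u.get("primary_account_id") == account["id"])

    roles = boundary_json(["roles", "list", "-scope-id", scope_id], run)
    role = find_item(roles, lambda r: r.get("name") == role_name)

    run(
        ["boundary", "roles", "add-principals",
         "-id", role["id"], "-principal", user["id"]],
        check=True,
    )
    return role["id"], user["id"]
```

Passing the command runner as a parameter keeps the sketch testable without a Boundary cluster: a fake runner that returns canned JSON is enough to exercise the lookups.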

Helping Users Submit a Request
#

How can we help a user to submit a valid access request?

In a GitHub repository, a user can submit an issue. Issues are commonly used to report problems, bugs, or feature requests related to the source code hosted in the repository.

In their most basic form, issues consist of a title and a body. Both are text fields without any restrictions. This is not good for a self-service access management scenario. It would be better if the issue body followed a certain template to ask for the required information.

To achieve this, GitHub has the concept of issue templates and issue forms. An issue template is a codified (YAML) template for issues. Issue templates are rendered as an issue form in the GitHub UI. Through issue templates you can build a simple form where you ask the user for the required information.

An example of an issue template written in YAML:

name: Cloud resource access request
description: Request access to a cloud resource.
title: "[Access Request]: "
labels:
  - boundary
body:
  - type: markdown
    attributes:
      value: Specify the details of the resource you need to access.
  - type: dropdown
    id: target
    attributes:
      label: Target
      description: What are you requesting access for?
      options:
        - target1
        - target2
      default: 0
  - type: dropdown
    id: time
    attributes:
      label: Duration
      description: For how long do you need access?
      options:
        - 1 hour
        - 2 hours
      default: 0
  - type: textarea
    id: motivation
    attributes:
      label: Motivation
      placeholder: E.g. I need to run system tests
      description: Why do you need access to this resource?

This issue template asks for three things:

  1. What target you wish to access.
  2. For how long the access is requested.
  3. A motivation for why the access is needed.

When an issue following this template is submitted, it is automatically labeled with the labels listed under labels.

The use of dropdown fields in the form allows you to require the user to select from allowed and valid values.

The issue template is rendered as an issue form in GitHub.

A rendered issue form in GitHub
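
On the automation side, the submitted form arrives as the issue body, with each answer rendered under a level-three heading carrying the field label. A minimal Python sketch of reading those answers back could look like the following. The function names are hypothetical, and the allowed values are simply the ones from the example template above.

```python
import re


def parse_issue_form(body):
    """Map each '### Label' heading in an issue body to the text below it."""
    fields = {}
    # Split on level-three headings; the chunk before the first heading is dropped.
    for section in re.split(r"^### ", body, flags=re.MULTILINE)[1:]:
        label, _, value = section.partition("\n")
        fields[label.strip().lower()] = value.strip()
    return fields


def validate(request, targets, durations):
    """Reject values that the form does not offer (the body can be edited later)."""
    if request.get("target") not in targets:
        raise ValueError(f"unknown target: {request.get('target')!r}")
    if request.get("duration") not in durations:
        raise ValueError(f"unknown duration: {request.get('duration')!r}")
    return request


body = (
    "### Target\n\ntarget1\n\n"
    "### Duration\n\n2 hours\n\n"
    "### Motivation\n\nI need to run system tests\n"
)
request = validate(
    parse_issue_form(body),
    targets={"target1", "target2"},
    durations={"1 hour", "2 hours"},
)
print(request)
```

Validating again in the workflow is a deliberate choice: the dropdowns constrain what the form submits, but an issue body is ordinary text that can be edited after submission.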

Keeping the List of Targets Updated
#

How do we make sure that the list of available targets is up to date? The issue templates discussed in the previous subsection contain static data. So we need a process to keep them updated since Boundary targets can come and go.

GitHub has a Terraform provider, so we can build our issue templates using Terraform. When we add a target to Boundary we can make sure that the target is propagated to the issue template. When a target is removed, we can make sure to remove it from the issue template.

As you might be aware, Terraform does not appreciate it when you manually change the things that are under Terraform management. This means that we should create the repository and everything inside of it using Terraform. Terraform should be the only one committing to this repository. You can control this using access management in GitHub.

A simple approach to do this with Terraform is to add outputs for all workspaces in HCP Terraform (or similarly wherever you manage Terraform) that create the targets you want to add to the issue templates [7]:

output "targets" {
  # Each alias value is a string, so collect them in a list
  # (concat only accepts lists as arguments).
  value = [
    boundary_alias_target.azure_vm.value,
    boundary_alias_target.azure_sql.value,
    # ...
  ]
}

Next, you can use a separate HCP Terraform workspace with a run trigger that watches the relevant workspaces containing targets. This other workspace should update the issue template with the available targets.

You should utilize Boundary host sets to group similar hosts instead of defining individual hosts for your targets.

You can create a template file for your issue template:

name: ${name}
description: ${description}
title: "${title_prefix}"
labels:
  %{ for tag in tags }
  - ${tag}
  %{ endfor }
body:
  - type: markdown
    attributes:
      value: |
        Specify the details of the resource you need to access.
  - type: dropdown
    id: target
    attributes:
      label: Target
      description: What are you requesting access for?
      options:
        %{ for target in targets }
        - ${target}
        %{ endfor }
      default: 0
  - type: dropdown
    id: time
    attributes:
      label: Duration
      description: For how long do you need access?
      options:
        %{ for duration in durations }
        - ${duration}
        %{ endfor }
      default: 0
  - type: textarea
    id: motivation
    attributes:
      label: Motivation
      placeholder: E.g. I need to run system tests
      description: Why do you need access to this resource?

Next, in your GitHub Terraform configuration you create the issue template from this template file:

resource "github_repository_file" "issue_template" {
  repository = var.github_repository_name
  file       = ".github/ISSUE_TEMPLATE/request-access.yaml"
  content = templatefile("./templates/issue.yaml.tftpl", {
    name         = "Cloud resource access request"
    title_prefix = "[Access Request]: "
    description  = "Request access to a cloud resource."
    targets      = sort(local.targets)
    tags         = ["boundary"]
    durations = [
      "1 hour",
      "2 hours",
      "3 hours",
      "6 hours",
      "8 hours",
    ]
  })
}

Onboarding New Targets
#

How can we simplify the process of enabling self-service for new targets?

We can build our Terraform modules in such a way that Boundary targets are added to GitHub automatically, if required. A part of this was covered in the previous subsection.

An instance of a pseudo module in Terraform to onboard new targets looks like this:

module "aws_target" {
  source  = "my-organization/targets/aws-instance"
  version = "1.0.0"

  self_service = true
}

By setting the self_service flag to true, the required Boundary resources (roles, target, alias, host, host sets, etc) should be set up to allow users to access it.

In Boundary, we can add a new role specifically for each target. The role should have no principals to start with. When an access request comes in we add the requester principal to the role to provide the principal with the required access.

resource "boundary_role" "azure" {
  name     = boundary_alias_target.azure.value
  scope_id = boundary_scope.azure.id
  
  grant_strings = [
    "ids=${boundary_target.azure.id};actions=read,authorize-session",
  ]
  grant_scope_ids = ["this"]

  lifecycle {
    ignore_changes = [
      principal_ids,
    ]
  }
}

I give my boundary_role resources the same name as the boundary_alias_target resources. GitHub is interacting with Boundary using the Boundary CLI, so to avoid Terraform removing any principals from the role I must ignore any changes to the principal_ids argument for the boundary_role resource.

Clean Up
#

How do we clean up expired access requests?

In the example above we saw how Jane closed her issue, and this triggered a GitHub Actions workflow which revoked her access.

You can trigger a GitHub Actions workflow when an issue is closed:

on:
  issues:
    types: [closed]

The workflow should delete the principal from the corresponding role in Boundary.

Once the requested access duration has expired, there must be a way to detect this and remove the access automatically. Unfortunately there is no built-in mechanism in GitHub for triggering a workflow a set duration after an issue has been created.

There are two main solutions to consider for this problem.

Scheduled GitHub Actions Workflow
#

Use a scheduled GitHub Actions workflow that runs every X minutes. Pick an X that you are comfortable with.

The drawback of a small X is that a large number of scheduled workflows are triggered each day, most of them with nothing to do. The benefit of a small X is that expired access is removed swiftly.

On the other hand, a large X will run fewer workflows but might give users longer access to the requested resource than they asked for. In the worst case they will have access for (requested duration) + X.
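
To make the trade-off concrete, the following Python sketch computes the number of scheduled runs per day and the worst-case access window for a few values of X. The function names and the chosen intervals are only for illustration, and the numbers assume that scheduled runs start exactly on time.

```python
from datetime import timedelta


def runs_per_day(interval):
    """Number of scheduled workflow runs in 24 hours for a given interval X."""
    return timedelta(days=1) // interval


def worst_case_access(requested, interval):
    """Longest possible access: the request expires just after a sweep has run."""
    return requested + interval


requested = timedelta(hours=1)
for minutes in (5, 10, 60):
    interval = timedelta(minutes=minutes)
    print(minutes, runs_per_day(interval), worst_case_access(requested, interval))
```

With X set to ten minutes this gives 144 runs per day and a worst case of one hour and ten minutes for a one-hour request.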

Third-Party Schedule Tool
#

Schedule the deletion using a third-party tool. There are many options to consider.

I want to highlight one possibility: instead of setting up the access using the Boundary CLI you could instead create a new workspace in HCP Terraform and have the access set up using a Terraform configuration. Configure the workspace as ephemeral, with an exact expiration time. Once this expiration time is up, the workspace resources are automatically destroyed. This solution is very elegant and uses a declarative approach.

The drawback with this solution is that the workspaces themselves will not be deleted from HCP Terraform, so you need some clean-up mechanism in HCP Terraform.


For my solution I have used the first approach. Scheduled workflows together with the GitHub CLI allow us to build a succinct workflow.

I opted for triggering this workflow every ten minutes. This is fine for the kind of resources I allow the developers to gain access to without approval.

To do this with GitHub Actions, add the following trigger to your workflow:

on:
  schedule:
    - cron: "*/10 * * * *"
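
What each scheduled run has to do can be sketched in Python as follows. This is an illustration under assumptions, not my exact workflow: it presumes the open issues were fetched as JSON with something like gh issue list --label boundary --state open --json number,createdAt,body, that the requested duration is rendered in the body under a "Duration" heading as in the issue form above, and that access is counted from the time the issue was created. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the rendered "Duration" answer of the issue form, e.g. "2 hours".
DURATION = re.compile(r"^### Duration\s+(\d+) hours?\s*$", re.MULTILINE)


def expired_issue_numbers(issues, now):
    """Return the numbers of open issues whose requested duration has passed."""
    expired = []
    for issue in issues:
        match = DURATION.search(issue.get("body") or "")
        if match is None:
            continue  # not an access request this sketch understands
        # Timestamps look like "2024-10-14T08:00:00Z"; make the offset explicit.
        created = datetime.fromisoformat(issue["createdAt"].replace("Z", "+00:00"))
        if now >= created + timedelta(hours=int(match.group(1))):
            expired.append(issue["number"])
    return expired


open_issues = [
    {"number": 7, "createdAt": "2024-10-14T08:00:00Z",
     "body": "### Target\n\ntarget1\n\n### Duration\n\n1 hour\n\n### Motivation\n\nTests"},
    {"number": 8, "createdAt": "2024-10-14T08:30:00Z",
     "body": "### Target\n\ntarget2\n\n### Duration\n\n2 hours\n\n### Motivation\n\nTests"},
]
now = datetime(2024, 10, 14, 9, 10, tzinfo=timezone.utc)
print(expired_issue_numbers(open_issues, now))
```

Closing each expired issue (for example with gh issue close) would then hand over to the workflow that reacts to closed issues, so the revocation logic lives in one place.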

Protecting Privileged Targets
#

Some targets should be protected more than others.

A few targets are placed in development environments, and you do not require any specific approval process for your developers to gain access to them. However, other targets might be more sensitive. This could include databases where you store potentially sensitive customer data, or virtual machines located in a sensitive network partition.

To require explicit approvals for these targets you should separate them from the other targets and use a specific issue template. Issues created through this issue template are tagged with a specific sensitive label.

You can achieve protection for given targets by the use of GitHub environments. There are different types of environment protections. For this situation an appropriate protection is to require approval of an administrator.

We can configure a job in a GitHub Actions workflow to run in the context of an environment as follows:

jobs:
  provide-sensitive-access:
    if: contains(github.event.issue.labels.*.name, 'sensitive')
    runs-on: ubuntu-latest
    environment: sensitive
    steps:
      - ...

Before the provide-sensitive-access job starts, any environment protections will take effect.

A workflow requiring approval before it continues

The environment is configured in Terraform like this:

resource "github_repository_environment" "sensitive" {
  repository          = github_repository.this.name
  environment         = "sensitive"
  
  can_admins_bypass   = false
  prevent_self_review = true

  deployment_branch_policy {
    protected_branches     = true
    custom_branch_policies = false
  }

  reviewers {
    teams = [github_team.owners.id]
  }
}

There are a number of important pieces to look at here:

  • can_admins_bypass is set to false. This prevents administrators from bypassing the protections of this environment.
  • prevent_self_review is set to true. This requires at least four eyes to be involved in approving access to a given target.
  • reviewers include a team for repository owners. This means any member of the owners team can approve access requests to sensitive targets.

Audit Log
#

How can we keep track of who requested access to what, when, and why?

This information will be collected in the form of issues in the GitHub repository.

The list of issues in GitHub serves as an audit log

The list of issues serves as an audit log of who requested access to what, when, and why.

We can use the information in the issues to see what platforms and what types of targets are most often accessed, allowing us to plan ahead for what resources to provision and where.

Summary
#

What have we achieved with this solution?

We have:

  • Unified access management for all resources: The workflow is the same for targets in AWS, Azure, GCP, and wherever else they are.
  • Allowed for self-service: Developers can request access by creating an issue in a GitHub repository. The access is set up automatically.
  • Abstracted away the network architecture: Using Boundary workers we can reach targets in any network architecture. Developers only need to connect to a target with a name (an alias), but do not need to know how the connection traverses the network architecture.
  • Abstracted away secrets management: Vault can generate short-lived credentials that Boundary injects into the session. Developers do not need to know anything about where to find the credentials or even be aware that there are credentials involved.

Can I sit back and just use my terminal and my editor from now on? Perhaps. If I replace the GitHub UI with the GitHub CLI, then I believe I can get pretty far. At least in the context of access management.


  1. Some of you reading this will say something along the lines of: “Why don’t you run an editor inside of your terminal, that way you only have a single tool open?” – That is a valid question! One thing that I have come to accept about myself is that I am not a Neovim person, or a vim person, nor do I use any other editor that runs directly in my terminal environment. I admire those who do, but that is not for me.

  2. When I say access management I am talking about accessing cloud resources (virtual machines, databases, etc). More generally, it can mean any resource available over a network.

  3. I use the term portal meaning any website, application, system or physical form you have to interact with in the context of access management.

  4. I am using a Boundary alias in this command. However, I am not using transparent sessions. If I did, I could have used any SSH client together with the Boundary alias.

  5. I am using credentials in a broad sense here. It could include anything required to prove her identity, like MFA.

  6. You can also use Terraform outside of HCP Terraform.

  7. I am using Boundary alias targets. This allows me to easily connect to a target by using a friendly alias instead of a Boundary target ID.

Mattias Fjellström
Author
Cloud architect consultant and a HashiCorp Ambassador