Over the past few years Terraform has gained several features for configuration-driven state manipulation. These include moved blocks to move a resource from one address in your state to a new address (e.g. migrating a resource from the root module to a new child module), removed blocks to remove a resource from the state file (e.g. when the resource should no longer be managed by the current Terraform configuration), and import blocks to import resources into your state file (e.g. to bring resources provisioned by some other means under management by the current Terraform configuration).
One of the latest additions to Terraform is Terraform search. Although not a feature for directly manipulating your state file, it will likely be involved in the process of bringing existing infrastructure under management by Terraform.
In this blog post we will learn what Terraform search is, how it works, and see a few examples of it in use.
What is Terraform search?#
Terraform search is declarative resource discovery. This feature allows you to discover resources through special types of queries against your Terraform providers. The end goal is to bring the discovered resources under management by Terraform.
The declarative part of Terraform search is that you define your search queries in HCL, similar to how you configure the desired state of your infrastructure in your Terraform configuration.
Provider Support#
A common bottleneck for many new features in Terraform is that they must be implemented by the providers (think of actions, ephemeral resource arguments, and identity-based imports to name a few recent cases).
These new features are often co-developed with one or two providers to ensure there is support for the feature from day one. This time the AWS provider already has some support for Terraform search in the following list resources: aws_instance, aws_iam_role, and aws_cloudwatch_log_group.
It will likely take some time before other providers are onboarded, and even more time before there is widespread support for list resources.
This post will focus on what is available today.
How Terraform search works#
The workflow for Terraform search consists of the following high-level steps:
- Configure one or more queries using the new list block in .tfquery.hcl files.
- Run terraform query to discover resources fulfilling the queries.
- (Optional) Bring these resources under management by Terraform.
The new list block has the following general structure:
list "<list type>" "<symbolic name>" {
  provider = ... # required argument
  # query arguments depending on list type
}
The list block has two labels. The first label is the list block type, e.g. aws_instance. The second label is the symbolic name of the list block. The combination of list block type and symbolic name must be unique across all your .tfquery.hcl files.
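As a small illustration of the uniqueness rule, the sketch below declares two queries of the same list type. This should be allowed because the symbolic names differ. The name second_query is hypothetical and chosen only for this example.

```hcl
# Same list type, different symbolic names: the two addresses
# list.aws_instance.all and list.aws_instance.second_query do not collide.
list "aws_instance" "all" {
  provider = aws
}

list "aws_instance" "second_query" {
  provider = aws
}
```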
Search queries (list blocks) are defined in a new file type ending with .tfquery.hcl. This means that Terraform search is not part of the normal Terraform plan and apply workflow.
A full example of a .tfquery.hcl file that configures the AWS provider and includes a query for all the EC2 instances in the configured region is shown below:
provider "aws" {
  region = "us-west-1"
}
list "aws_instance" "all" {
  provider = aws
}
The .tfquery.hcl files also support locals blocks and variable blocks.
Executing a basic Terraform search query#
To execute the queries you have configured in your .tfquery.hcl files you run the new terraform query CLI command. All .tfquery.hcl files in the directory where you run the command are included.
You have to run queries from a directory where you have initialized a Terraform configuration.
So before you do anything else, create (at minimum) a main.tf file with the following contents:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }
}
Now run terraform init to download the AWS provider and initialize the configuration.
Running terraform query for the configuration shown in the previous section gives us the following results [1]:
$ terraform query
list.aws_instance.all account_id=123456789012,id=i-0835b41ff06f2b6cf,region=us-west-1 frontend
list.aws_instance.all account_id=123456789012,id=i-0e7ad4412b60c75f5,region=us-west-1 frontend
list.aws_instance.all account_id=123456789012,id=i-066e446260eb7f82b,region=us-west-1 backend
list.aws_instance.all account_id=123456789012,id=i-064fd00d079825559,region=us-west-1 backend
list.aws_instance.all account_id=123456789012,id=i-01ea716dd96e54d01,region=us-west-1 backend
The columns in the output show the following data:
- The address of the query (e.g. list.aws_instance.all).
- An object containing the identity attributes of the discovered resource (e.g. for EC2 instances this includes the AWS account ID, the instance ID, and the AWS region).
- The Name tag of the discovered resource (e.g. frontend).
The results are printed to the terminal where you ran the command.
You can refine and update the query and re-run terraform query as many times as you like.
Generating configuration for the discovered resources#
You can ask Terraform to generate Terraform configuration and import blocks for the resources that are discovered by a query by passing the -generate-config-out=path flag to the terraform query command. The path value must be the name of a file that does not yet exist (i.e. Terraform can’t append to a file or replace an existing file).
Since the generated configuration also includes import blocks you can easily import the discovered resources into your Terraform state and start managing them using Terraform going forward.
To generate the configuration for the previous example, add the -generate-config-out flag to the command:
$ terraform query -generate-config-out=instances.tf
list.aws_instance.all account_id=123456789012,id=i-03a2dfb151604938a,region=us-west-1 web
list.aws_instance.all account_id=123456789012,id=i-003d6dc317a52fe40,region=us-west-1 web
list.aws_instance.all account_id=123456789012,id=i-0d28e2fb14ba0ed4f,region=us-west-1 web
The generated file instances.tf contains the following code (truncated for brevity):
# __generated__ by Terraform
# Please review these resources and move them into your main configuration files.
# __generated__ by Terraform
resource "aws_instance" "all_0" {
  # details omitted ...
}
import {
  to       = aws_instance.all_0
  provider = aws
  identity = {
    account_id = "123456789012"
    id         = "i-0835b41ff06f2b6cf"
    region     = "us-west-1"
  }
}
# ... three resources omitted for brevity ...
resource "aws_instance" "all_4" {
  # details omitted ...
}
import {
  to       = aws_instance.all_4
  provider = aws
  identity = {
    account_id = "123456789012"
    id         = "i-01ea716dd96e54d01"
    region     = "us-west-1"
  }
}
For each EC2 instance there is a resource block and an import block.
The resource blocks contain all available attributes of the resource type, and there are a lot of them for EC2 instances. Before you move this into your Terraform configuration you might want to remove all unnecessary default values from the configuration.
The generated import blocks use the new import by identity feature.
Note that the generated configuration is somewhat experimental, and at the time of writing the generated EC2 configuration is invalid and requires some hands-on modifications. In fact, for this specific configuration I got 95 individual errors when I tried to run terraform plan.
I believe all of these errors disappear if you remove all unnecessary default values from each aws_instance resource.
Using meta-arguments in list blocks#
A list block has support for the count and for_each meta-arguments.
In the following example we use for_each with a list of region names to query for EC2 instances in three different regions:
provider "aws" {
  region = "us-west-1" # default region
}
locals {
  regions = ["us-west-1", "us-east-1", "eu-west-1"]
}
list "aws_instance" "all" {
  for_each = toset(local.regions)
  provider = aws
  config {
    region = each.value
  }
}
When I run terraform query for my AWS account I get the following output:
$ terraform query
list.aws_instance.all["eu-west-1"] account_id=123456789012,id=i-045d428c88b12f39e,region=eu-west-1 backup
list.aws_instance.all["us-east-1"] account_id=123456789012,id=i-089f3a5328681f9bb,region=us-east-1 web01
list.aws_instance.all["us-east-1"] account_id=123456789012,id=i-08298eac244d627ec,region=us-east-1 web02
list.aws_instance.all["us-west-1"] account_id=123456789012,id=i-0835b41ff06f2b6cf,region=us-west-1 frontend
list.aws_instance.all["us-west-1"] account_id=123456789012,id=i-0e7ad4412b60c75f5,region=us-west-1 frontend
list.aws_instance.all["us-west-1"] account_id=123456789012,id=i-066e446260eb7f82b,region=us-west-1 backend
list.aws_instance.all["us-west-1"] account_id=123456789012,id=i-064fd00d079825559,region=us-west-1 backend
list.aws_instance.all["us-west-1"] account_id=123456789012,id=i-01ea716dd96e54d01,region=us-west-1 backend
The results are grouped by the region that was queried. Note that the use of for_each creates three separate queries, one per region.
Using count works in a similar way.
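For comparison, a count-based version of the multi-region query might look like the sketch below. It assumes that count.index is available inside the config block in the same way each.value is in the for_each example. This variant is untested here, so treat it as an illustration rather than verified output.

```hcl
provider "aws" {
  region = "us-west-1" # default region
}

locals {
  regions = ["us-west-1", "us-east-1", "eu-west-1"]
}

list "aws_instance" "all" {
  # One query per entry in local.regions (assumption: count behaves
  # here as it does for resource blocks).
  count    = length(local.regions)
  provider = aws

  config {
    region = local.regions[count.index]
  }
}
```

One reason to prefer for_each here is that each query is keyed by region name, which reads more clearly in the output than a numeric index.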
Using variables in query files#
You can define variables using variable blocks in the .tfquery.hcl files. However, these variables must also be defined in the root module.
You can pass values to these variables using the -var 'myvar=myvalue' or -var-file=filename flags. You can also set values for the variables using the terraform.tfvars or *.auto.tfvars variable files, which Terraform picks up automatically when you run terraform query.
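As a concrete illustration of the variable-file option, a terraform.tfvars file placed next to the query files could contain a single assignment. The variable name aws_region matches the example in this section, and the value is only an illustration.

```hcl
# terraform.tfvars
# Picked up automatically, so no -var flag is needed on the command line.
aws_region = "us-east-1"
```

With this file in place, running terraform query without any -var flag should use us-east-1 rather than the default declared in the variable block.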
An example of using a variable for the AWS region name in the query looks like this:
variable "aws_region" {
  type    = string
  default = "eu-west-1"
}
provider "aws" {
  region = var.aws_region
}
list "aws_instance" "all" {
  provider = aws
}
To execute the query for a region other than the default region run:
$ terraform query -var='aws_region=us-east-1'
list.aws_instance.all account_id=123456789012,id=i-089f3a5328681f9bb,region=us-east-1 web01
list.aws_instance.all account_id=123456789012,id=i-08298eac244d627ec,region=us-east-1 web02
Using filters to discover EC2 instances#
So far we have only queried for EC2 instances based on the region. Commonly you would like to filter on other attributes as well.
In the following example we query for all EC2 instances in the given region that have a tag with key Owner and value platform-team:
provider "aws" {
  region = "us-west-1"
}
list "aws_instance" "platform_team" {
  provider = aws
  config {
    filter {
      name   = "tag:Owner"
      values = ["platform-team"]
    }
  }
}
$ terraform query
list.aws_instance.platform_team account_id=123456789012,id=i-066e446260eb7f82b,region=us-west-1 backend
list.aws_instance.platform_team account_id=123456789012,id=i-064fd00d079825559,region=us-west-1 backend
list.aws_instance.platform_team account_id=123456789012,id=i-01ea716dd96e54d01,region=us-west-1 backend
You can include any number of filters in a query to narrow down the results.
The filters that you can use include all the normal EC2 instance filter names. You can find a list of available filters in the documentation.
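A query with more than one filter could look like the following sketch. It combines the tag filter from above with the standard EC2 filter instance-state-name. The symbolic name platform_team_running is hypothetical, and the sketch assumes the list resource combines multiple filter blocks the same way the EC2 API does, where an instance must match every filter.

```hcl
provider "aws" {
  region = "us-west-1"
}

list "aws_instance" "platform_team_running" {
  provider = aws

  config {
    # Instances owned by the platform team ...
    filter {
      name   = "tag:Owner"
      values = ["platform-team"]
    }

    # ... that are also currently running.
    filter {
      name   = "instance-state-name"
      values = ["running"]
    }
  }
}
```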
How to handle negative filters to discover EC2 instances#
Unfortunately, negative filters are not supported (e.g. show me all instances that do not have this tag). I do not currently know of a clean way to work around this.
The following contrived example will run three separate queries:
- Find all EC2 instances in a region.
- Find all instances that have a given tag key/value pair set.
- Find all instances that are part of the first query, but not part of the second query. In fact, this query will be split into one query for each instance that was not part of the second query.
provider "aws" {
  region = "us-west-1"
}
list "aws_instance" "all" {
  provider = aws
}
list "aws_instance" "platform_team" {
  provider = aws
  config {
    filter {
      name   = "tag:Owner"
      values = ["platform-team"]
    }
  }
}
locals {
  # all query results for each query
  all           = list.aws_instance.all.data
  platform_team = list.aws_instance.platform_team.data
  # I have not found out what the column name is, but I can extract it like this
  column_name = keys(local.all[0])[1]
  # extract instance ids from the queries
  all_ids           = [for i in local.all : i[local.column_name].id]
  platform_team_ids = [for i in local.platform_team : i[local.column_name].id]
  # compute the missing instance ids
  missing = setsubtract(local.all_ids, local.platform_team_ids)
}
list "aws_instance" "test" {
  for_each = local.missing
  provider = aws
  config {
    filter {
      name   = "instance-id"
      values = [each.value]
    }
  }
}
The magic is in the locals block. Note that this is an example I hacked together, and it is possible that it could be simplified. Running terraform query for this example gives me the following output:
$ terraform query
list.aws_instance.platform_team account_id=123456789012,id=i-066e446260eb7f82b,region=us-west-1 backend
list.aws_instance.platform_team account_id=123456789012,id=i-064fd00d079825559,region=us-west-1 backend
list.aws_instance.platform_team account_id=123456789012,id=i-01ea716dd96e54d01,region=us-west-1 backend
list.aws_instance.all account_id=123456789012,id=i-0835b41ff06f2b6cf,region=us-west-1 frontend
list.aws_instance.all account_id=123456789012,id=i-0e7ad4412b60c75f5,region=us-west-1 frontend
list.aws_instance.all account_id=123456789012,id=i-066e446260eb7f82b,region=us-west-1 backend
list.aws_instance.all account_id=123456789012,id=i-064fd00d079825559,region=us-west-1 backend
list.aws_instance.all account_id=123456789012,id=i-01ea716dd96e54d01,region=us-west-1 backend
list.aws_instance.test["i-0835b41ff06f2b6cf"] account_id=123456789012,id=i-0835b41ff06f2b6cf,region=us-west-1 frontend
list.aws_instance.test["i-0e7ad4412b60c75f5"] account_id=123456789012,id=i-0e7ad4412b60c75f5,region=us-west-1 frontend
The first set of results shows the instances that have the Owner tag with a value of platform-team. The second set of results shows all my EC2 instances. The last two results, one per for_each query, are the two instances that do not have the required tag.
A caveat with this approach is that if you generate Terraform configuration from this command, the output will cover the results of all three queries, so the same instance appears multiple times.
Limit the number of search results#
If you want to limit the number of search results that are returned from a query you can add the limit argument:
list "aws_instance" "most" {
  provider = aws
  limit    = 10 # return at most 10 results
}
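The limit argument sits at the top level of the list block, next to provider, while filters go inside config. A sketch that combines the two is shown below. The symbolic name platform_team_sample is hypothetical, and the sketch assumes that limit and config can be used together in one block.

```hcl
list "aws_instance" "platform_team_sample" {
  provider = aws
  limit    = 5 # return at most 5 results

  config {
    filter {
      name   = "tag:Owner"
      values = ["platform-team"]
    }
  }
}
```

This can be handy while iterating on a filter, since a small limit keeps the output short until the query returns what you expect.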
Terraform search for other AWS resources#
In the previous sections we have seen many examples of queries for EC2 instances. At the time of writing there are two other supported list resource types.
You can discover IAM roles using the aws_iam_role list resource type. The following example discovers all IAM roles in the AWS account:
provider "aws" {
  region = "us-west-1"
}
list "aws_iam_role" "all" {
  provider = aws
}
The results from terraform query for this configuration are shown below:
$ terraform query
list.aws_iam_role.all account_id=123456789012,name=AmazonEKSAutoClusterRole AmazonEKSAutoClusterRole
list.aws_iam_role.all account_id=123456789012,name=AmazonEKSAutoNodeRole AmazonEKSAutoNodeRole
list.aws_iam_role.all account_id=123456789012,name=Amazon_EventBridge_Invoke_Api_Destination_1271079307 Amazon_EventBridge_Invoke_Api_Destination_1271079307
list.aws_iam_role.all account_id=123456789012,name=slack-role-9jzjy7qt slack-role-9jzjy7qt
The query does not include service-linked roles.
Finally, you can also query for AWS CloudWatch log groups in a given region:
provider "aws" {
  region = "eu-west-1"
}
list "aws_cloudwatch_log_group" "all" {
  provider = aws
}
The results from terraform query for this configuration are shown below:
$ terraform query
list.aws_cloudwatch_log_group.all account_id=123456789012,name=/aws/lambda/my-http-function,region=eu-west-1 /aws/lambda/my-http-function
list.aws_cloudwatch_log_group.all account_id=123456789012,name=/aws/lambda/my-sqs-function,region=eu-west-1 /aws/lambda/my-sqs-function
list.aws_cloudwatch_log_group.all account_id=123456789012,name=/aws/lambda/secret-rotation-lambda,region=eu-west-1 /aws/lambda/secret-rotation-lambda
list.aws_cloudwatch_log_group.all account_id=123456789012,name=/aws/lambda/slack,region=eu-west-1 /aws/lambda/slack
The aws_iam_role and aws_cloudwatch_log_group list resources currently do not support any further configuration.
Key takeaways#
Terraform search will simplify bringing unmanaged resources into your Terraform state so that you can properly manage them going forward.
With Terraform search you define queries using the new list block. You add one or more queries in .tfquery.hcl files and you run terraform query to execute the queries. To generate resource and import blocks for each discovered resource you can add the -generate-config-out=<filename> flag to the command.
At the time of writing there are only three supported list resource types: aws_instance, aws_iam_role, and aws_cloudwatch_log_group. This list will be extended in the coming days, weeks, and months. New providers will also be onboarded in due time.
[1] Your results will differ, of course, depending on how many EC2 instances you have in the configured region of your AWS account.