Managing Snowflake resources can be a complex task, especially as your data environment grows. To streamline this process, Infrastructure as Code (IaC) tools like Terraform offer a robust solution. By treating your Snowflake infrastructure as code, you can automate provisioning, updates and deletions, ensuring consistency and reducing human error.
In this blog post, we'll look at using Terraform to create Snowflake resources. We will cover the following steps:
- Setting up a free Snowflake account
- Creating the Terraform code for a simple use case: connecting to Snowflake and creating a virtual warehouse
- Setting up environment variables and running Terraform commands from a local terminal to deploy the resource
- Automating the deployment pipeline using GitHub Actions, along with setting up environment variables
Why Use Terraform for Snowflake?
There are several ways to create Snowflake resources:
- Manual creation: Using the Snowflake web interface.
- SnowSQL: Using Snowflake’s command-line interface.
- Snowflake REST API: Directly interacting with Snowflake’s API.
- Terraform: Using Infrastructure as Code to automate provisioning.
Terraform stands out as a preferred choice due to its declarative nature, version control integration, and ability to manage complex infrastructure. By using Terraform, you can ensure consistency, reproducibility, and scalability in your Snowflake environment.
Getting Started with Snowflake
Before diving into Terraform, you will need a Snowflake account. If you do not have one already, you can use the free trial account offered by Snowflake, which comes with preloaded credits for 28 days ($400 at the time of writing) and is perfect for experimenting and learning. To create a free account, visit the Snowflake website and follow the sign-up instructions.
Now that we have a Snowflake account, we want to create all the resources in that account using Terraform. To do that, we need to define our resources, along with a few other configurations, in Terraform code files and use Terraform CLI commands to deploy them to the Snowflake account.
To establish a connection to Snowflake, Terraform needs a way to authenticate. One of the preferred methods is key pair authentication.
Key pair authentication uses a public key and a private key to authenticate a user without a password. The client signs a short-lived token with the private key, and Snowflake verifies the signature using the public key registered for the user, so the private key never leaves the user's machine. Let's see how to set that up in the next section.
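To make the idea concrete, here is a minimal sketch of signing with a private key and verifying with the matching public key, using openssl. It illustrates the general mechanism only and is not Snowflake's exact protocol; the function name `demo_sign_verify` and the throwaway file names are assumptions made for this example.

```shell
# Illustration only: sign a message with a private key and verify it with
# the matching public key. Everything lives in a temporary directory.
demo_sign_verify() {
  workdir="$(mktemp -d)" || return 1
  key="$workdir/demo_key"
  msg="$workdir/message.txt"

  # Throwaway 2048-bit RSA key pair (not the key used for Snowflake).
  openssl genrsa -out "$key" 2048 2>/dev/null
  openssl rsa -in "$key" -pubout -out "$key.pub" 2>/dev/null

  # The "client" signs a message with the private key.
  printf 'hello snowflake' > "$msg"
  openssl dgst -sha256 -sign "$key" -out "$msg.sig" "$msg"

  # The "server" checks the signature using only the public key.
  openssl dgst -sha256 -verify "$key.pub" -signature "$msg.sig" "$msg"
  status=$?

  rm -rf "$workdir"
  return "$status"
}
```

Calling `demo_sign_verify` returns a zero exit status only when the signature checks out against the public key.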
Creating a Key Pair for Snowflake Access
We are going to use key pair authentication to set up a secure connection from Terraform to Snowflake.
For this we need to generate a key pair. The commands are shown below.
cd ~/.ssh
openssl genrsa -out snowflake_key 4096
openssl rsa -in snowflake_key -pubout -out snowflake_key.pub
Creating a Snowflake Role and User for Terraform
For security and governance, it’s essential to create a dedicated role and user for Terraform to interact with Snowflake. Here’s how:
- Create a role
- Grant permissions to the role
- Create a user and grant the role to it
- Assign the public key generated in the previous step to the user, for key pair authentication
Open an empty Worksheet in the Snowflake UI and run the commands below:
CREATE ROLE IF NOT EXISTS INFRA_ROLE;
GRANT CREATE ROLE ON ACCOUNT TO ROLE INFRA_ROLE;
GRANT MANAGE GRANTS ON ACCOUNT TO ROLE INFRA_ROLE;
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE INFRA_ROLE;
GRANT MANAGE WAREHOUSES ON ACCOUNT TO ROLE INFRA_ROLE;
CREATE USER IF NOT EXISTS TERRAFORM_USER DEFAULT_ROLE = INFRA_ROLE;
GRANT ROLE INFRA_ROLE TO USER TERRAFORM_USER;
ALTER USER TERRAFORM_USER
SET RSA_PUBLIC_KEY = '<contents of snowflake_key.pub, without the BEGIN/END lines>';
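The value for RSA_PUBLIC_KEY is the body of the public key file without its header and footer lines. The sketch below prints that body on a single line, and also computes a SHA-256 fingerprint that can be compared with the RSA_PUBLIC_KEY_FP value shown by DESC USER TERRAFORM_USER once the key is set. The function names `pem_body` and `public_key_fingerprint` are assumptions made for this example.

```shell
# Print the base64 body of a PEM file on one line (BEGIN/END lines removed).
pem_body() {
  grep -v '^-----' "$1" | tr -d '\r\n'
}

# Print the SHA-256 fingerprint of a PEM public key as "SHA256:<base64>".
public_key_fingerprint() {
  printf 'SHA256:%s\n' "$(openssl rsa -pubin -in "$1" -outform DER 2>/dev/null \
    | openssl dgst -sha256 -binary | openssl enc -base64)"
}
```

For example, `pem_body ~/.ssh/snowflake_key.pub` prints the string to paste between the quotes in the ALTER USER statement.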
Setting Up Your Terraform Project Structure
To organise your Terraform code effectively, adopt a structured project layout:
project-root-dir
├── infra
│   └── snowflake
│       └── terraform
│           ├── environments
│           │   ├── dev.backend
│           │   └── dev.tfvars
│           ├── modules
│           │   └── warehouses
│           │       ├── provider.tf
│           │       ├── variables.tf
│           │       ├── grants.tf
│           │       └── warehouses.tf
│           ├── variables.tf
│           └── main.tf
├── README.md
└── . . .
- environments: Contains environment-specific configuration files (backend settings and variable values).
- modules: Houses reusable Terraform modules for Snowflake resources (e.g., warehouses, databases).
- main.tf: The core Terraform configuration file.
- variables.tf: Defines global variables used across the project.
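As a starting point, the layout above could be scaffolded with a few shell commands. This is a minimal sketch that only creates empty files; the function name `scaffold_project` is an assumption made for this example.

```shell
# Create the directory layout described above, with empty placeholder files.
scaffold_project() {
  root="${1:-project-root-dir}"
  tf_dir="$root/infra/snowflake/terraform"

  mkdir -p "$tf_dir/environments" "$tf_dir/modules/warehouses"

  touch "$root/README.md" \
    "$tf_dir/main.tf" \
    "$tf_dir/variables.tf" \
    "$tf_dir/environments/dev.backend" \
    "$tf_dir/environments/dev.tfvars" \
    "$tf_dir/modules/warehouses/provider.tf" \
    "$tf_dir/modules/warehouses/variables.tf" \
    "$tf_dir/modules/warehouses/grants.tf" \
    "$tf_dir/modules/warehouses/warehouses.tf"
}
```

Running `scaffold_project my-repo` creates the tree under my-repo and leaves any existing files in place.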
Running Terraform Commands Locally
To execute Terraform commands locally, follow these steps:
1. Set environment variables for the S3 bucket that Terraform will use as the backend to store the terraform.tfstate file
TFSTATE_BUCKET : S3 bucket name
TFSTATE_KEY : Object key under which the state file is stored
TFSTATE_REGION : S3 bucket region
These values can either be set as environment variables or be added to a config file (dev.backend), as shown in step 3.
export TFSTATE_BUCKET=""
export TFSTATE_KEY="terraform-dev.tfstate"
export TFSTATE_REGION="ap-southeast-2"
The next set of variables is required for connecting to Snowflake from Terraform.
export SNOWFLAKE_ACCOUNT=""
export SNOWFLAKE_USER="TERRAFORM_USER"
export TF_VAR_private_key="$(cat ~/.ssh/snowflake_key)"
export SNOWFLAKE_PRIVATE_KEY="$(cat ~/.ssh/snowflake_key)"
Note: terraform plan doesn't recognise the private key value unless SNOWFLAKE_PRIVATE_KEY is set in addition to TF_VAR_private_key.
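Because several of these variables start out empty, it can help to check them before running any Terraform command. The following is a minimal sketch of such a preflight check; the function name `require_env` is an assumption made for this example.

```shell
# Return non-zero if any of the named environment variables is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `require_env TFSTATE_BUCKET TFSTATE_KEY TFSTATE_REGION SNOWFLAKE_ACCOUNT SNOWFLAKE_USER SNOWFLAKE_PRIVATE_KEY TF_VAR_private_key || exit 1`.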
2. Format: Check the formatting of the Terraform configuration files:
terraform fmt -check -recursive
3. Init: Run terraform init to initialise the working directory, passing the backend configuration from a file.
Contents of dev.backend:
bucket = "590312749310-ap-southeast-2-snowflake-poc"
key = "terraform-dev.tfstate"
region = "ap-southeast-2"
Run command:
terraform init -backend-config=environments/dev.backend
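If you prefer the environment variables from step 1 over the dev.backend file, the same settings can be passed as key=value pairs. The sketch below assumes the Terraform code declares an S3 backend block whose bucket, key and region are left to be supplied at init time; the function name `tf_init_from_env` and the `TERRAFORM_BIN` override are assumptions made for this example.

```shell
# Run "terraform init", taking the S3 backend settings from the TFSTATE_*
# environment variables instead of a backend config file.
tf_init_from_env() {
  "${TERRAFORM_BIN:-terraform}" init \
    -backend-config="bucket=${TFSTATE_BUCKET}" \
    -backend-config="key=${TFSTATE_KEY}" \
    -backend-config="region=${TFSTATE_REGION}"
}
```

Both approaches end up with the same backend configuration, so choose whichever fits your workflow and avoid mixing them for the same setting.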
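4. Validate: Run terraform validate to check that the configuration is syntactically valid and internally consistent.
Run command:
terraform validate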
5. Plan: Use terraform plan to preview the changes without applying them to the Snowflake environment.
Run command:
terraform plan -var-file="./environments/dev.tfvars"
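In scripts it is often useful to know whether the plan contains any changes. Terraform's -detailed-exitcode flag makes plan exit with 0 when there is nothing to change, 2 when changes are pending and 1 on error. A minimal sketch follows; the function name `plan_status` and the `TERRAFORM_BIN` override are assumptions made for this example.

```shell
# Print "no-changes", "changes" or "error" depending on the plan result.
# The plan itself is written to stderr so stdout carries only the summary.
plan_status() {
  env_name="${1:-dev}"
  "${TERRAFORM_BIN:-terraform}" plan \
    -var-file="./environments/${env_name}.tfvars" \
    -detailed-exitcode >&2
  case $? in
    0) echo "no-changes" ;;
    2) echo "changes" ;;
    *) echo "error"; return 1 ;;
  esac
}
```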
6. Apply: Use terraform apply to execute the actions on the Snowflake environment
Run command:
terraform apply -var-file="./environments/dev.tfvars" -auto-approve
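The local steps above can be chained into one helper that stops at the first failure. Note that this sketch differs slightly from the commands in this post: it saves the plan to a file and applies that exact plan, rather than running apply with -var-file and -auto-approve. The function name `tf_deploy` and the `TERRAFORM_BIN` override are assumptions made for this example.

```shell
# Run fmt, init, validate, plan and apply in order for one environment.
# Each step runs only if the previous one succeeded.
tf_deploy() {
  env_name="${1:-dev}"
  tf="${TERRAFORM_BIN:-terraform}"

  "$tf" fmt -check -recursive &&
  "$tf" init -backend-config="environments/${env_name}.backend" &&
  "$tf" validate &&
  "$tf" plan -var-file="./environments/${env_name}.tfvars" -out=tf.plan &&
  "$tf" apply tf.plan
}
```

Run `tf_deploy dev` from the infra/snowflake/terraform directory. Applying a saved plan file does not prompt for approval, so review the plan output before using this outside an experiment.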
Automating the Process with a GitHub Actions Pipeline
The steps to create an automated pipeline are:
- Set up environment variables: Store non-sensitive values as GitHub Actions variables and sensitive values, such as the private key, as GitHub secrets.
- Create a workflow file under .github/workflows:
├── .github
│   └── workflows
│       ├── infra-terraform-snowflake.yml
│       └── …
name: Terraform Pipeline
run-name: Terraform Pipeline-@${{ github.actor }}.${{ github.sha }}
on:
  push:
    branches:
      - "infra/**"
      - "!main"
    paths-ignore:
      - .gitignore
      - LICENSE
      - '**/*.md'
      - 'README.md'
  workflow_dispatch:
    # Inputs the workflow expects.
    inputs:
      environment:
        description: 'Deployment Environment'
        required: true
        default: 'dev'
        type: choice
        options:
          - dev
          - test
permissions:
  id-token: write # This is required for requesting the JWT
  contents: read # This is required for actions/checkout
jobs:
  terraform-snowflake:
    runs-on: ubuntu-latest
    # Fall back to dev on push events, where no input is supplied
    environment: ${{ inputs.environment || 'dev' }}
    env:
      INFRA_PATH: "infra/snowflake/terraform"
      ENV: "dev"
    # Ensure that only a single job or workflow runs using the same concurrency group
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true
    # Use the Bash shell regardless of whether the GitHub Actions runner is ubuntu-latest, macos-latest, or windows-latest
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          repository: cevoaustralia/cevo-data-pattern-library
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: "arn:aws:iam::${{ vars.AWS_ACCOUNT }}:role/github-actions-role"
          role-session-name: snowflake-terraform-session
          aws-region: ${{ vars.AWS_REGION }}
      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.8.5
      - name: Prepare Terraform Runs
        run: |
          sudo apt-get install jq -y
          pip install checkov
          curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash
      - name: Terraform Fmt
        id: fmt
        # The block scalar (|) keeps echo and terraform fmt as separate commands
        run: |
          echo "** Running Terraform Fmt **"
          terraform fmt -check -recursive
        working-directory: ./${{ env.INFRA_PATH }}
      - name: Terraform Init
        id: init
        run: |
          echo "** Running Terraform Init **"
          terraform init -backend-config=./environments/${{ env.ENV }}.backend
        working-directory: ./${{ env.INFRA_PATH }}
      - name: Check Terraform Init Status
        if: steps.init.outcome == 'failure'
        run: exit 1
      - name: Terraform Validate
        id: validate
        run: |
          echo "** Running Terraform Validate **"
          terraform validate
        working-directory: ./${{ env.INFRA_PATH }}
      - name: Check Terraform Validate Status
        if: steps.validate.outcome == 'failure'
        run: exit 1
      - name: Plan Terraform changes
        id: plan
        run: |
          echo "** Running Terraform Plan **"
          terraform plan -var-file="./environments/dev.tfvars" -out tf.plan
        working-directory: ./${{ env.INFRA_PATH }}
        env:
          SNOWFLAKE_ACCOUNT: ${{ vars.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ vars.SNOWFLAKE_USER }}
          TF_VAR_private_key: ${{ secrets.TF_VAR_PRIVATE_KEY }}
          SNOWFLAKE_PRIVATE_KEY: ${{ secrets.TF_VAR_PRIVATE_KEY }}
      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1
      - name: Run Checkov Terraform Infra Checks
        run: |
          echo "** Save Terraform Plan File as Json File **"
          terraform show -json tf.plan > tf.json
          # echo "** Format Json Terraform Plan **"
          # jq '.' tf.json > tf_pretty.json
          # echo "** Running Checkov Terraform Infra Checks **"
          # checkov -f tf.json --skip-check CKV2_AWS_5,CKV2_AWS_11,CKV_AWS_144,CKV2_AWS_41,CKV2_AWS_12,CKV_AWS_337
        working-directory: ./${{ env.INFRA_PATH }}
      - name: Run Terraform Apply
        run: |
          echo "** Running Terraform Apply **"
          terraform apply -var-file="./environments/dev.tfvars" -auto-approve
        working-directory: ./${{ env.INFRA_PATH }}
        env:
          SNOWFLAKE_ACCOUNT: ${{ vars.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ vars.SNOWFLAKE_USER }}
          TF_VAR_private_key: ${{ secrets.TF_VAR_PRIVATE_KEY }}
          SNOWFLAKE_PRIVATE_KEY: ${{ secrets.TF_VAR_PRIVATE_KEY }}
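The variables and the secret referenced by the workflow can be created in the GitHub UI, or from a terminal with the GitHub CLI. The sketch below assumes gh is installed and authenticated for the repository, and that AWS_ACCOUNT, AWS_REGION, SNOWFLAKE_ACCOUNT and SNOWFLAKE_USER are set in your local shell; the function name `configure_github_settings` and the `GH_BIN` override are assumptions made for this example.

```shell
# Create the repository variables and the secret used by the workflow.
# $1 is the path to the private key file.
configure_github_settings() {
  gh_cmd="${GH_BIN:-gh}"

  "$gh_cmd" variable set AWS_ACCOUNT --body "$AWS_ACCOUNT" &&
  "$gh_cmd" variable set AWS_REGION --body "$AWS_REGION" &&
  "$gh_cmd" variable set SNOWFLAKE_ACCOUNT --body "$SNOWFLAKE_ACCOUNT" &&
  "$gh_cmd" variable set SNOWFLAKE_USER --body "$SNOWFLAKE_USER" &&
  # The secret value is read from standard input, so the key is never
  # passed on the command line.
  "$gh_cmd" secret set TF_VAR_PRIVATE_KEY < "$1"
}
```

For example: `configure_github_settings ~/.ssh/snowflake_key`. Because the job selects a GitHub environment, you may prefer to scope the values to that environment with gh's `--env` option instead of storing them at repository level.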
[Screenshots: GitHub Actions pipeline run showing the Terraform init, plan and apply steps, followed by the Snowflake account UI showing the created warehouse]
Conclusion
By using Terraform, you can significantly improve the management of your Snowflake resources. This blog post provided a foundational overview of creating Snowflake resources using Terraform, from setting up the environment to executing Terraform commands both locally and through GitHub Actions. By following these steps and best practices, you can achieve greater efficiency, consistency and control over your Snowflake infrastructure.