How to Deploy an AWS Redshift Cluster Using Terraform


Amazon Redshift is a managed service that handles the work of setting up, operating, and scaling a data warehouse. These tasks include provisioning capacity, monitoring and backing up the cluster, and applying patches and upgrades to the Amazon Redshift engine.

In this blog, we will create a new VPC and the resources required to run a Redshift cluster inside it. We will create the following resources using Terraform:

  1. A new VPC for the Redshift cluster
  2. An internet gateway for the VPC
  3. The default security group for our VPC
  4. A couple of subnets for the cluster
  5. A Redshift subnet group for the cluster
  6. The IAM role (allows our cluster to read and write to S3)
  7. The Redshift cluster

Redshift Cluster:

We’ll create a few files:

cd <working Dir>
mkdir terraform-redshift
cd terraform-redshift
touch resource.tf
touch provider.tf
touch variable.tf
touch terraform.tfvars

Here, I have created a few files to hold the Terraform code: resource.tf holds the resources, provider.tf holds the provider configuration, variable.tf declares the variables, and terraform.tfvars holds the values assigned to those variables.

Set the Provider

First, set AWS as the provider and choose the region. Here, I will create a named profile for the AWS credentials.

aws configure --profile terraform

In the above command, terraform is the name of the credentials profile. After running the command, we are prompted for the access key and secret key that make up the credentials. As you can see below, I have passed the profile name in the provider section.

provider.tf

provider "aws" {
  profile = var.profile
  region  = var.aws_region
}

In this section, I have provided the profile and region name. Their values are set in terraform.tfvars.
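Terraform 0.13 and later can also record where the provider comes from. The block below is a minimal sketch that could sit at the top of provider.tf. The version constraints are assumptions chosen for illustration, not values taken from this walkthrough, so adjust them to the versions you have tested.

```hcl
terraform {
  # Assumption: any reasonably recent Terraform release; not stated in this post.
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumption: illustrative constraint only.
      version = ">= 4.0"
    }
  }
}
```

Pinning the provider this way keeps terraform init from silently picking up a newer major version whose defaults may differ.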

variable.tf

variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string
}

variable "profile" {
  description = "AWS credentials profile"
  type        = string
}
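The matching entries in terraform.tfvars are not shown in this post. A sketch, assuming the profile created earlier with aws configure and assuming the ap-south-1 region (inferred from the availability zones used for the subnets later on), might look like this:

```hcl
# terraform.tfvars (assumed values; change them to match your account)
profile    = "terraform"
aws_region = "ap-south-1"
```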

Create VPC

Create the new VPC for our Redshift cluster.

terraform.tfvars

vpc_cidr = "10.0.0.0/16"

variable.tf

variable "vpc_cidr" {
  description = "CIDR block for the Redshift VPC"
  type        = string
}

resource.tf

resource "aws_vpc" "redshift_vpc" {
  cidr_block       = var.vpc_cidr
  instance_tenancy = "default"

  tags = {
    Name = "redshift-vpc"
  }
}

Internet Gateway (for VPC)

Now, we will define an internet gateway and attach it to our VPC. The gateway is what lets traffic flow between the VPC and the internet; the subnets also need a route that points at it before the cluster can be reached from outside.

resource.tf

resource "aws_internet_gateway" "redshift_vpc_gw" {
  # Referencing the VPC ID already makes Terraform create the VPC first.
  vpc_id = aws_vpc.redshift_vpc.id
}

Default Security Group for VPC

At this point, we will modify the default security group of the VPC so that it only allows ingress on port 5439, which is the Redshift port. The source addresses go in the ingress cidr_blocks. I am using 0.0.0.0/0 so that I do not have to publish my own IP address, but this is not a best practice: it allows anyone who knows the username and password to connect to your cluster. Prefer your own IP range.

resource.tf

resource "aws_default_security_group" "redshift_security_group" {
  vpc_id = aws_vpc.redshift_vpc.id

  # Allow inbound Redshift traffic only (TCP 5439).
  ingress {
    from_port   = 5439
    to_port     = 5439
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "redshift-sg"
  }
}
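One way to avoid opening the port to everyone is to take the allowed range from a variable. The sketch below is a variation on the resource above, not an addition to it (a VPC has only one default security group, so use one definition or the other). The variable name allowed_cidr and the example address in its description are hypothetical.

```hcl
# variable.tf
variable "allowed_cidr" {
  description = "CIDR range allowed to reach Redshift, for example 203.0.113.10/32"
  type        = string
}

# resource.tf
resource "aws_default_security_group" "redshift_security_group" {
  vpc_id = aws_vpc.redshift_vpc.id

  ingress {
    from_port   = 5439
    to_port     = 5439
    protocol    = "tcp"
    cidr_blocks = [var.allowed_cidr] # only this range may connect
  }

  tags = {
    Name = "redshift-sg"
  }
}
```

The value for allowed_cidr would then go in terraform.tfvars alongside the other settings.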

Subnets

Next, we will create two subnets. These subnets will be used when we create our Redshift subnet group.

terraform.tfvars

redshift_subnet_cidr_first  = "10.0.1.0/24"
redshift_subnet_cidr_second = "10.0.2.0/24"

variable.tf

variable "redshift_subnet_cidr_first" {
  description = "CIDR block for the first Redshift subnet"
  type        = string
}

variable "redshift_subnet_cidr_second" {
  description = "CIDR block for the second Redshift subnet"
  type        = string
}

resource.tf

resource "aws_subnet" "redshift_subnet_1" {
  vpc_id                  = aws_vpc.redshift_vpc.id
  cidr_block              = var.redshift_subnet_cidr_first
  availability_zone       = "ap-south-1a" # must belong to the region set in aws_region
  map_public_ip_on_launch = true

  tags = {
    Name = "redshift-subnet-1"
  }
}

resource "aws_subnet" "redshift_subnet_2" {
  vpc_id                  = aws_vpc.redshift_vpc.id
  cidr_block              = var.redshift_subnet_cidr_second
  availability_zone       = "ap-south-1b" # must belong to the region set in aws_region
  map_public_ip_on_launch = true

  tags = {
    Name = "redshift-subnet-2"
  }
}
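The internet gateway defined earlier only carries traffic once a route table sends traffic to it, and the configuration in this post does not define one. The following is an optional sketch of how that could look; the resource names redshift_public_rt, redshift_subnet_1_rt, and redshift_subnet_2_rt are hypothetical.

```hcl
# Route table that sends all non-local traffic to the internet gateway.
resource "aws_route_table" "redshift_public_rt" {
  vpc_id = aws_vpc.redshift_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.redshift_vpc_gw.id
  }

  tags = {
    Name = "redshift-public-rt"
  }
}

# Associate the route table with both Redshift subnets.
resource "aws_route_table_association" "redshift_subnet_1_rt" {
  subnet_id      = aws_subnet.redshift_subnet_1.id
  route_table_id = aws_route_table.redshift_public_rt.id
}

resource "aws_route_table_association" "redshift_subnet_2_rt" {
  subnet_id      = aws_subnet.redshift_subnet_2.id
  route_table_id = aws_route_table.redshift_public_rt.id
}
```

If you only plan to use the Query Editor in the AWS Console, you may not need this; it matters when connecting with an external SQL client.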

Redshift Subnet Group

Here, we will define the Redshift subnet group resource, which tells Redshift which subnets it may place the cluster in.

resource.tf

resource "aws_redshift_subnet_group" "redshift_subnet_group" {
  name = "redshift-subnet-group"
  subnet_ids = [
    aws_subnet.redshift_subnet_1.id,
    aws_subnet.redshift_subnet_2.id,
  ]

  tags = {
    environment = "dev"
    Name        = "redshift-subnet-group"
  }
}

IAM Role Policy

We will create an IAM role policy. This policy allows the role it is attached to, and therefore our cluster, to read and write to any of our S3 buckets. It references the IAM role defined in the next section.

resource.tf

resource "aws_iam_role_policy" "s3_full_access_policy" {
  name = "redshift_s3_policy"
  role = aws_iam_role.redshift_role.id

  # Grants every S3 action on every bucket; narrow this for production use.
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "*"
    }
  ]
}
EOF
}
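If full S3 access is broader than you need, the policy can be scoped to a single bucket. The sketch below is an alternative to the policy above, written with jsonencode instead of a heredoc. The variable redshift_bucket_name and the resource name s3_scoped_policy are hypothetical, and the list of actions is an assumption about what a typical load and unload workflow needs.

```hcl
variable "redshift_bucket_name" {
  description = "Name of the S3 bucket the cluster may use (hypothetical)"
  type        = string
}

resource "aws_iam_role_policy" "s3_scoped_policy" {
  name = "redshift_s3_scoped_policy"
  role = aws_iam_role.redshift_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Listing applies to the bucket itself.
        Effect   = "Allow"
        Action   = ["s3:ListBucket"]
        Resource = "arn:aws:s3:::${var.redshift_bucket_name}"
      },
      {
        # Reading and writing applies to the objects inside the bucket.
        Effect   = "Allow"
        Action   = ["s3:GetObject", "s3:PutObject"]
        Resource = "arn:aws:s3:::${var.redshift_bucket_name}/*"
      }
    ]
  })
}
```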

IAM Role

Now we create the IAM role that the policy above is attached to. Its assume-role policy lets the Redshift service assume the role. We’ll attach this role to the cluster later.

resource.tf

resource "aws_iam_role" "redshift_role" {
  name = "redshift_role"

  # Trust policy: lets the Redshift service assume this role.
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "redshift.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF

  tags = {
    tag-key = "redshift-role"
  }
}

Redshift Cluster

Finally, we’ll define the Redshift cluster. We will set the master password in terraform.tfvars and use it later when we connect to the cluster. Since this file then contains a secret, keep it out of version control.

terraform.tfvars

rs_cluster_identifier = "demo-cluster"
rs_database_name      = "database_cluster"
rs_master_username    = "demo"
rs_master_pass        = "<PASSWORD>"
rs_nodetype           = "dc2.large"
rs_cluster_type       = "single-node"

variable.tf

variable "rs_cluster_identifier" {}
variable "rs_database_name" {}
variable "rs_master_username" {}
variable "rs_master_pass" {}
variable "rs_nodetype" {}
variable "rs_cluster_type" {}

resource.tf


resource "aws_redshift_cluster" "default" {
  cluster_identifier        = var.rs_cluster_identifier
  database_name             = var.rs_database_name
  master_username           = var.rs_master_username
  master_password           = var.rs_master_pass
  node_type                 = var.rs_nodetype
  cluster_type              = var.rs_cluster_type
  cluster_subnet_group_name = aws_redshift_subnet_group.redshift_subnet_group.id
  skip_final_snapshot       = true
  iam_roles                 = [aws_iam_role.redshift_role.arn]

  # The subnet group and IAM role are ordered automatically through the references above.
  # The default security group is not referenced by any argument, so it is listed explicitly.
  depends_on = [
    aws_default_security_group.redshift_security_group,
  ]
}
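To avoid looking up the connection details in the console, the cluster attributes can be exposed as outputs. This is a small sketch, assuming a separate file named outputs.tf; the file name and the output names are hypothetical.

```hcl
# outputs.tf (hypothetical file name)
output "redshift_endpoint" {
  description = "Connection endpoint of the Redshift cluster"
  value       = aws_redshift_cluster.default.endpoint
}

output "redshift_database_name" {
  description = "Name of the default database"
  value       = aws_redshift_cluster.default.database_name
}
```

After an apply, terraform output prints these values.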

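Initialize:

Initialize the working directory so that Terraform downloads the AWS provider.

terraform init

Plan:

If everything is configured correctly, the command below shows the plan, that is, the resources Terraform is going to create.

terraform plan

Apply:

Apply creates the resources after you confirm the plan.

terraform apply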

Connection:

Now, you can connect to the cluster from the Redshift section of the AWS Console.

  1. Select Query Editor and configure the connection details based on your terraform.tfvars file.
  2. After connecting, select public as the schema.
  3. Create a table with the following query:

create table shoes(
  shoetype varchar(20),
  color varchar(20));

  4. You should see a shoes table appear under tables.

Conclusion:

We have covered how to deploy Redshift using Terraform. You can follow this link to know more about Redshift. If you found this blog helpful, like it and share it with your friends.

You can read more blogs here.

Written by 

Mohd Muzakkir Saifi is a Software Consultant at Knoldus Software. He loves to take deep dives into cloud technologies and different tools. His hobbies are gymnastics and traveling.