Tuesday, November 3, 2015

AWS - Bootstrap Windows EC2 instance with Chef-Solo

If you have an environment like ours, fairly small with few service layers, it may not make sense to provision a Chef Server.  Luckily, we can still get the benefits of using Chef to configure our servers by using the included Chef-Solo.  Chef-Solo runs entirely locally on the instance, so we must place all required dependencies on the server, i.e. cookbooks, run-lists, environments, etc.

In the following process I will demonstrate how to launch an AWS EC2 instance and have Chef-Solo configure it. I will not go into details about how Chef works with recipes and cookbooks.

Here are the steps we will need to follow to start this process.
  1. Create an S3 bucket to store the Chef files, including your cookbooks
  2. Create IAM Role and define access to the S3 bucket
  3. Configure UserData to run PowerShell on initial launch
    • The UserData will do the following
      • Download the Chef-Client from an S3 Bucket
      • Download the Chef Cookbooks, Recipes, etc
      • Install the Chef-Client
      • Run Chef-solo
Let's get started!

1. Create an S3 bucket and upload the Chef MSI, cookbooks, run scripts, etc

Within the AWS console, create a new S3 bucket to store the Chef installer as well as all the required Chef cookbooks, environment files, run scripts, etc.  For this example we will use 'examplebucket' as the S3 bucket name.  You will need to use your own unique bucket name.
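If you prefer the command line to the console, the same step can be sketched with the AWS CLI. This is a minimal sketch, assuming the CLI is already configured with credentials that may create buckets; the function name and the local folder `./chef-files` are hypothetical and not part of this post.

```shell
# Sketch: create the bucket and upload a local folder of Chef files.
# The bucket name and the local folder are placeholders.
upload_chef_files() {
  local bucket="$1"      # e.g. examplebucket (must be globally unique)
  local source_dir="$2"  # local folder holding the MSI, cookbooks, run scripts

  # "mb" (make bucket) creates the bucket.
  aws s3 mb "s3://${bucket}" || return 1

  # "sync" uploads everything under source_dir, keeping the sub-folders.
  aws s3 sync "${source_dir}" "s3://${bucket}/"
}

# Example call (not executed here):
#   upload_chef_files examplebucket ./chef-files
```

Using `sync` rather than individual copies means the same call can be repeated later to push updated cookbooks.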

2. Create an IAM role with a policy to allow Read only access to the S3 bucket

By creating an IAM role and assigning the role to the instance we can eliminate the need to use an IAM user account with access keys.  IAM roles utilize temporary credentials to grant access to AWS resources.

Within the AWS console create a new IAM role and Select Role Type: AWS Service Roles > Amazon EC2

Follow the prompts clicking through until the Role is finally created. With the role created, we must now create a new Inline policy which will grant access to the S3 bucket.
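For reference, creating the role from the AWS CLI instead of the console could look roughly like this. It is a sketch under assumptions: the role name passed in is your own choice, and the instance profile is given the same name as the role, which is what the console does automatically.

```shell
# Trust policy that lets the EC2 service assume the role.
ec2_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role plus an instance profile of the same name.
create_ec2_role() {
  local role_name="$1"

  aws iam create-role --role-name "${role_name}" \
    --assume-role-policy-document "$(ec2_trust_policy)" || return 1

  # From the CLI the instance profile must be created and linked by hand.
  aws iam create-instance-profile \
    --instance-profile-name "${role_name}" || return 1
  aws iam add-role-to-instance-profile \
    --instance-profile-name "${role_name}" --role-name "${role_name}"
}

# Example call (not executed here):
#   create_ec2_role chef-s3-read
```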

Select the newly created Role and expand the 'Inline Policies' to create a new policy:

Choose the option to create a Custom Policy:

For the policy, we will grant ListBucket and GetObject restricted to the S3 bucket we created earlier.

Here is the policy; you must modify the bucket name:
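As a minimal sketch of a policy that grants ListBucket and GetObject on a single bucket, the shell functions below print the JSON and attach it as an inline policy. The function names and the policy name are hypothetical. Note that ListBucket applies to the bucket itself while GetObject applies to the objects in it, so the sketch uses two statements with two different resources.

```shell
# Print a read-only policy for one bucket.
s3_read_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${bucket}"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::${bucket}/*"]
    }
  ]
}
EOF
}

# Attach the policy to the role as an inline policy.
attach_s3_read_policy() {
  local role_name="$1" bucket="$2"
  aws iam put-role-policy --role-name "${role_name}" \
    --policy-name "s3-read-${bucket}" \
    --policy-document "$(s3_read_policy "${bucket}")"
}

# Example call (not executed here):
#   attach_s3_read_policy chef-s3-read examplebucket
```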

3. Launch a new instance and configure UserData

When launching a new EC2 instance assign the previously created role to the instance.  Also, expand the Advanced details to provide 'UserData'.  This UserData allows you to run scripts when the instance is first launched.  Our instance will need to download and install Chef as well as execute Chef-Solo.

The UserData will use PowerShell to download and install Chef.  Here is the actual UserData to include in the instance launch.  This assumes the AMI you are launching from has the AWS CLI available (the AWS-provided AMIs for Windows already include it).

I've added comments to the code for clarification.
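A minimal sketch of what such UserData could look like, following the four UserData steps listed at the top of this post. It is written as a shell function that prints the UserData so it can be saved to a file on a workstation; the PowerShell between the `<powershell>` tags is what runs on the instance. The object keys (`chef-client.msi`, `chef/`), the `solo.rb` and `node.json` file names, and the `C:\opscode\chef\bin` install path are assumptions for illustration and may differ in your setup.

```shell
# Print example UserData for a Windows instance.
# The bucket name, object keys and file names are placeholders.
chef_userdata() {
  cat <<'EOF'
<powershell>
# Working folder for the installer, cookbooks and logs.
New-Item C:\chef -ItemType Directory -Force

# 1. Download the Chef installer from S3 (uses the instance role).
aws s3 cp s3://examplebucket/chef-client.msi C:\chef\chef-client.msi

# 2. Download the cookbooks, solo.rb and the JSON run-list.
aws s3 cp s3://examplebucket/chef C:\chef --recursive

# 3. Install Chef silently and wait for the installer to finish.
Start-Process msiexec.exe -ArgumentList '/i C:\chef\chef-client.msi /qn' -Wait

# 4. Run chef-solo: -c config file, -j JSON attributes, -L log file.
& C:\opscode\chef\bin\chef-solo.bat -c C:\chef\solo.rb -j C:\chef\node.json -L C:\chef\log.log
</powershell>
EOF
}

# Example call (not executed here):
#   chef_userdata > userdata.txt
```

The resulting file can be pasted into the UserData field at launch.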


The UserData is only run on instance launch and not on restarts.  This is initiated by the Ec2ConfigService which records log information at:
C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt

Chef-solo will record log information at the location defined by the -L option.  In this case we are creating the log at:   C:\chef\log.log

I hope this helps you progress to using Chef and moving more towards Infrastructure as Code in your environment!

Saturday, September 12, 2015

AWS - Auto join EC2 Windows instance to Active Directory Domain

Some environments will require you to join your Windows servers to a domain.  The following will show the steps taken to automatically join a server to a Windows domain.  This assumes the following:
   An existing AWS VPC with access to the S3 bucket
   New instances that are able to communicate with a domain controller

NOTE:  Amazon does offer its Directory Service with AD Connector, which will connect your VPC to your Active Directory, but this post shows how you can do so without the AD Connector.

The steps:

  1. Create a PowerShell script to join a server to the domain
  2. Secure the credentials by converting the PowerShell script to an executable (exe) using PS2exe
  3. Create an S3 bucket and upload the exe file
  4. Create an IAM role with a policy to allow Read access to the S3 bucket
  5. Launch a new instance, assigning the IAM role and providing User Data which will run the required scripts at first launch

1. Create the PowerShell script

The PowerShell script will join the server to the domain.  We will use the Add-Computer cmdlet and a user account that has permission to join computers to the domain.  Here is the full script; modify the username, password, and DomainName for your environment.

Save the file as JoinDomain.ps1

2. Convert the PowerShell script to an executable file

To help secure the credentials we will convert the PowerShell script using PS2exe to an executable file. Download PS2exe from: PS2exe download

Extract the zip file to a folder and then run PS2exe.ps1 on the JoinDomain.ps1 script to convert it to an exe file. From a PowerShell prompt in that folder, run the following:

c:\> .\ps2exe.ps1 -inputfile JoinDomain.ps1 JoinDomain.exe

This will create the JoinDomain.exe file.

3. Create an S3 bucket and upload the exe file

Within the AWS console, create a new S3 bucket to store the JoinDomain.exe file.
For this example we will use examplebucket as the bucket name.  You will need to use your own unique bucket name.

With the bucket created, we can upload the JoinDomain.exe file to the bucket.

4. Create an IAM role with a policy to allow Read only access to the S3 bucket

By creating an IAM role and assigning the role to the instance we can eliminate the need to use an IAM user account with access keys.  IAM roles utilize temporary credentials to grant access.

Create an IAM role in the AWS console, and Select Role Type: AWS Service Roles > Amazon EC2

Follow the prompts through, clicking next until the Role is finally created. With the role created, we must now create a new Inline policy which will grant access to the S3 bucket.

Select the newly created Role and expand the 'Inline Policies' to create a new policy:

Choose the option to create a Custom Policy:

For the policy, we grant ListBucket and GetObject restricted to the S3 bucket.  Here is the policy; you must modify the bucket name:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:ListBucket", "s3:GetObject"],
    "Resource": ["arn:aws:s3:::examplebucket", "arn:aws:s3:::examplebucket/*"]
  }]
}

5. Launch a new instance

Launch a new instance into the VPC.  We need to attach the IAM Role to the instance as well as configure the User Data under Advanced Details.

The User Data is used to run scripts when the instance is first launched.  For our example, we will be downloading the JoinDomain.exe file from S3 and finally executing it.

First assign the IAM Role to the instance.

Next, expand the Advanced Details to show the User Data field.  User Data lets you run commands during the initial startup of a newly launched instance; in our case, PowerShell commands that download the exe file from S3 and execute it to join the domain.  Here is the User Data to include; modify the S3 bucket location for your environment:

<powershell>
Set-ExecutionPolicy Unrestricted -Force
New-Item C:/temp -ItemType Directory -Force
Set-Location C:/temp
Read-S3Object -BucketName examplebucket -Key JoinDomain.exe -File JoinDomain.exe
Invoke-Item C:/temp/JoinDomain.exe
</powershell>

Here's what it looks like in the AWS Console:

Follow the remaining steps to complete launching of the instance. The instance will launch, download the exe, execute it and restart.
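The same launch can be sketched with the AWS CLI. This assumes the User Data has been saved to a local file (wrapped in `<powershell>` tags so that Windows runs it as PowerShell) and that the instance profile carries the role created earlier. The function name, the AMI and subnet IDs, and the t2.medium instance type are placeholders.

```shell
# Launch one instance with an instance profile and a User Data file.
launch_instance() {
  local ami_id="$1" subnet_id="$2" profile_name="$3" userdata_file="$4"

  aws ec2 run-instances \
    --image-id "${ami_id}" \
    --instance-type t2.medium \
    --subnet-id "${subnet_id}" \
    --iam-instance-profile "Name=${profile_name}" \
    --user-data "file://${userdata_file}" \
    --count 1
}

# Example call (not executed here):
#   launch_instance ami-xxxxxxxx subnet-xxxxxxxx s3-read-role userdata.txt
```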

Monday, February 9, 2015

AWS - autoscaling and self healing NAT instance

Having your AWS hosted services maintain high availability is often a top priority, and sometimes it's not as straightforward as we all would like it to be.  Here I will describe how to create an "almost" highly-available NAT server.

NOTE:  This configuration is not 100% highly available.  If you only have one NAT instance you will have downtime until a replacement NAT instance is in service.  For my use this was acceptable, as the NAT was used for an email service: any outgoing emails would be queued while a replacement NAT was launched.  It takes about 3 minutes for a new NAT to be put into service.  That met our SLA and kept me from being woken up in the middle of the night!

This configuration will restore services of a failed instance in approximately 3 minutes!

Amazon provides an example of how to configure NAT instances for high availability (see it here), but that configuration uses two NAT instances and only works if an instance is stopped and restarted.  The AWS example does not work for terminated instances!

When you create your NAT instance using an auto-scaling group and launch configuration the newly created instance will have a new network interface (ENI).  You must then update the routing tables with the new ENI ID to direct traffic to the new NAT instance.  We can accomplish this by adding a few items in the launch configuration user data and properly configuring the roles assigned to the instance.

Here are the steps to follow:

1.  Create a new Role (see example below).  Give it a useful name like:  NAT-update-route-table
  The role must grant DescribeNetworkInterfaces and ModifyNetworkInterfaceAttribute for all resources.  This is because we don't have an ARN for the newly launched instance.

  This role must also be allowed to modify the route table that is being used by your subnets.  The actions to allow are CreateRoute and ReplaceRoute.  These can be restricted to our specific route tables using their ARNs.

2.  Create a Launch Configuration

   For the launch configuration:
  Select an AMI to use for your NAT.  I recommend using Amazon's community AMI for a NAT; search the AMIs for "amzn-ami-vpc-nat".
  Assign the IAM Role created in the step above
  Assign the appropriate security groups, instance type, etc

  Finally, most importantly provide the User Data which will configure and update the route tables with the new instance ENI.  See full user data below.
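For those scripting this step, a launch configuration with these settings could be created from the AWS CLI roughly as follows. It is a sketch: the function name, the t2.micro instance type and the choice to request a public IP address are assumptions, and the user data is read from a local file.

```shell
# Create a launch configuration for the NAT instance.
create_nat_launch_config() {
  local name="$1" ami_id="$2" profile_name="$3" sg_id="$4" userdata_file="$5"

  aws autoscaling create-launch-configuration \
    --launch-configuration-name "${name}" \
    --image-id "${ami_id}" \
    --instance-type t2.micro \
    --iam-instance-profile "${profile_name}" \
    --security-groups "${sg_id}" \
    --associate-public-ip-address \
    --user-data "file://${userdata_file}"
}

# Example call (not executed here):
#   create_nat_launch_config nat-lc ami-xxxxxxxx NAT-update-route-table \
#     sg-xxxxxxxx nat-userdata.sh
```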

I will walk through each step of the user data to explain what it does.  This example is for an Amazon Linux NAT instance, therefore we begin our script with:  #!/bin/bash

First step is to enable IP forwarding.

echo 1 > /proc/sys/net/ipv4/ip_forward

Next we must obtain the instance ID.  We can get this from the meta-data provided by Amazon using this URL:

We set the variable, my_instance_id:

my_instance_id=$(curl -s
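As a minimal sketch of this lookup, assuming the standard EC2 instance metadata endpoint at 169.254.169.254 and an instance that still accepts unauthenticated (IMDSv1-style) metadata requests, the instance ID can be fetched like this. The function name is hypothetical.

```shell
# Ask the instance metadata service for this instance's ID.
get_instance_id() {
  # 169.254.169.254 is the link-local metadata address on EC2.
  curl -s http://169.254.169.254/latest/meta-data/instance-id
}

# Example use inside user data (not executed here):
#   my_instance_id=$(get_instance_id)
```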

Next, we must obtain the ID of the network interface.  With the instance ID we can get the ENI ID by running this command, which stores it in the variable my_eni_id (be sure to modify this to your region):

my_eni_id=$(aws ec2 describe-network-interfaces --region us-east-1 --filters Name=attachment.instance-id,Values=${my_instance_id} Name=attachment.device-index,Values=0 --output text | grep NETWORKINTERFACES | cut -f5)

We can now update our Route Table so that the default route (0.0.0.0/0) points at the new network interface ID (be sure to modify this to match your Route Table ID and region):

aws ec2 replace-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 0.0.0.0/0 --network-interface-id ${my_eni_id} --region us-east-1

And finally, we disable the source/destination check on the network interface so that the instance can work properly as a NAT device.

aws ec2 modify-network-interface-attribute --network-interface-id ${my_eni_id} --no-source-dest-check --region us-east-1

3.  Last, create the auto-scaling group utilizing the launch configuration.  The auto-scaling group should be configured as:
Desired = 1
Min = 1
Max = 1

  In the event your NAT instance is terminated, the auto scaling group will launch a new instance and update the route table with the new ENI ID.
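Created from the AWS CLI, an auto-scaling group with these settings could look like the sketch below. The function name, group name and subnet ID are placeholders; the subnet should be the public subnet where the NAT lives.

```shell
# Create an auto-scaling group that always keeps exactly one NAT running.
create_nat_asg() {
  local name="$1" launch_config="$2" subnet_id="$3"

  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "${name}" \
    --launch-configuration-name "${launch_config}" \
    --min-size 1 --max-size 1 --desired-capacity 1 \
    --vpc-zone-identifier "${subnet_id}"
}

# Example call (not executed here):
#   create_nat_asg nat-asg nat-lc subnet-xxxxxxxx
```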

There is one missing component to this setup, and that is creating a Status Check Alarm.  The alarm should be configured to terminate the instance when it fails a status check.  When the instance is terminated the auto-scaling group will launch a new instance.

(I have not yet created the code to create a new Status Check Alarm; this should be easily accomplished in the User Data.  I will hopefully find time to add it to this post later.)
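One possible shape for that alarm, as an untested sketch rather than a finished solution: a CloudWatch alarm on the StatusCheckFailed metric whose action is the built-in EC2 terminate action. The function name, alarm name, period and number of evaluation periods are assumptions, and the instance role would additionally need permission to create CloudWatch alarms, which is not covered by the role described in this post.

```shell
# Create an alarm that terminates the instance after failed status checks.
create_status_check_alarm() {
  local instance_id="$1" region="$2"

  aws cloudwatch put-metric-alarm \
    --region "${region}" \
    --alarm-name "nat-status-check-${instance_id}" \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "arn:aws:automate:${region}:ec2:terminate"
}

# Example call from user data (not executed here):
#   create_status_check_alarm "${my_instance_id}" us-east-1
```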

Here is the complete IAM Role Policy
(change the arn for the region and your route table)
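A minimal sketch of a policy matching the description in step 1, printed by a shell function so the region, account ID and route table ID can be filled in. The function name and the example values are placeholders; the two network interface actions use all resources, while the route actions are limited to one route table ARN.

```shell
# Print the role policy for the self-healing NAT instance.
nat_role_policy() {
  local region="$1" account_id="$2" route_table_id="$3"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:ModifyNetworkInterfaceAttribute"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["ec2:CreateRoute", "ec2:ReplaceRoute"],
      "Resource": "arn:aws:ec2:${region}:${account_id}:route-table/${route_table_id}"
    }
  ]
}
EOF
}

# Example call (not executed here):
#   nat_role_policy us-east-1 123456789012 rtb-xxxxxxxx > nat-policy.json
```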

Here is the complete User Data to add to the launch configuration
(change the region and the route-table-id to match your environment)
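Putting the steps from the walk-through together, the user data could be organised as in the sketch below. It makes a few assumptions: the standard metadata endpoint at 169.254.169.254, a default route of 0.0.0.0/0, and the CLI's `--query` option in place of the grep and cut pipeline. The function names, REGION and ROUTE_TABLE_ID are placeholders.

```shell
#!/bin/bash
# Sketch of NAT user data; REGION and ROUTE_TABLE_ID are placeholders.
REGION="us-east-1"
ROUTE_TABLE_ID="rtb-xxxxxxxx"

# Let the kernel forward packets on behalf of other hosts.
enable_ip_forwarding() {
  echo 1 > /proc/sys/net/ipv4/ip_forward
}

# Instance ID from the instance metadata service.
lookup_instance_id() {
  curl -s http://169.254.169.254/latest/meta-data/instance-id
}

# ID of the primary network interface (device index 0) of an instance.
lookup_eni_id() {
  local instance_id="$1"
  aws ec2 describe-network-interfaces --region "${REGION}" \
    --filters "Name=attachment.instance-id,Values=${instance_id}" \
              "Name=attachment.device-index,Values=0" \
    --query 'NetworkInterfaces[0].NetworkInterfaceId' --output text
}

# Point the default route of the route table at the given interface.
update_route() {
  local eni_id="$1"
  aws ec2 replace-route --region "${REGION}" \
    --route-table-id "${ROUTE_TABLE_ID}" \
    --destination-cidr-block 0.0.0.0/0 \
    --network-interface-id "${eni_id}"
}

# Disable the source/destination check, which a NAT device requires.
disable_source_dest_check() {
  local eni_id="$1"
  aws ec2 modify-network-interface-attribute --region "${REGION}" \
    --network-interface-id "${eni_id}" --no-source-dest-check
}

# Call sequence to append when using this as user data (not executed here):
#   enable_ip_forwarding
#   my_instance_id=$(lookup_instance_id)
#   my_eni_id=$(lookup_eni_id "${my_instance_id}")
#   update_route "${my_eni_id}"
#   disable_source_dest_check "${my_eni_id}"
```

Keeping each step in its own function makes it easier to test a single step by hand on a running NAT instance before baking the script into the launch configuration.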