Saturday, September 12, 2015

AWS - Auto join EC2 Windows instance to Active Directory Domain


Some environments will require you to join your Windows servers to a domain.  The following will show the steps taken to automatically join a server to a Windows domain.  This assumes the following:
   An existing AWS VPC with access to an S3 bucket
   New instances are able to communicate with a domain controller

NOTE:  Amazon does offer its Directory Service with AD Connector, which will connect your VPC to your Active Directory, but this post shows how you can do so without the AD Connector.

The steps:

  1. Create a PowerShell script to join a server to the domain
  2. Help protect the credentials by converting the PowerShell script to an executable (.exe) using PS2exe
  3. Create an S3 bucket and upload the exe file
  4. Create an IAM role with a policy to allow Read access to the S3 bucket
  5. Launch a new instance, assigning the IAM role and providing User Data which will run the required scripts at first launch

1. Create the PowerShell script

The PowerShell script will join the server to the domain. We will use the Add-Computer cmdlet and a user account that has permission to join computers to the domain. In the script, modify the username, password, and DomainName for your environment.
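As a minimal sketch of what such a script might contain, adapted from the author's revised version in the comments below: the values `domain\username`, `password12345`, and `domain.local` are placeholders, and the log path `C:\temp\error-joindomain.txt` follows the comments. Treat it as an illustration under those assumptions rather than the original listing.

```powershell
# JoinDomain.ps1 - minimal sketch; every value below is a placeholder to replace.
# Single quotes keep characters such as $ in the password from being expanded.
$username   = 'domain\username'   # placeholder: account allowed to join computers
$password   = 'password12345' | ConvertTo-SecureString -AsPlainText -Force
$domainName = 'domain.local'      # placeholder: your Active Directory domain

# Build the credential object that Add-Computer expects
$cred = New-Object System.Management.Automation.PSCredential -ArgumentList $username, $password

try {
    # Join the domain and reboot; -ErrorAction Stop makes failures catchable
    Add-Computer -DomainName $domainName -Credential $cred -Force -Restart -ErrorAction Stop
}
catch {
    # Record the failure so it can be inspected after launch
    New-Item -Path 'C:\temp' -ItemType Directory -Force | Out-Null
    $_.Exception | Out-File -FilePath 'C:\temp\error-joindomain.txt' -Append
}
```

If the join fails, the exception text ends up in the log file instead of being lost when the script runs unattended at first boot.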


Save the file as JoinDomain.ps1

2. Convert the PowerShell script to an executable file

To help secure the credentials we will convert the PowerShell script using PS2exe to an executable file. Download PS2exe from: PS2exe download

Extract the zip file to a folder and then run PS2exe.ps1 on the JoinDomain.ps1 script to convert it to an exe file. From a PowerShell prompt in that folder, run the following:

c:\> .\ps2exe.ps1 -inputfile JoinDomain.ps1 JoinDomain.exe

This will create the JoinDomain.exe file.

3. Create an S3 bucket and upload the exe file

Within the AWS console, create a new S3 bucket to store the JoinDomain.exe file.
For this example we will use examplebucket as the bucket name. You will need to use your own unique bucket name.

With the bucket created, we can upload the JoinDomain.exe file to the bucket.


4. Create an IAM role with a policy to allow Read only access to the S3 bucket

By creating an IAM role and assigning the role to the instance we can eliminate the need to use an IAM user account with access keys.  IAM roles utilize temporary credentials to grant access.

Create an IAM role in the AWS console, and select Role Type: AWS Service Roles > Amazon EC2


Follow the prompts through, clicking next until the Role is finally created. With the role created, we must now create a new Inline policy which will grant access to the S3 bucket.

Select the newly created Role and expand the 'Inline Policies' to create a new policy:


Choose the option to create a Custom Policy:



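For the policy, we grant ListBucket and GetObject restricted to the S3 bucket. ListBucket applies to the bucket itself and GetObject to the objects inside it, so the Resource lists both ARNs. Here is the policy; modify the bucket name for your environment:


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*"
      ]
    }
  ]
}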
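Because the bucket name appears in the ARNs, it can be convenient to generate the policy text instead of editing it by hand. The following is a minimal sketch, assuming a hypothetical helper named New-BucketReadPolicy (it is not part of any AWS tooling). It scopes s3:ListBucket to the bucket ARN and s3:GetObject to the object ARNs, which is how S3 matches those two actions.

```powershell
# Hypothetical helper: build the read-only policy JSON for a given bucket name
function New-BucketReadPolicy {
    param([Parameter(Mandatory)][string]$BucketName)

    $policy = [ordered]@{
        Version   = '2012-10-17'
        Statement = @(
            [ordered]@{
                Effect   = 'Allow'
                Action   = 's3:ListBucket'
                Resource = "arn:aws:s3:::${BucketName}"      # the bucket itself
            },
            [ordered]@{
                Effect   = 'Allow'
                Action   = 's3:GetObject'
                Resource = "arn:aws:s3:::${BucketName}/*"    # objects in the bucket
            }
        )
    }

    # -Depth 5 so the nested statements are fully serialized
    $policy | ConvertTo-Json -Depth 5
}

# Print the policy for the example bucket; paste the output into the inline policy editor
New-BucketReadPolicy -BucketName 'examplebucket'
```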



5.  Launch a new instance

Launch a new instance into the VPC.  We need to attach the IAM Role to the instance as well as configure the Advanced > User Data.

The User Data is used to run scripts when the instance is first launched.  For our example, we will be downloading the JoinDomain.exe file from S3 and finally executing it.

First assign the IAM Role to the instance.

Next, expand the Advanced Details to show the User Data field.  User Data lets you run scripts during the initial startup of a newly launched instance. For our case, we provide PowerShell commands that download the exe file and execute it to join the domain.  Here is the UserData to include; modify the S3 bucket name for your environment:


<powershell>
Set-ExecutionPolicy unrestricted -Force
New-Item c:/temp -ItemType Directory -Force
set-location c:/temp
read-s3object -bucketname examplebucket -key JoinDomain.exe -file JoinDomain.exe
Invoke-Item C:/temp/JoinDomain.exe
</powershell>

Here's what it looks like in the AWS Console:




Follow the remaining steps to complete launching of the instance. The instance will launch, download the exe, execute it and restart.

28 comments:

Imran Syed said...

This script is not working for me. It copies the .exe file from S3 to the server, but it looks like it never kicks it off. It doesn't create any log files either. Any suggestions?

Ryan Lawyer said...

Imran I made some updates to the script to help resolve your issue.

I noticed there are times when the metadata is not available from AWS, so I added a loop until it is available.

I also changed the process to rename and join the server to the domain. Separating the steps out fixes the issue of random errors reporting the directory service is busy.

Here is a revised script:
#Retrieve the AWS instance ID, keep trying until the metadata is available
$instanceID = "null"
while ($instanceID -NotLike "i-*") {
Start-Sleep -s 3
$instanceID = invoke-restmethod -uri http://169.254.169.254/latest/meta-data/instance-id
}

$username = "domain\username"
$password = "password12345" | ConvertTo-SecureString -AsPlainText -Force
$cred = New-Object -typename System.Management.Automation.PSCredential($username, $password)

Try {
Rename-Computer -NewName $instanceID -Force
Start-Sleep -s 5
Add-Computer -DomainName domain.local -OUPath "OU=YourOU,DC=domain,DC=local" -Options JoinWithNewName,AccountCreate -Credential $cred -Force -Restart -erroraction 'stop'
}
Catch{
echo $_.Exception | Out-File c:\temp\error-joindomain.txt -Append
}

Rafiq Islam said...

Hi Ryan,

I followed your modified script but it did not do anything. I could not even find anything under c:/AWSJoinDomain/ or c:/AWSJoinDomain/TempLogs/ folder. Here is the powershell script that I used in the 'User Data' section.
-----------------------------------------------------------------------

Set-ExecutionPolicy unrestricted -Force
New-Item c:/AWSJoinDomain -ItemType Directory -Force
New-Item c:/AWSJoinDomain/TempLogs -ItemType Directory -Force
set-location c:/AWSJoinDomain
read-s3object -bucketname adautojoiner -key ADAutoJoiner64.exe -file ADAutoJoiner64.exe
Invoke-Item c:/AWSJoinDomain/JoinDomain.exe

-----------------------------------------------------------------------

When I copy ADAutoJoiner64.exe to c:/AWSJoinDomain/ and run it manually on the server, it does the job. That means the actual script is OK.

Any help is greatly appreciated.

Thanks -Mohammed

Ryan Lawyer said...

Rafiq,
Does the server have access to S3 and able to download the file?
It also appears the 'Invoke-Item' line is referencing the wrong file, it should be calling c:/AWSJoinDomain/ADAutoJoiner64.exe vs JoinDomain.exe

It may help to look at the logs for the EC2 service, which are located at: C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt

Hope that helps.
Thanks
Ryan

Rafiq Islam said...

Hi Ryan,

Thanks for looking in to this. Yes, sorry, that was my mistake. I corrected everything and analyzed the logs. Two different issues are now in two different scenarios:

Scenario#1: Using Amazon's 'Microsoft Windows Server 2012 R2 Base - ami-3586ac5f':
Scripts in the 'User Data' run, files are copied, ADAutoJoiner64.exe runs, computer is renamed but cannot join the computer to the domain since the ADAutoJoiner64.exe needs .NET 3.5 which is absent.

Scenario#2: Using our custom AMI:
Scripts in the 'User Data' do not run. So, nothing happens. Probable reason - 'Plugin Ec2HandleUserData is disabled'. This is what the logs show.

So, my understanding is this is what happens:
1. 'Ec2HandleUserData' is enabled at the first instance (created from Amazon AMI) first booting time but it is disabled before the instance is ready.
2. Whatever AMI we create from this instance, the 'Ec2HandleUserData' remains disabled.

So, if we cannot use our own AMI, the .NET is absent and the ADAutoJoiner64.exe cannot run.

BTW, c:/AWSJoinDomain/TempLogs has nothing.

Any help is greatly appreciated.

Regards -Rafiq

Ryan Lawyer said...

Rafiq,
I am using the same AWS AMI and it works with no additional configuration. Does the user account that you are using to join the domain have proper permissions to add computers?

The AWS AMI has .NET 4.5 installed by default. If you are creating your own custom AMI and disabling .NET that could cause an issue, I'd have to test that scenario. Also, if you are creating a custom AMI you will probably need to specify to 'Shutdown with Sysprep'.

Is there anything in the log file, per your script, c:\temp\error-joindomain.txt ?
There may be a log generated in the event log as well.

Ryan

Rafiq Islam said...

Hi Ryan,

I compiled the script using '-x64 -runtime30'. And, I did not deinstall .NET 4.5 but when I go to add server role on the AWS AMI created instance, I do not see that .NET 4.5 is installed.

The account has the right permission since when I use it with the pure powershell script in the User Data, then it works.

'Shutdown with Sysprep' will not work for us because we want to retain the local administrator password which is not retained in Windows 2008 and higher.

In order to make the script work a second time, it needed 'UserData Execution for the next service start' enabled in the EC2 Service Properties (C:\Program Files\Amazon\Ec2ConfigService\Ec2ConfigServiceSettings.exe) before shutting the instance down to create a custom AMI. The problem was that, if we start this master instance again for any modification, it is joined to the domain by the UserData scripts copied in it. To overcome that, I placed a check to confirm that this is not the master instance before trying to join the domain.

If the executable does not run because of the absence of .Net, then c:\temp\error-joindomain.txt would not have anything. On the other hand, "echo $_.Exception | Out-File ..." did not work but "$_.Exception | Out-File ..." did.

If you find anything about the .NET issue, please let us know.

Thanks -Rafiq

Ryan Lawyer said...

Rafiq,
I normally compile my script without the '-x64 -runtime30' options. I just tested it compiling this way and it still worked for me, using AMI Windows_Server-2012-R2_RTM-English-64Bit-Base-2016.02.10 (ami-3586ac5f)

Have you been able to test this using an AMI from AWS that hasn't been customized?
I'm curious if the AMI creation process that you are using is causing some sort of issue.

Can you share both the complete UserData as well as the Powershell script you are compiling? Please replace any private data with obscure data instead.

Ryan

Azhagiri Panneerselvam said...

Hi Ryan Lawyer ,

I have used your script for my scenario but had no luck.

Scenario:
I have two Windows instances behind an ELB, both joined to AD, and I am using Auto Scaling. When a new machine comes up, it should join AD automatically.

What I tried:
I tested your script manually: I launched a new instance with the script provided in the User Data box, but the machine was not added to AD.
The script I tried:

$username = "example\username"
$password = "Password" | ConvertTo-SecureString -AsPlainText -Force
$cred = New-Object -typename System.Management.Automation.PSCredential($username, $password)
Try {
Add-Computer -DomainName example.com -OUPath "OU=Computers,DC=babajob,DC=com" -Options JoinWithNewName,AccountCreate -Credential $cred -Force -Restart -erroraction 'stop'
}
Catch{
echo $_.Exception | Out-File c:\temp\error-joindomain.txt -Append
}


My expectation:

I want the new server to be added as fast as possible to the AD while launching. Kindly help me

Ryan Lawyer said...

Azhagiri,
Are the DNS settings for your instances pointing to the DNS servers for your domain or to AWS? If they point to the AWS default (.2), then you need to configure the DHCP option set for your VPC, or you can add this command to the beginning of your script, modifying the IPs to reflect your DNS servers:

Set-DnsClientServerAddress -InterfaceIndex 12 -ServerAddresses ("10.0.0.10","10.0.0.11")


Also, please note that putting the script with the password as plaintext in the UserData is not secure. Anyone with access to the server can retrieve this information via the metadata. I'd highly recommend not using this approach.

Thank you
Ryan

Mike Fernandez said...

Hey Ryan

I am testing this out for our company and ran into a few issues which I have conquered most of them except joining the domain. I get the following error:

Computer 'WIN-M339MVHD83P' failed to join domain 'domain.local' from its current workgroup 'WORKGROUP' with following error message: The system cannot find the file specified.

I don't know what file it would be looking for. I followed every step listed.

Mike

Mike Fernandez said...

Hey Ryan

I followed all of the steps listed above. Got the following error:

Computer 'WIN-M339MVHD83P' failed to join domain 'domain.local' from its current workgroup 'WORKGROUP' with following error message: The system cannot find the file specified.

What file is it looking for?

Mike

Ryan Lawyer said...

I have not seen that error either.
It could be the Catch statement with the Out-File having a problem.
Can you run the powershell on the machine manually, and see if you get any errors? Remove the Try/Catch so you can see the error.

justin shroyer said...

This worked perfectly for me. As a newb, I will say the fact that I had a $ in the password appeared to be the primary issue. I kept getting a "username or password incorrect" in my error output file, and the script in ISE was showing different colors during and after the $ within the quotation marks. I didn't want to alter my admin password, but once I swapped users with a password that contained normal characters and made it admin, everything went almost perfectly.

The only other issue I had appeared to be with the OUpath option. I got an error about not being able to find the file. I attempted to just run the "Add-computer" command and kept getting that error. Once I removed -OUpath, it worked for me. I'm in 2012 R2, so I'm not sure if that had anything to do with it, may have also have been my OU choice, but since I wanted it in the Computers OU anyway, I just left it blank (for now).

Regardless, tested it with auto-scaling successfully. I'd like to add more to the script eventually but this is a fantastic starting point for me. Thanks for the write up.
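The `$` behavior described in this comment is consistent with PowerShell's variable expansion inside double-quoted strings, where `$` followed by a name is replaced by that variable's value (empty if the variable is undefined). A small illustration with a made-up password; the variable names are only for this example:

```powershell
# Made-up password containing a dollar sign (for illustration only)
$expanded = "P@ss$word1"    # double quotes: $word1 is read as an (undefined) variable
$literal  = 'P@ss$word1'    # single quotes: the text is kept exactly as typed
$escaped  = "P@ss`$word1"   # a backtick keeps the $ literal inside double quotes

# Show what each form actually contains
"Double-quoted: $expanded"
"Single-quoted: $literal"
"Escaped:       $escaped"
```

Using single quotes around the password in the join script avoids the problem without changing the account's password.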

ramz said...

I'm sure you had a valid reason for not doing so, but I'm not sure I understand why you just didn't put the contents of the JoinDomain.PS1 script into the UserData for the launched instance. Would you mind explaining that? Thanks!

Ryan Lawyer said...

ramz,
You can put the script into the UserData, but I wanted a process that would not allow the password to be retrieved. UserData may be retrieved from the instance or the console and it is not protected by cryptographic methods. Therefore it is advised to not store passwords within the UserData.

ramz said...

I don't disagree with your logic, but a simple hex editor/decompiler is all anyone needs to retrieve that password. We got around this by creating an AD account whose sole permission is to add/remove systems from the directory in a specific OU. Not 100% bullet proof, but it's something.

Ryan Lawyer said...

True, someone could potentially decompile it, but they would need access to the exe file to do that, vs simply retrieving the password from plain text in the UserData.
I do like your suggestion of restricting the AD account. This would add another layer of security. Thanks for providing the input!

Anonymous said...

Ryan, for the most part this works well except, after joining the domain, I have to reboot the server twice. As you know, the script itself reboots it once but then I am not able to log onto the server without rebooting it again. Any ideas?

Ryan Lawyer said...

What error do you receive when you try to logon and it fails?
Have you confirmed it is indeed restarting the server via the script? You should be able to review the logs within the AWS console: Actions>Instance Settings>Get System Log and notice the restart event or possibly any errors.

Anonymous said...

@Ryan - It looks like RDP is not responding at all on that server. The System Log from AWS does not show anything out of ordinary. Telnet to 3389 to the server is failing. However, after I reboot again, everything comes back normal.

Ryan Lawyer said...

When you say reboot, I'll assume you are initiating the reboot from AWS console or CLI since you're not able to RDP to the server?
It sounds like something is preventing the restart. What OS are you using?
Can you look at the EC2Config log, C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt?
You should be able to see in there whether and when it restarts the server, something like this:
2016-06-29T19:31:19.568Z: Background plugin complete: Ec2HandleUserData
2016-06-29T19:31:19.568Z: After ready plugins complete.
2016-06-29T19:31:19.568Z: Main configuration starting...
2016-06-29T19:31:19.599Z: Main configuration started.
2016-06-29T19:31:35.505Z: Ec2ConfigService stopping...
2016-06-29T19:31:35.505Z: stopping Main configuration
2016-06-29T19:31:35.505Z: stopping Legacy configuration

Is there an exception caught from the script? c:\temp\error-joindomain.txt

Anonymous said...

@Ryan - The powershell script itself has a restart command that reboots the box after the domain join is completed. Based on the console log I can see that the reboot is successful and I can also see from the "Get Instance Screenshot" that the server is waiting for login. But it seems that RDP does not respond at all. Even telnet to port 3389 fails. Then when I do another reboot from AWS console, I can then log onto the box without any issues, I could see that the server was joined to the domain successfully and no error messages at all. It seems there is nothing wrong with your powershell script but what is causing me heartburn is the double reboot :(.

BTW, I really appreciate all your help. You have been super helpful :)

Ryan Lawyer said...

Are you using a custom AMI? Maybe try configuring the RDP service to Startup Delayed, as well as configuring the recovery to restart the service on failure?
If it's a default AWS AMI, can you provide which AMI you are using, I'd be happy to test.

Anonymous said...

Thanks for this. Made some modest changes to get this to work for me. I changed line 18 to:

echo $_.Exception | Out-File -FilePath "C:\Program Files\Amazon\Ec2ConfigService\Logs\error-joindomain.txt" -Append

I added the "-FilePath". Also decided to use an existing directory over creating "c:\temp"

Getting the -OUPath is critical. I'm connecting to a Microsoft AD and the path to "OU=Computers" is different.

Finally I added a "Restart-Computer -Force" at the end.
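On the -OUPath point: the value must be the full distinguished name of an OU that actually exists. In a standard AD domain the built-in Computers location is a container (CN=Computers), not an OU, so a path beginning with OU=Computers only resolves if an OU with that name was created; omitting -OUPath uses the domain's default location. A minimal sketch for building the path from the domain name, assuming a hypothetical helper ConvertTo-DomainDN and the placeholder OU name YourOU:

```powershell
# Hypothetical helper: turn a DNS domain name into its distinguished-name suffix
function ConvertTo-DomainDN {
    param([Parameter(Mandatory)][string]$DomainName)

    # 'domain.local' -> 'DC=domain,DC=local'
    ($DomainName.Split('.') | ForEach-Object { "DC=$_" }) -join ','
}

$domainDn = ConvertTo-DomainDN -DomainName 'domain.local'   # placeholder domain
$ouPath   = "OU=YourOU,$domainDn"                           # placeholder OU name

# Value to pass to Add-Computer -OUPath
$ouPath
```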

Ryan Lawyer said...

Thanks for the feedback.

I'm curious about the 'Restart-Computer -Force', as this shouldn't be required since the Add-Computer statement contains the option to -Restart.

Anonymous said...

Is anyone having luck running this against Windows Server 2016 instances?
I get the error below:

retrieving the com class factory for component with clsid failed due to the following error 800703fa

Anonymous said...

I am getting this error when I run the powershell while building the server.

"Showing a modal dialog box or form when the application is not running in UserInteractive mode is not a valid operation. Specify the ServiceNotification or DefaultDesktopOnly style to display a notification from a service application."