Engineering / August 31, 2016

3 Quick Tips For Faster AWS Scripting

Scripting is a quick way for developers and DevOps engineers to eliminate repetitive work and get predictable results. Start-up, update, and backup scripts are all common on Linux systems.

Scripts work well on a single host, but repeating an action on tens or hundreds of hosts is different territory, and fewer people know how to handle it.

At Eyeview, we use a few quick one-liners that make our work with AWS scripting a whole lot easier, and hopefully they can help you out too.

Setup

The main prerequisite for the commands below is password-less ssh, via RSA keys, set up on all your AWS hosts.

Once you have that, make sure the private key is available either on your box or on the bastion host from which you will access the AWS instances. For example, you can add your awskey_rsa file like this:

File: ~/.ssh/config

Host *
IdentityFile /home/user/.ssh/awskey_rsa
IdentityFile /home/user/.ssh/id_rsa
User awsuser
StrictHostKeyChecking no
UserKnownHostsFile /dev/null

You should now be able to ssh to one of your AWS instances without a password prompt:

ssh awsuser@aws-instance-ip-address
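Before fanning out to many hosts, it can help to confirm that key-based login really works without a prompt. Here is a minimal sketch, assuming OpenSSH's BatchMode and ConnectTimeout options; the function name can_ssh is hypothetical and not part of our tooling:

```shell
# can_ssh HOST: succeed only if a key-based login works with no prompt.
# BatchMode=yes makes ssh fail instead of asking for a password;
# ConnectTimeout=5 keeps an unreachable host from hanging the check.
can_ssh() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

For example, "can_ssh awsuser@aws-instance-ip-address && echo ok" prints ok only when the login succeeds non-interactively.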

The main tool we will use is the AWS CLI, which is available through most package managers.

Another very useful tool is GNU parallel, which allows you to run commands in parallel. There are other tools you can use to do parallel ssh such as pssh, parallel-ssh, etc., but here we’ll use parallel + ssh as the more widely available option.

Once you have all these in place, you can test the following quick tips for faster AWS scripting.

1. Batch uptime/load check

It can be very useful to check how loaded your boxes are, or how long they’ve been up. This information should already be tracked in your monitoring (e.g. Grafana), but sometimes you just want a quick check.

In the example below, we use the security group eyeview-bidder to select the instances we want to connect to, and then we grep for their PublicDnsName:

$ aws ec2 describe-instances --filter Name=instance.group-name,Values=eyeview-bidder|grep PublicDnsName|cut -f 2 -d :|cut -f 2 -d '"'|sort -u|parallel -j 1024 ssh {} uptime
16:01:17 up 131 days, 56 min, 0 users, load average: 0.71, 0.83, 0.85
16:01:17 up 111 days, 21:59, 0 users, load average: 0.64, 0.76, 0.81
16:01:17 up 111 days, 21:59, 0 users, load average: 0.53, 0.73, 0.76

Or, if you’re using EC2 Classic:

$ aws ec2 describe-instances --filter Name=group-name,Values=eyeview-bidder|grep PublicDnsName|cut -f 2 -d :|cut -f 2 -d '"'|sort -u|parallel -j 1024 ssh {} uptime

Note that by default, parallel runs as many tasks at once as there are CPU cores. In this case, our tasks are not CPU-bound, so we specify a high number (e.g. 1024 tasks) to run in parallel.
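The grep/cut chain above depends on how the CLI happens to lay out its JSON. As an alternative sketch, the AWS CLI's own --query and --output text options can pull out the host names directly (--filters is the documented spelling of the filter option). The function name group_hosts is hypothetical, and the extra instance-state-name filter is our assumption that you only want running instances:

```shell
# group_hosts GROUP: print one public DNS name per line for the running
# instances in security group GROUP.
group_hosts() {
  aws ec2 describe-instances \
    --filters "Name=instance.group-name,Values=$1" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].PublicDnsName' \
    --output text |
    # Text output is tab-separated: split it into lines, drop blank
    # entries (instances without a public name), and de-duplicate.
    tr '\t' '\n' | sed '/^$/d' | sort -u
}
```

It slots into the same pipeline as before: group_hosts eyeview-bidder | parallel -j 1024 ssh {} uptime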

2. Run any command

The above example can easily be extended to run pretty much anything on a set of AWS boxes. Let’s say we want to update a config file on a batch of boxes:

$ aws ec2 describe-instances --filter Name=instance.group-name,Values=eyeview-bidder|grep PublicDnsName|cut -f 2 -d :|cut -f 2 -d '"'|sort -u|parallel -j 1024 "scp eyeview.properties {}:/tmp/"

And then restart the service:

$ aws ec2 describe-instances --filter Name=instance.group-name,Values=eyeview-bidder|grep PublicDnsName|cut -f 2 -d :|cut -f 2 -d '"'|sort -u|parallel -j 1024 "ssh {} 'sudo service bidder restart'"

You can then build aliases and use them in scripts to make this even easier.
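As one possible shape for such a helper, the sketch below wraps the host listing and the parallel ssh call in two shell functions. The names sg_hosts and on_group are hypothetical, the blank-line filter is our addition, and --tag is a GNU parallel option that prefixes each output line with the host it came from. Commands with nested quoting may need extra care, so treat this as a starting point rather than a finished tool:

```shell
# sg_hosts GROUP: the host-listing pipeline from the one-liners above,
# with blank names (instances that have no public DNS name) removed.
sg_hosts() {
  aws ec2 describe-instances --filters "Name=instance.group-name,Values=$1" |
    grep PublicDnsName | cut -f 2 -d : | cut -f 2 -d '"' |
    sed '/^$/d' | sort -u
}

# on_group GROUP COMMAND...: run COMMAND over ssh on every host in GROUP.
on_group() {
  group=$1
  shift
  sg_hosts "$group" | parallel --tag -j 1024 ssh {} "$@"
}
```

With these defined, the uptime check becomes "on_group eyeview-bidder uptime" and the restart becomes "on_group eyeview-bidder sudo service bidder restart".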

3. Batch operations on S3

Another common pain point when scripting against AWS is dealing with a large number of files in S3. There are many sophisticated tools, such as Spark, that can be used for advanced processing. Sometimes, however, you just want to do something quick or scripted.

Say you want to download CloudTrail logs for a day. You can do this with the AWS CLI too, but we tend to use s3cmd for this.

$ s3cmd ls s3://cloudtrail-bucket/AWSLogs/accountnumber/CloudTrail/us-east-1/2016/07/05/|grep -o "s3.*"| parallel "s3cmd get {}"

You can easily combine this with the actual processing of the files, building command pipes that produce the final results you want, so you don’t have to store all the files on disk.
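Here is a rough sketch of such a pipe. It swaps s3cmd for the AWS CLI, because "aws s3 cp SOURCE -" streams an object to stdout. The function names are hypothetical, and the parsing assumes that CloudTrail log files are gzipped, compact JSON in which each record carries an "eventName" field with no space after the colon:

```shell
# event_counts: read decompressed CloudTrail JSON on stdin and print a
# count per eventName, most frequent first.
event_counts() {
  grep -o '"eventName":"[^"]*"' | cut -f 4 -d '"' | sort | uniq -c | sort -rn
}

# day_event_counts PREFIX: stream every .json.gz object under the S3 PREFIX
# through gunzip and summarize it, without writing the logs to disk.
# PREFIX must end with a slash, because object names are appended to it.
day_event_counts() {
  aws s3 ls "$1" | grep -o '[^ ]*\.json\.gz$' |
    parallel "aws s3 cp '$1'{} - | gunzip -c" | event_counts
}
```

For the day used above, that would be: day_event_counts s3://cloudtrail-bucket/AWSLogs/accountnumber/CloudTrail/us-east-1/2016/07/05/

By default parallel groups the output of each job, so the contents of different log files are not interleaved before they reach event_counts.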

At Eyeview, we use thousands of AWS EC2 instances and terabytes of S3 storage. Being able to quickly run parallel operations on them helps us in both day-to-day tasks and emergency situations, so we wanted to share these simple one-liners in case they can help anyone else.

Check out our Technology page to learn more about how our technology works.

 
Naoum Naoumov, Decisioning Team Lead
