Monthly Archives: May 2017

AWS CLI Queries and jq

Anyone who’s worked with the AWS CLI/API knows what a joy it is. Who hasn’t gotten API-throttled?  Woot!  Well, anyway, at work we’re using Cloudhealth to enforce AWS tagging to keep costs under control; all servers must be tagged with an owner: and an expires: date or else they get stopped or, after some time, terminated.   Unfortunately Cloudhealth doesn’t understand Cloudformation stacks, so it leaves stacks around after having ripped the instances out of them.  We also have dozens of developers starting instances and CloudFormation stacks every day.  Sometimes they clean up after themselves, but often they don’t (and they like to set that expires: tag way in the future, which is a management problem not a technology problem).  Sometimes we hit AWS quotas in our dev environment and an irritated engineer goes and “cleans up” a bunch of stuff – often not in the way that they should (like by terminating instances and not stacks).

So, I was throwing together a quick bash script to find some of the resulting exceptions and orphans in our environment.  Unattached EBS volumes.  CloudFormation stacks where someone terminated the EC2 instance and figured they were done, instead of actually deleting the stack. Stuff like that. This got into some advanced AWS CLI-fu and use of jq, both of which are snazzy enough I thought I’d share.

Here’s my script, which I’ll explain. You will need the AWS CLI installed (on OSX El Capitan, “brew install awscli” or “pip install –ignoreinstalled six –upgrade –user awscli” – the –ignoreinstalled six works around an El Capitan problem).  You need “aws config” configured with your creds and region and such, and set output to json.  And you need to install jq, “brew install jq.”

#!/usr/bin/env bash
# This script finds problematic CloudFormation stacks and EC2 instances in the AWS account/region your credentials point at.
# It finds CF stacks with missing/terminated and stopped EC2 hosts.  It finds EC2 hosts with missing owner and expires tags.
# It finds unattached volumes. Should you delete them all?  Probably. Kill the EC2 instances first because it'll probably
# make more orphan CF stacks.


echo "Finding misconfigured AWS assets, stand by..."
for STACK in $(aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE --max-items 1000 | jq -r '.StackSummaries[].StackName')
        INSTANCE=$(aws cloudformation describe-stack-resources --stack-name $STACK | jq -r '.StackResources[] | select (.ResourceType=="AWS::EC2::Instance")|.PhysicalResourceId')
        if [[ ! -z $INSTANCE  ]]; then
                STATUS=$(aws ec2 describe-instance-status --include-all-instances --instance-ids $INSTANCE 2> /dev/null | jq -r '.InstanceStatuses[].InstanceState.Name') 
                if [[ -z $STATUS  ]]; then
                        BADSTACKS="${BADSTACKS:+$BADSTACKS }$STACK"
                elif [[ ${STATUS} == "stopped" ]]; then

echo "CloudFormation stacks with missing EC2 instances: (aws cloudformation delete-stack --stack-name)"

echo "CloudFormation stacks with stopped EC2 instances: (aws cloudformation delete-stack --stack-name)"

echo "EC2 instances without owner tag: (aws ec2 terminate-instances --instance-ids)"
aws ec2 describe-instances --query "Reservations[].Instances[].{ID: InstanceId, Tag: Tags[].Key}" --output json | jq -c '.[]' | grep -vi owner | jq -r '.ID' | awk -v ORS=' ' '{ print $1  }' | sed 's/ $//'

echo "EC2 instances without expires tag: (aws ec2 terminate-instances --instance-ids)"
aws ec2 describe-instances --query "Reservations[].Instances[].{ID: InstanceId, Tag: Tags[].Key}" --output json | jq -c '.[]' | grep -vi expires | jq -r '.ID' | awk -v ORS=' ' '{ print $1   }' | sed 's/ $//'

echo "Unattached EBS volumes: (aws ec2 delete-volume --volume-id)"
aws ec2 describe-volumes --query 'Volumes[?State==`available`].{ID: VolumeId, State: State}' --output json | jq -c '.[]' | jq -r '.ID' | awk -v ORS=' ' '{ print $1  }' | sed 's/ $//'


The AWS cli, of course, lets you manipulate your AWS account from the command line.  jq is a command line JSON parser.

Let’s look at where I rummage through my CloudFormation stacks looking for missing servers.

aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE --max-items 1000 | jq -r '.StackSummaries[].StackName'

Every separate aws subsection works a little different.  aws cloudformation lets you filter on status, and CREATE_COMPLETE and UPDATE_COMPLETE are the “good” statuses – valid stacks not in flight right now.  The CLI likes to jack with you by limiting how many responses it gives back, which is super not useful, so we set “–max-items 1000” as an arbitrarily large number to get them all.  This gives us a big ol’ JSON output of all the cloudformation stacks.

    "StackSummaries": [
            "StackId": "arn:aws:cloudformation:us-east-1:12345689:stack/mystack/1e8f2ba0-4247-11e7-aad1-500c28601499", 
            "StackName": "mystack", 
            "CreationTime": "2017-05-26T19:11:28.557Z", 
            "StackStatus": "CREATE_COMPLETE", 
            "TemplateDescription": "USM Elastic Search Node"

Now we pipe it through jq.

jq -r '.StackSummaries[].StackName'

This says to just output in plain text (-r) the StackName of each stack. You use that dot notation to traverse down the JSON structure.  So now we have a big ol’ list of stacks.

For each stack, we have to go find any EC2 instances in it and check their status. So this time we use a select inside our jq call, to find only items whose resource type is “AWS::EC2::Instance”.

aws cloudformation describe-stack-resources --stack-name $STACK | jq -r '.StackResources[] | select (.ResourceType=="AWS::EC2::Instance")|.PhysicalResourceId')

And then for each of those instances, we get their status, which is in the InstanceState.Name field.

aws ec2 describe-instance-status --include-all-instances --instance-ids $INSTANCE 2> /dev/null | jq -r '.InstanceStatuses[].InstanceState.Name'

That works.  But there’s more than one way to do it!  The AWS CLI commands support a “–query” parameter – which lets you specify a JSON search string that happens on the AWS end, so you have to do less parsing on your end!

To find instances without the owner tag,

aws ec2 describe-instances --query "Reservations[].Instances[].{ID: InstanceId, Tag: Tags[].Key}" --output json | jq -c '.[]' | grep -vi owner | jq -r '.ID' | awk -v ORS=' ' '{ print $1  }' | sed 's/ $//'

What this does is look under Reservations.Instances and basically outputs me a new JSON with just the ID and tags in it.  “jq -c ‘.[]'” just crunches each one into a one-liner.   I grep out the ones without an owner, turn them into one line with awk, and strip the training space at the end from the awk with a sed (ah, UNIX string manipulation).

With this, you can choose what to put into the –query and what to do after in jq.  The –query is fast and cuts down your result set, so you run less risk of magically missing resources because AWS decided there were too many to tell you about.

You can do filters in the query – so for example, when I do the volumes, instead of doing what I did for the tags using grep, I can instead just do:

aws ec2 describe-volumes --query 'Volumes[?State==`available`].{ID: VolumeId, State: State}' --output json | jq -c '.[]'

Yes, those are backticks, don’t blame the messenger.  This is more precise when you can get it to work.  In the instances’ case, people aren’t good about using the same case (owner, Owner, OWNER) and also I just plain couldn’t figure out how to properly create the query, “Reservations[].Instances[?Tags[].Key==`owner`” and other variations didn’t work for me.  I’m no JSON query expert, so good enough!

Between the CLI queries and jq, you should be able to automate any common task you want to do with AWS!


Filed under DevOps

Tcpdump and Wireshark on OSX

I’m going to start sharing little techie tidbits that require me to go scour the Internet for exactly how to do them, in hopes of making you able to do it in a lot less time than it took me!

So I’m having trouble with connection times spiking to an Amazon Web Services ELB, so it’s time to break out the tcpdump to take packet traces and the wireshark (was ethereal long ago) to analyze it.  I’m on OSX El Capitan (10.11.6).

tcpdump comes on OSX (or if it doesn’t, something installed it without me knowing!).  Step one is figure out what network interface you want to dump.  This will list all your network interfaces.

networksetup -listallhardwareports

Then, run a packet trace on that interface.  I’m using en0 the primary wireless interface, so I run:

sudo tcpdump -i en0 -s 0 -B 524288 -w ~/Desktop/DumpFile01.pcap

I go to another window and hit the URL I’m having trouble with – you can use whatever, but I used ab (Apachebench) which comes with OSX.  Other popular URL-hitters you might install are curl, wget, and siege.

ab -n 10 http:///

Then come back and control-C out of the tcpdump capture.  Now I have a network dump of me hitting that URL (plus whatever other shenanigans my computer was up to at the time, so there’s probably a lot of noise in there from chat clients etc.).

Now to analyze it – wireshark.  I had to go a couple rounds with the installation.

If you want the UI you need to install it as:

brew install wireshark --with-qt

(If you just install wireshark without –with-qt you don’t get wireshark, you get a command line called tshark, and then  you need to reinstall…) For this, as with most things, you need Xcode or at least the Xcode command line tools (I always just install the tools).  You install them with:

xcode-select --install

But if you have an older version (<8.2.1) the wireshark build will fail.  To update the command line tools, you… Apparently you don’t any more.  The App Store doesn’t offer Command Line Tools updates and Apple has gotten more unclear and squirrelly about whether they’re even a thing.  So I just installed full XCode from the App Store, whatever, it’s just network and disk space and contributing to the heat death of the universe, but I’m not bitter, and then it builds.

Optionally if you want to capture from within wireshark on your local box instead of having to tcpdump separately also do

brew cask install wireshark-chmodbpf

But to analyze your tcpdump file just run


And load in the capture file. The quickest way to then sort into what you want is to find one part of a transaction of interest – like in my case by filtering on “http” or just looking around – and then right-clicking on one packed and saying “Follow… HTTP stream” and you get a whole transaction end to end.

Screen Shot 2017-05-26 at 2.43.56 PM

Screen Shot 2017-05-26 at 2.44.16 PM

And now you test out your TCP/IP network admin knowledge by rooting through and seeing if you can find what’s going wrong!


Filed under DevOps