Tag Archives: vagrant

Templating Config Files In Docker Containers

Configuration management does many things, and it does most of those quite poorly. I could write a long essay on how by solving software problems with more software we’re simply creating more software problems, however I will attempt to resist that urge and instead focus on how Docker and the Docker ecosystem can help make configuration management less sucky.

Ignoring volume mounts (which are an abomination for which I hold @gabrtv wholly responsible for), Docker has two main ways to configure your application: firstly by creating the dockerfile in which you explicitly declare your dependencies and insert any configuration files, and secondly at run time where you pass commands and environment variables to be used inside the container to start your application.

We’re going to ignore the dockerfile here and assume that you have at least a passing familiarity with them;  instead we’re going to focus on how to configure your application at run time.

A true Docker Native app would have a very small config file of which some or all settings could be overridden by environment variables or CLI options that can be set at run time to modify the appropriate configuration option (say, pointing it at a MySQL server at ).

Very few applications are written in this way, and unless you’re starting from scratch or are willing to heavily re-factor your existing applications you’ll find that building and configuring your applications to run in “the docker way” is not always an easy or particularly pleasant thing to have to do. Thankfully there are ways to fake it.

To save writing out a bunch of CLI arguments the cleanest ( in my opinion ) way to pass values into docker containers is via environment variables like so:

 $ docker run -ti --rm -e hello=world busybox env

We can then write an application to read that environment variable and use it as a configuration directive like so:

 echo hello $hello

No prizes for guessing our output, we run a docker container from the image containing this script:

 $ docker run -ti --rm -e hello=world helloworld
 hello world

Now this was a pretty asinine demo, and apart from showing how passing environment variables into a docker container works doesn’t really do anything useful.  Let’s look at a slightly more realistic application.  Take a python app that reads a configuration file and when asked renders a web page using the contents of that configuration file:

note:  these examples are abbreviated sections of the example factorish app.


 import ConfigParser
 import os

 from flask import Flask
 app = Flask(__name__)

 def hello():
 Config = ConfigParser.ConfigParser()
 return 'Luke, I am your {}'.format(
 Config.get("example", "text"))

 if __name__ == '__main__':
 app.run(host='', port=80)


 text: father

Now when we run this application we get the following:

$ docker run -d -p 8080:8080 -e text=mother example
$ curl localhost:8080
 Luke, I am your father

Obviously the application reads from the config file and thus passing in the environment variable `text` is meaningless. We need a way to take that environment variable and embed it in the config file before running the actual `example.py` application.  Chances are the first thing that popped into your head would be to use `sed` or a similar linux tool to rewrite the config file like so:


 sed -i "s/^text:.*$/text: ${text}" example.conf
 exec gunicorn -b app:app

Now we can run it again with run.sh set as the starting command and the config should be rewritten.

$ docker run -d -p 8080:8080 -e text=mother example ./run.sh
$ curl localhost:8080
Luke, I am your mother

This might be fine for a really simple application like this example, however for a complicated app with many configuration options it becomes quite cumbersome and offers plenty of opportunity for human error to slip in. Fortunately there are now several good tools written specifically for templating files in the docker ecosystem,  my favourite being confd by Kelsey Hightower which is a slick tool written in golang that can take key-pairs from various sources ( the simplest being environment variables ) and render templates with them.

Using confd we would write out a template file using the `getv` directive which simply retrieves the value of a key.  You’ll notice that the key itself is lowercase,  this is because confd also supports retrieving key-pairs from tools such as etcd and confd which use this format.  When set to use environment variables it is translated into reading the variable “SERVICES_EXAMPLE_TEXT”.

 text: {{ getenv "/services/example/text" }}

We would accompany this with a metadata file that tells confd how to handle that template:

 src   = "example.conf"
 dest  = "/app/example/example.conf"
 owner = "app"
 group = "app"
 mode  = "0644"
 keys = [
 check_cmd = "/app/bin/check {{ .src }}"
 reload_cmd = "service restart example"

The last piece of this puzzle is a executable command in the form of a shell script that docker will run which will call confd to render the template and then start the python application:


 # read 'text' env var and export it as confd expected value
 # set it to 'father' if it does not exist
 # run confd to render out the config
 confd -onetime -backend env
 # run app
 exec gunicorn -b app:app

Now let’s run it, first without any environment variables:

 $ docker run -d -p 8080:8080 --name example factorish/example
 $ curl localhost:8080
 Luke, I am your father
 $ docker exec example cat /app/example/example.conf
 text: father

As you can see the server is responding using the default value of `father` that we set in the export command above.   Let’s run it again but set the variable in the docker run command:

 $ docker run -d -e SERVICES_EXAMPLE_TEXT=mother -p 8080:8080 --name example factorish/example
 $ curl localhost:8080
 Luke, I am your mother
 $ docker exec example cat /app/example/example.conf
 text: mother

We see that because we set the environment variable it is be available to `confd` which renders it out into the config file.

Now if you go and look at the full example app you’ll see there a bunch of extra stuff going on.   Let’s see some more advanced usage of confd by starting a coreos cluster running etcd in Vagrant. etcd is a distributed key-value store that can be used to externalize application configuration and retrieve it as a service.

 $ git clone https://github.com/factorish/factorish.git
 $ cd factorish
 $ vagrant up

This will take a few minutes as the servers come online and build/run the application.   Once they’re up we can log into one and play with our application:

 $ vagrant ssh core-01
 core@core-01 ~ $  docker ps
 CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
 ee80f89d2565        registry            "docker-registry"   25 seconds ago      Up 25 seconds>5000/tcp   factorish-registry
 c763ed34b182        factorish/example   "/app/bin/boot"     52 seconds ago      Up 51 seconds>8080/tcp   factorish-example
 core@core-01 ~ $ docker logs factorish-example
 ==> ETCD_HOST set.  starting example etcd support.
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /app/example/example.conf out of sync
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/confd/run out of sync
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/confd/run has been updated
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/example/run out of sync
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/example/run has been updated
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/healthcheck/run out of sync
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: INFO Target config /etc/service/healthcheck/run has been updated
 echo ==> example: waiting for confd to write initial templates...
 2015-11-08T21:35:16Z c763ed34b182 confd[23]: ERROR exit status 1
 Starting example
 *** Booting runit daemon...
 *** Runit started as PID 51
 2015-11-08 21:35:22 [56] [INFO] Starting gunicorn 0.17.2
 2015-11-08 21:35:22 [56] [INFO] Listening at: (56)
 2015-11-08 21:35:22 [56] [INFO] Using worker: sync
 2015-11-08 21:35:22 [67] [INFO] Booting worker with pid: 67
 core@core-01 ~ $ curl localhost:8080
 Luke, I am your father

You can see here that we’ve started the example app,  but notice at the top where it says “starting example etcd support”.  This is because we’ve actually started it with some environment variables that makes it aware that etcd exists. It uses these to configure `confd` to run in the background and watch an etcd key for templated config value.

We can see this and modify the config setting using etcd commands:

 core@core-01 ~ $ etcdctl get /services/example/text
 core@core-01 ~ $ etcdctl set /services/example/text mother
 core@core-01 ~ $ curl localhost:8080
 Luke, I am your mother
 core@core-01 ~ $ exit
 $ vagrant ssh core-02
 core@core-02 ~ $ curl localhost:8080
 Luke, I am your mother

With confd aware of etcd it is able to notice values being changed and react accordingly,  in this case it rewrites the templated config file and then restarts the example application.   If you look at the template’s metadata from earlier you’ll see it is instructed to watch a certain key and rewrite the template if it changes.  It also has two directives `check_cmd` which is used to ensure the created template would be syntactically correct and `reload_cmd` which it runs any time the template is successfully written,  in this case to reload our application.

You’ll also notice that we were able to connect to the other coreos nodes each of which was also running the example application and because etcd was clustered across the three nodes all three applications registered the changed and updated themselves.

So now, not only do we have good clean templating in our container, we also even have the ability to change some of those config settings on the fly by connecting it to etcd.

From this very simple building block we are only a short hop away from being able to automatically configure complicated stacks that react to changes in the infrastructure instantaneously.

Pretty cool huh?

This article is part of our Docker and the Future of Configuration Management blog roundup running this November.  If you have an opinion or experience on the topic you can contribute as well

1 Comment

Filed under DevOps

Velocity 2012 Day One

Hello all! The Velocity cadre grows as the agile admins spread out.  I’m here with Chris, Larry, and Victor from Bazaarvoice and our new friends Kevin, Bob, and Morgan from Powerreviews which is now Bazaarvoice’s West Coast office; Peco is here with Charlie from Opnet, and James is here… with himself, from Mentor Graphics.  Our old friends from National Instruments Robert, Eric, and Matt are here too. We have quite a Groupme going!

Chris, Peco, James and I were on the same flight, all went well and we ended up at Kabul for a meaty dinner to fortify us for the many iffy breakfasts and lunches to come.  Sadly none of us got into the conference hotel so we were spread across the area.  I’m in the Quality Inn Santa Clara, which is just fine so far (alas, the breakfast is skippable, unlike that place Peco and I always used to stay).

I’m sharing my notes in mildly cleaned up fashion – sorry if it gets incoherent, but this is partially for me and partially for you.

Now it’s time for the first session!  Spoiler alert – it was really, really good and I strongly agree with large swaths of what he has to say.  In retrospect I think this was the best session of Velocity.  It combined high level guidance and tech tips with actionable guidelines. As a result I took an incredible number of notes.  Strap in!

Scaling Typekit: Infrastructure for Startups

by Paul Hammond (@ph) of Typekit, Slides are here: paulhammond.org/2012/startup-infrastructure

Typekit does Web fonts as a service; they were acquired by Adobe early this year. The characteristics of a modern startup are extreme uncertainty and limited money. So this is basically an exercise in effective debt management.

Rule #1 – Don’t run out of money.

Your burn rate is likely # of people on the team * $10k because the people cost is the hugely predominant factor.

Rule #2 – Your time is valuable, Don’t waste it.

He notes the three kinds of startups  – venture funded, bootstrapped, and big company internal.  Sadly he’s not going to talk about big company internal startups, but heck, we did that already at National Instruments so fair enough!  He does say in that case, leverage existing infrastructure unless it’s very bad, then spend effort on making it better instead of focusing on new product ideas.  “Instead of you building a tiny beautiful cloud castle in the corner that gets ignored.” Ouch! The ex-NI’ers look ruefully at each other. Then he discussed startup end states, including acquisition.  Most possible outcomes mean your startup infrastructure will go away at some point. So technical debt is OK, just like normal debt; it’s incurred for agility but like financial must be dealt with promptly.

Look for “excuses” to build the infrastructure you need (business and technical). He cites Small Batch Inc., which did a “How to start a company” conference first thing, forcing incorporation and bank accounts and liability insurance and all that, and then Wikirank, which was not “the product” but an excuse to get everyone working together and learn new tech and run a site as a throwaway before diving into a product. Typekit, in standard Lean Startup fashion, announced in a press release before there was anything to gauge interest, then a funding round, then 6 months later (of 4 people working full time) to get 1.0 out.  Launching a startup is very hard.  Do whatever you can to make it easier.

When they launched their stack was merb/datamapper/resque/mysql/redis/munin/pingdom/chef-solo/ubuntu/slicehost/dynect/edgecast/github/google apps/dropbox/campfire/skype/join.me/every project tracking tool ever.

Now about the tech stack and what worked/didn’t work.

  • Merb is a Web framework like Rails. It got effectively end of lifed and merged into Ruby 3, and to this day they’re still struggling with the transition. Lesson: You will be stuck with your technology choices for a long time.  Choose wisely.
  • Datamapper – a Ruby ORM. Not as popular as ActiveRecord but still going.  Launched on v0.9.11!  Over the long term. many bugs. A 1.0 version came out but it has unknown changes, so they haven’t ported.  The code that stores your data, you need 100% confidence in.  Upgrading to Activerecord was easier because you could do both in parallel.   Lesson: Keep up with upgrades.  Once you’re a couple behind it’s over.
  • Resque – queueing system for Ruby. They love it. Gearman is also a great choice. Lesson: You need a queue – start with one. Retrofitting makes things much harder.
  • Data: MySQL/Redis (and Elasticsearch)
    • MySQL: You have to trust your database like nothing else. You want battle tested, well understood infrastructure here. And scaling mySQL is a solved problem, just read Cal Henderson’s book.
    • Redis: Redis doesn’t do much, which is why it’s awesome.
    • Elasticsearch: Our search needs are small, and elastic search is easy to use.
    • Lessons from their data tier: Choose your technology on what it does today, not promises of the future. They take a couple half hour downtimes a year for schema upgrades. You don’t need 99.999% availability yet as a startup.  Sure, the Facebook/Yahoo/Google presentations about that are so tempting but you/re 4 guys, not them.
  • Monitoring
    • Munin – monitoring, graphing, alerting.  Now collected, nagios and custom code and they hate it.
    • Pingdom is awesome. It’s the service of last resort.
    • Pagerduty is also awesome. Makes sure you get woken up and you know who does.
    • Papertrail is hosted syslog. “It’s not splunk but it’s good enough for our needs.” “But a syslog server is easy to run.  Why use papertrail?” The tools around it are better than what they have time to build themselves.  Hosted services are usually better and cheaper than what you can do yourself.  If there’s one that does what you need, use it.  If it costs less than $70/month buy without thinking about it, because the AWS instance to run whatever open source thingy you were going to use instead costs that much.
    • #monitoringsucks shout-out!  “I don’t know anyone who’s happy with their monitoring that doesn’t have 3-4 full time engineers working on it.”  However, #monitoringsucks isn’t delivering. Every single little open source doohickey you use is something else to go wrong and something they all need to understand.  Nothing is meeting small startups’ needs.  A lot of the hosting ones too, they charge per metric or per host (or both) and that’s discouraging to a startup.  You want to be capturing and graphing as much as you can.
  • Chef – started with chef-solo and rsync; moved to Chef Hosted in 2011 and have been very happy with it.
  • Ubuntu TLS 10.04.  “I don’t thing any startup has ever failed because they picked the wrong Linux distribution.”
  • Slicehost – loved it but then Rackspace shut it down, and the migration sucked – new IPs, hours of downtime. Migrated to Rackspace and EC2. Lots of people are going to bash cloud hosting later at the conference as a waste of money. Counterpoint – “Employees are the biggest cost to a startup.”
  • Start with EC2, period, unless you’re an infra company or totally need super bare metal performance.
  • But – credentials… use IAM to manage them. We use it at BV but it ends up causing a lot of problems too (“So you want your stuff in different IAM accounts to talk to each other like with VPC?  Oh, well, not really supported…”)  Never use the root credentials.
  • Databases in the cloud.  Ephemeral or EBS? Backups? They get a high memory instance, run everything in memory, and then stop worrying about disk IO.  Sha za!  Figure it out later.
  • DynECT – Invisible and fine.
  • Edgecast – cool. CDNs are not created equal, and they have different strengths in regions etc. If you don’t want to hassle with talking to someone on the phone, screw Akamai/Limelight/etc. If you’re not haggling you’re paying too much.  But as a startup, you want click to start, credit card signup. Amazon Cloudfront, Fastly. For Typekit they needed high uptime and high performance as a critical part of the service.  Story time, they had a massive issue with Edgecast as about.me was going live. See Designing for Disaster by Jeff Veen from Velocity Europe. Systems perform in unexpected ways as they grow.  Things have unexpected scaling behavior. Know your escape plan for every infrastructure provider.  That doesn’t have to be “immediate hot backup available,” just a plan.
  • Github – using organizations.
  • Google Apps – yay.  Using Google App Engine for their status page to put it on different infrastructure. They use Stashboard, which we used at NI!

“Buy or build?”

Buy, unless nothing meets your needs.  Then build.  Or if it’s your core business and you’re eating your own dog food.
If it costs more than your annual salary, build it.

A third party provider having an outage is still YOUR problem. Still need a “sorry!” Write your update without naming your service provider.  [You should take responsibility but that seems close to not being transparent to me. -Ed.]  Anyway, buy or build option is “neither” if it’s not needed for the minimum viable product.

You’re not Facebook or Etsy with 100 engineers yet. You don’t need a highly scalable data store.  A half hour outage is OK. You don’t need multi-vendor redundancy, you need a product someone cares about.

Rule #3 – Set up the infrastructure you need.

Rule #4 – Don’t set up infrastructure you don’t need.

Almost every performance problem has been on something they didn’t yet measure.  All their scaling pain points were unexpected.  You can’t plan for everything and the stuff you do plan for may be wasted.

Brain twister: He spent a week to write code to automatically bring up a front end Tomcat server in AWS if one of theirs crashes.  That has never happened in years.  Was that work worth while, does it really meet ROI?

Rule #5 – Don’t make future work for yourself.

There’s a difference between not doing something yet and deliberately setting yourself up for redo.  People talk about “technical debt” but just as in finance, there’s judicious debt and then there’s payday loans. Optimize for change. Every time you grow 10x you’ll need to rewrite. Just make it easy to change.

“You ain’t gonna need it”

Everyone’s startup story:

  1. Find biggest problem
  2. Fix biggest problem
  3. Repeat

The story never reads like:

  1. Up front, plan and build infrastructure based on other companies
  2. Total success!

Minimum Viable Infrastructure for a Startup:

  1. Source control
  2. Configuration management
  3. Servers
  4. Backups
  5. External availability monitoring

So you really could get started with github orgs, rsync/bash, EC2, s3cmd, pingdom, then start improving from there. Well, he’s not really serious you should start that way, he wouldn’t start with rsync again.  But he’s somewhat serious, in that you should really consider the minimum (but good) solution and not get too fancy before you ship.

Watch out for

  • Black swans
  • Vendor lockin
  • Unsupported products
  • Time wasting

Woot! This was a great session, everything from straight dope on specific techs, mistakes made and lessons learned, high level guidance with tangible rules of thumb.

Question and Answer Takeaways:
If you’re going to build, build and open source it to make the ecosystem better
Monitoring – none of them have a decent dashboard. Ganglia, nagios, munin UI sucks.


Discussion with Mike Rembetsy and other Etsyans about why JIRA and Confluence are ubiquitously used but people don’t like talking about it.  His theory is that everyone has to hack them so bad that they don’t want to answer 100 questions about “how you made JIRA do that.”

Turning Operational Data Into Gold At Expedia

By Eddie Satterly, previously of Expedia and now with Splunk. This is starting off bad.  I was hoping with Expedia having top billing it was going to be more of a real use case but we’re getting stock splunk vendor pitch.

Eddie Satterly was sr. director of arch at Expedia, now with splunk.  They put 6 TB/day in splunk. Highlights:

  • They built a sdk for cassandra data stores  and archive specific splunks for long term retention to hadoop for batch analysis
  • The big data integration really ramped up the TB/day
  • They do external lookups – geo, ldap, etc.
  • Puppet deploy of the agents/SCCM and gold images
  • A lot of the tealeaf RUM/Omniture Web analytics stuff is being done in splunk now
  • Zenoss integration but moving more to splunk there too
  • Using the file integrity monitoring stuff
  • Custom jobs for unusual volumes and “new errors”

Session was high on generalities; sadly I didn’t really come away with any new insights on splunk from it. Without the sales pitch it could have been a lightning talk.

11 Ways To Hack Puppet For Fun and Productivity

by Luke Kanies. I got here late but all I missed was a puppet overview. Slides on Slideshare.


  1. Puppet as you.  It doesn’t have to run as root.
  2. Curl speaks.  You can pull catalogs etc. easily, decouple see facts/pull catalog/run catalog/run report.
  3. Data, and lots of it. Catalogs, facts, reports.
  4. Static compiler. Refer to files with checksum instead of URL. And it reduces requests for additional files.
  5. config_version. Find out who made changes in this version.
  6. report processor.
  7. Function
  8. Fact
  9. Types
  10. Providers
  11. Face

Someone’s working on a puppet IDE called geppetto (eclipse based).

I don’t know much puppet yet, so most of this went right by me.

Develop and Test Configuration Management Scripts With Vagrant

By Mitchell Hashimoto from Kiip (@mitchellh). Slides on Speakerdeck.

Sure, you can bring up an ec2 instance and run chef and whatnot, but that gets repetitive. This tempts you to not do incremental systems development, because it takes time and work. So you just “set things up once” and start gathering cruft.

Maybe you have a magic setup script that gets your Macbook all up and running your new killer app. But it’s unlikely, and then it’s not like production.  Requires maintenance, what about small changes… Bah. Or perhaps an uber-readme (read: Confluence wiki page). Naturally prone to intense user error. So, use Vagrant!

We’ll walk through the CLI, VM creation, provisioning, scripted config of vm, network, fs, and setup

Install Virtualbox and Vagrant – All that’s needed are vagrantfile and vagrant CLI
vagrantfile: Per project configuration, ruby DSL
CLI: vagrant <something> e.g “vagrant up”

vagrant box – set up base boxes.  It’s just a single file. “vagrant box add name url”.
Go to vagrantbox.es for more base boxes. They’re big (It’s a vm…)

Project context. “vagrant init <boxtype>” will dump you a file.

“vagrant up” makes a private copy, doesn’t corrupt base box

vagrant up, status, reload, suspend (freeze), halt (shutdown), destroy (delete)

Provides shared folders, NFS to share files host to guest
Shared folder performance degrades with # of files, go to NFS

Provisioning – scripted instal packages, etc.  It supports shell/puppet/chef and soon cfengine.
Use the same scripts as production. vagrant up does utp, but vagrant reload or provision does it in isolation

Networking – port forwarding, host-onlu

port forwarding exposes hosts on the guest via ports on the host, even to the outside.
Simple, over 1024 and open
host only makes a private net of VMs and your host. set IPs or even DHCP it. Beware of IP collisions.
bridge – get IPs from a real router. makes them real boxes, though bad networks won’t do it.

multi vm.  Configure multiple VMs in one file and hook ’em up.  In multi mode you can specify a target on each command to not have it do on all

vagrant package “burns a new AMI” off the current system.
package up installed software, use provisioners for config and managing services

Great for developing and testing chef/puppet/etc scripts. Use prod-quality ops scripts to set up dev env’s, QA. It brings you a nice standard workflow.


  • other virtualization, vmware, ec2, kvm
  • vagrant builder: ami creator
  • any guest OS

End, Day One!

And we’re done with “Tutorial” day!  The distinction between tutorials and other conference sessions is very weak and O’Reilly would do better to just do a three day conference and right-size people’s presentations – some, like the Typekit one, deserve to be this long.  Others should be a normal conference session and some should be a lightning talk.

Then we went to the Ignites and James and I did Ignite slide karaoke where you have to talk to random slides.  Check out the deck, I got slides 43-47 which were a bit of a tough row to hoe. I got to use my signature phrase “keep your pimp hand strong” however.

1 Comment

Filed under Conferences, DevOps