My name is Nathaniel Eliot, and I’ve worked extensively in software deployment over the last several years. I have worked on two automation frameworks around Chef: Ironfan, an open-source cluster management system from Infochimps, and Elzar, a (sadly closed-source) blue-green framework based on Spiceweasel. I currently work at Bazaarvoice, where I’m building out a Flynn.io installation.
There is a catchphrase in DevOps: “cattle, not pets”. It’s intended to describe the step forward that configuration management (CM) tools such as Chef, Puppet, Ansible, and Salt provide. Instead of building and maintaining systems by hand, DevOps-savvy engineers aim to build them via automated, repeatable systems. This has revolutionized system deployment, and resulted in vast improvements in the stability and manageability of large, complicated systems.
But while cattle are an important part of civilizing your software, they have drawbacks. As anybody who’s worked a farm will tell you, cattle management is hard work. Long-lived systems (which most barnyard-style deployments still are) decay with age, as their surrounding environment changes and as faults are triggered in software components; upgrades to fix these issues can be complicated and fragile. New hosts for these systems can also suffer from unintended evolution, as external resources referenced by the CM build process change. System builds are often lengthy and heavily intertwined, such that a single failure can block updates on unrelated resources.
These issues mean that failures are often addressed by “reaching into the cow”: SSH logins to affected hosts. As the phrasing implies, this should be considered a little gross. Your team’s collective understanding of a system is based on it being built in predictable ways from visible source code: an SSH login undermines that understanding.
Building a Brickyard
The phrase I like for container automation (CA, e.g. Flynn or Mesos+Docker) is “brickyard, not barnyard”. Bricks are more uniform, quicker to make, and easier to transport than cows: CA provides greater immutability of product, faster cycle time, and easier migration than CM.
Because everything is baked into the base image, the danger of environmental changes altering or breaking your existing architecture is far lower. Instead, those changes break things during the build step, which is decoupled from the deployment itself. If you expand this immutability by providing architecturally identical container hosts, your code is also less vulnerable to “works in dev” issues, where special development configuration is lacking on production machines.
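To make that decoupling concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how Flynn or Docker actually build images: `Image`, `build_image`, and `deploy` are hypothetical names, and a real image contains far more than a list of versions. The point is only that the build step freezes every input into an artifact identified by its content, and the deploy step refers to that artifact and nothing else.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Hypothetical immutable build artifact, identified by its content."""
    digest: str
    contents: tuple  # frozen (name, version) pairs


def build_image(app_version, resolved_deps):
    """Build step: resolve everything once and freeze it into an image.

    Any change in the environment (for example, a new upstream package
    version) shows up here as a different digest, not on a running host.
    """
    contents = (("app", app_version),) + tuple(sorted(resolved_deps.items()))
    payload = json.dumps(contents).encode("utf-8")
    return Image(digest=hashlib.sha256(payload).hexdigest(), contents=contents)


def deploy(image, host):
    """Deploy step: only references an already-built image."""
    return f"{host} runs image {image.digest[:12]}"


if __name__ == "__main__":
    stable = build_image("1.4.0", {"libssl": "1.0.1", "ruby": "2.1.5"})
    drifted = build_image("1.4.0", {"libssl": "1.0.2", "ruby": "2.1.5"})

    # The upstream change produced a new artifact at build time...
    print(stable.digest != drifted.digest)
    # ...while hosts running the old image are untouched until redeployed.
    print(deploy(stable, "host-1"))
```

Rolling back in this model is just deploying an earlier digest again, which is part of why a brick is cheaper to handle than a cow.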
Rapid cycle time is the next great advantage that CA provides, and arguably the largest from a business perspective. By simplifying and automating build and deployment processes, CA encourages developers to commit and test regularly. This improves both development velocity and MTTR (mean time to repair), by providing safe and simple ways to test, deploy, and roll back changes. Ultimately, a brick is less work to produce than a fully functioning cow.
Because CA produces immutable results, those results can easily be transported. The underlying CA tools must be installed in the new environment, but the resulting platform looks the same to the images started on it. This gives you a flexibility in migration and deployment that may be harder to achieve in the CM world.
These benefits are theoretically achievable with configuration management; Ironfan is a good example of many of these principles at work in the CM world. However, they aren’t first-class goals of the underlying tools, and so systems that achieve them do so by amalgamating a larger collection of more generic tools. Each of those tools makes choices based on the more generic set of situations it’s in, and the net result is a lot of integration pain and fragility.
Bricks or Burgers
So when should you use CM, and when should you use CA? You can’t eat bricks, and you can’t make skyscrapers from beef; obviously there are trade-offs.
Configuration management works best at smoothing the gaps between the manually deployed world that most of our software was designed in, and the fully automated world we’re inching toward. It can automate pretty much any installation you can do from a command line, handling the wide array of configuration options and install requirements that various legacy software packages expect.
Container automation currently works best for microservices: 12-factor applications that you own the code for. In existing architectures, those often live either in overly spacious (and fallible) single servers, or in messy shared systems that become managerial black holes. This makes them an easy first target, providing greater stability, management, and isolation than their existing setups.
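One trait that makes 12-factor applications such a good fit is that they take their configuration from the environment rather than from files baked into a host. A minimal Python sketch of that idea follows. The variable names (`DATABASE_URL`, `PORT`, `DEBUG`) and the `Settings` and `load_settings` helpers are assumptions chosen for illustration, not something a particular CA framework requires.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings for a small service."""
    database_url: str
    port: int
    debug: bool


def load_settings(environ=None):
    """Read all configuration from environment variables.

    The same image can then run unchanged in dev, staging, and production;
    only the environment handed to the container differs.
    """
    env = os.environ if environ is None else environ
    return Settings(
        # Required: a missing value raises KeyError at startup.
        database_url=env["DATABASE_URL"],
        # Optional, with defaults suitable for local development.
        port=int(env.get("PORT", "8080")),
        debug=env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    )


if __name__ == "__main__":
    settings = load_settings({"DATABASE_URL": "postgres://db.internal/app"})
    print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test, and failing fast on a missing required variable surfaces misconfiguration at deploy time rather than on the first request.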
However, that’s as things stand currently. Civilization may depend on both, and the cattle came first, but ultimately it’s easier to build with bricks. As frameworks like Flynn expand their features (adding volume management, deploy pipelines, etc.), and as their users build experience with more ambitious uses, I believe CM is slowly going to be trumped (or absorbed) by the better CA frameworks out there.