Tag Archives: configuration management

StackEngine Webinar – Docker and the Future of Configuration Management

I hope you’ve been enjoying our Docker and the Future of Configuration Management blog roundup!  I’m joining Jon Reeve of StackEngine, who’s sponsoring the roundup with prizes and such, in a Webinar this week to discuss the various points of view we’ve seen covered.

The Webinar will be on Wednesday Dec 09, 2015 at 11:00 AM CST. Register now at: https://attendee.gotowebinar.com/register/5726672543793290498

In this webinar – we’ll explore how Docker and containers are impacting the future of configuration management. Is true “Golden Image” management now a reality? We’ll explore different points of view and the pros and cons of Docker’s impact.

We’ll also review StackEngine’s approach to Docker and container management and how it is benefiting DevOps and Operations teams.


Docker: Service Management Trumps Configuration Management

Docker – It’s Lighter, Is That Really Why It’s So Awesome?

When docker started to hit so big, I have to admit I initially wondered why.  The first thing people would say about its value was "There's slightly less overhead than virtualization!" Uh… great? But chroot jails and the like have existed for a long time, since back when I got started on UNIX, and they fell out of use for a reason. Nor had there been a high-pressure rush of virtualization and cloud vendors trying to keep up with demand for "leaner" VMs – there was some work to that end, but it clearly wasn't a huge customer driver. If you cared that much about overhead, you had the other extreme of just jamming multiple apps onto one box, old school style. Now, you don't want to do that – I worked in an environment where that was the policy and developed my architectural doctrine of "sharing is the devil" as a result. While running apps on bare metal is fast and cost effective, the reliability, security, and manageability impediments are significant. But "here's another option on the spectrum between hardware and VMs" doesn't seem that transformative on its face.

OK, so docker is a process wrapper that hits the middle ground between a larger, slower VM and running unprotected on hardware. But it’s more than that.  Docker also lets you easily create packaged, portable, ready to run applications.

The Problems With Configuration Management Today

The value proposition of docker started to become clearer once the topic of provisioning and deployment came up. Managing systems, applications, and application deployments has been at worst a complicated muddle of manual installation, and at best a mix of very complex configuration management systems and baked images (VMs or AMIs). Engineers skilled in chef or puppet are rare. And developers wanted faster turnaround to deploy their applications. I've worked in various places where the CM system did app deploys but the developers really, really wanted to bypass that via something like capistrano or a direct drop-in to tomcat, and there were continuous discussions over whether there should be dual tooling: a CM system for system configuration and an app deployment system for app deploys. Having two different kinds of tooling controlling your configuration (especially when, frankly, the apps are the most important part) leads to a lot of conflict, confusion, and problems in the long term.

And many folks don't need the more complex CM functionality. Many modern workloads don't need much OS or OS-level software – the enterprise does, but many new apps are completely self-contained, even to the point of running their own node.js or jetty, which means a lot of CM's complexity is unnecessary if you're just going to drop a file onto a vanilla box and run it. And then there's the question of orchestration. Most CM systems like to put bits on boxes, but once there's a running, interconnected system, they have a harder time dealing with it. Over the years, many discussions about orchestration were frankly rebuffed by the major CM vendors with "well, then integrate with something (mcollective, rundeck)." In fact, this led to newer products like ansible and salt arising – they are simpler and more orchestration focused.

But putting bits on boxes is only the first step.  Being able to operate a running system is more important.

Service Management

Back when all of the agile admins were working at National Instruments, we were starting a new cloud project and wanted to automate everything from first principles. We looked at Chef and Puppet but first, we needed Windows support (this was back in 2008, and their Windows support was minimal), and second, we had the realization that a running cloud, REST services type system is composed of various interdependent services, and that we wanted to model that explicitly. We wanted more than configuration management – we wanted service management.

What does it look like when you draw out your systems?  A box and line diagram, right? Here’s part of such a diagram from our systems back then.

[Diagram: box-and-line diagram of part of our systems – cloud instances as yellow boxes, the software on them as white boxes, and the connections between services as lines]

Well, not to oversimplify, but when you use something like CloudFormation, you get the yellow boxes (hardware, VMs, cloud instances). When you use something like chef or puppet, you get the white boxes (software on the box). But what about the lines? The point of all those bits is to create services, which are called by customers and/or other services, and being able to address those services and the relationships between them is super important. And trying to change any of the yellow or white boxes without intelligent orchestration to handle the lines – what makes your services actually work – is folly.

In our case, we built the Programmable Infrastructure Environment (PIE) – we modeled the above in XML files and added a zookeeper-based service registry to handle the connections, so that we could literally replace a database server and have all the other services that depend on it detect the change, automatically re-parse their configurations, restart themselves if necessary, and connect to the new one.
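
PIE itself was an internal tool, but here's a minimal sketch of the zookeeper-registry idea using the Python kazoo library; the paths, endpoints, and callback behavior are hypothetical stand-ins for illustration, not PIE's actual implementation:

from kazoo.client import KazooClient

# Connect to the zookeeper ensemble backing the service registry
# (hostnames and paths here are hypothetical).
zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

# A database node registers itself as an ephemeral znode; if the process dies
# or the server is replaced, the registration disappears automatically.
zk.ensure_path("/services/customer-db")
zk.create("/services/customer-db/db-01", b"db-01.internal:5432", ephemeral=True)

# A dependent service watches the registry; whenever the set of database nodes
# changes, it can re-render its config, restart if necessary, and reconnect.
@zk.ChildrenWatch("/services/customer-db")
def on_db_change(children):
    endpoints = [zk.get("/services/customer-db/" + c)[0].decode() for c in children]
    print("database endpoints are now:", endpoints)  # re-render config / restart here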

This revolutionized the way we ran systems.  It was very successful and night-and-day different from the usual methods of provisioning and, more importantly, of controlling production systems in the face of both planned and unplanned changes. It allowed us to instantiate truly identical environments, conduct complex deployments without downtime, and collaborate easily between developers and operations on a single model in source control that dictated all parts of the system, from the system level to third-party software to custom applications.

That, in conjunction with ephemeral cloud systems, also made our need for CM a lot simpler – not like a university lab where you want machines always converging to whatever new yum update is out, but more like creating specifically tooled parts for one use, then making new ones and throwing the old ones away as needed. Since we worked at National Instruments, this struck us as analogous to the progression from hand-built hardware boards to FPGAs to custom chips – the last is faster and cheaper, and you basically throw it away for a new one when you need a change, though the earlier stages are necessary steps along the path to creating a solid chip.

We kept wondering when the service management approach would catch on.  Ubuntu’s Juju works in this way, but stayed limited to Ubuntu for a long time (though it got better recently!) and hasn’t gotten much reach as a result.

Once docker came out – lo and behold, we started to see that pattern again!

Docker and Service Management

Dockerfiles are simple CM systems that pull some packages, install some software, and open some ports. Here’s an example of a dockerfile for haproxy:

#
# Haproxy Dockerfile
#
# https://github.com/dockerfile/haproxy
#
# Pull base image.
FROM dockerfile/ubuntu
# Install Haproxy.
RUN \
  sed -i 's/^# \(.*-backports\s\)/\1/g' /etc/apt/sources.list && \
  apt-get update && \
  apt-get install -y haproxy=1.5.3-1~ubuntu14.04.1 && \
  sed -i 's/^ENABLED=.*/ENABLED=1/' /etc/default/haproxy && \
  rm -rf /var/lib/apt/lists/*
# Add files.
ADD haproxy.cfg /etc/haproxy/haproxy.cfg
ADD start.bash /haproxy-start
# Define mountable directories.
VOLUME ["/haproxy-override"]
# Define working directory.
WORKDIR /etc/haproxy
# Define default command.
CMD ["bash", "/haproxy-start"]
# Expose ports.
EXPOSE 80
EXPOSE 443

Pretty simple, right? And you can then reuse that image (like an AMI or VM image) instead of re-configuring every time. Now, there are arguments against using pre-baked images – see Golden Image or Foil Ball. But at scale, what's the value of conducting the same operations 1000 times in parallel, except contributing to the heat death of the universe? And potentially failing by overwhelming the same maven or artifactory server when massive scaling is required? There's a reason Netflix went to an AMI "baking" model rather than relying on config management to reprovision every node from scratch. And docker containers don't carry a mess of complex packages and dependencies; they tend to be lean and mean.
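
To make the build-once, run-many point concrete, here's a minimal sketch (mine, not from the original workflow) using the Docker SDK for Python; the image tag, container names, and port mappings are made up for illustration:

import docker

client = docker.from_env()

# Build the image once from the Dockerfile above (the tag is made up).
image, _build_logs = client.images.build(path=".", tag="my-haproxy:1.5.3")

# Then run as many identical containers as you like from the baked image --
# no re-running of apt-get and sed on every instance.
for i in range(3):
    client.containers.run(
        "my-haproxy:1.5.3",
        name=f"haproxy-{i}",
        detach=True,
        ports={"80/tcp": 8080 + i, "443/tcp": 8443 + i},
    )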

But the dynamic nature of these microservices means that actual service dependencies have to be modeled. Bits of software like etcd and docker compose are tightly integrated into the container ecosystem to empower this. With tools like these you can define a multi-service environment and then register and programmatically control those services when they run.

Here’s a docker compose file:

web:
  build: .
  ports:
    - "5000:5000"
  volumes:
    - .:/code
  links:
    - redis
redis:
  image: redis

It maps the web server's port 5000 to host port 5000 and creates a link to the "redis" service. This seems like a small thing, but those are the "lines" on your box-and-line diagram, and they open up your entire running system to programmatic control. Pure CM just lets you change the software; the rest is largely done by inference, not explicit modeling. (I'm sure you could build something of the sort in Chef data bags or whatnot, but that's close to saying "you could code it yourself" really.)
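
As a rough illustration of that programmatic control, here's a small sketch (not from the post) that uses the Docker SDK for Python to read back the service-level facts compose set up – published ports and links – for every running container:

import docker

client = docker.from_env()

# Walk the running containers and pull out the service-level facts:
# which ports they publish and which other services they are linked to.
for container in client.containers.list():
    info = container.attrs
    ports = info["NetworkSettings"]["Ports"]        # e.g. {"5000/tcp": [{"HostIp": "0.0.0.0", "HostPort": "5000"}]}
    links = info["HostConfig"].get("Links") or []   # e.g. ["/redis:/web/redis"]
    print(container.name, ports, links)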

This approach was useful even in the VM and cloud world, but the need just wasn't acute enough for it to emerge. It's like CM in general – it existed before VMs and cloud, but it was always an "if we have time" afterthought – the scale of these new technologies pushed it into being a first-order consideration, and then even people not using dynamic technology prioritized it. I believe service management of this sort is the same way – it didn't "catch on" because people were not conceptually ready for it, but now that containers are forcing the issue, people will start to use this approach and understand its benefits.

CM vs. App Deployment?

In addition, managing your entire infrastructure like a build pipeline is easier and more aligned with how you manage the important part: the applications and services. It's a lot harder to do a good job of testing your releases when changes are coming from different places – in a production system, you really don't want to stand servers up and roll changes to them one way, then roll changes to your applications a different way. New code and new package versions are best rolled through the same pipeline with the same tests applied to them. While it is possible to do this with normal CM, docker lends itself to it by default. However, docker doesn't address this for the core operating system, which is an area of concern and a place where well-thought-out CM integration is important.

Conclusion

The future of configuration management is in being able to manage your services and their dependencies directly, not by inference. The more self-contained those services are, the simpler the job of CM becomes, just as moving into the cloud has sharply reduced the need to fiddle with hardware. Time spent messing with kernel configs and installing software has dropped as we have started to abstract systems at a higher level and use more small, reusable parts. Similarly, many people are looking at complex config management and asking "why do I need this any more?" I think there are cases where you do need it, but you should start by modeling your services and managing those as the first order of concern, backed by just enough tooling for your use case.

Just like the cloud forced the issue with CM and it finally became a standard practice instead of an "advanced topic," my prediction is that containers will force the issue with service management and make it a first-class concern for CM tools, even back in cloud/VM/hardware environments.

This article is part of our Docker and the Future of Configuration Management blog roundup running this November. If you have an opinion or experience on the topic, you can contribute as well.


Docker and the Future of Configuration Management – Coming In November!

All right!  We have a bunch of authors lined up for our blog roundup on Docker and the future of CM.  We'll be publishing them throughout the month of November. But it's not too late to get in on the action – speak up and you can get a guest post too! And have a chance to win that sweet LEGO Millennium Falcon…

To get you in the mood, here are some good Docker and CM related posts I've read lately:

And a video, Immutable Awesomeness – I saw John and Josh present this at DevOps Enterprise Summit and it’s a must-watch! Using containers to create immutable infrastructure for DevOps and security wins.


Docker and The Future of Configuration Management – Call For Posts

Docker Docker Docker

Docker is taking the DevOps world by storm. It has moved from "cool DevOps tool" to "lots of attention at the highest levels in the enterprise" in record time. One of the key discussion points when introducing docker, however, is how it will interact with the rest of your configuration management picture. Some people have claimed that the docker model greatly reduces the need for traditional CM; others disagree and feel that's too simplistic an approach. (If you aren't sure what docker is yet, check out Karthik's recent Webcast Why To Docker? for an intro.)

What we want to do here at the Agile Admin is collect a diversity of experiences on this!  Therefore we are declaring a blog roundup, where we'll publish a bunch of articles on the topic. And we'd love for any of you out there to participate!

Come Blog With Us

Just email us (ernest dot mueller at theagileadmin.com works) or reply in the comments to this post saying “I will write something on this!” and we’ll publish your thoughts on the topic here on The Agile Admin. We’ll be starting this roundup in about a week and looking to finish up by the end of November. We’re specifically soliciting some authors to write on the topic, but we also want all of you in the DevOps community to get your say in. We can set you up as an author here in our WordPress, or you can write in Word or whatnot and mail it in, either way. Make sure and include a brief blurb about who you are at the end!

Fun and Prizes!

StackEngine has generously agreed to sponsor this roundup, and the best community post (we'll choose one) by Nov 30 will win a LEGO Millennium Falcon to show off their geek cred! And very soon even your kids will know what it is.

So please let us know if you'd like to participate, and then write an article – really anything touching on docker and config management is fine, even "we started to look into docker and stopped because we couldn't figure out how to slot it in with our existing CM," or "the simplicity of dockerfiles makes me swear off CM forever," or "golden images are the way to go," or "who needs docker anyway – if you have a stellar chef implementation you get the same benefits…" No restrictions (other than those good taste and professionalism dictate). Heck, other containerization technologies than docker per se are fine too. "Containers, CM, what do you know?" No length requirements, and you can have images and videos and stuff – that's all great.

We know keeping up a whole blog is a lot of work many technologists don't have time for, so this is our first experiment in letting other great DevOps folks – those who may want to write an article every once in a while without the hassle of running their own site – use The Agile Admin to get their posts out to a decent-sized audience. Come on down, we're friendly!

And thanks to StackEngine for sponsoring the roundup! They are an Austin startup that's all about Docker tooling. Fellow agile admin Karthik, and Boyd, whom many of you know from the Austin-and-wider DevOps scene, are part of the SE team, which vouches for their awesomeness.


Velocity 2013 Day 3 Liveblog: Getting Started With Configuration Management

Getting Started With Configuration Management

By @sascha_d.

Configuration management! Define and idempotently enforce system state across a bunch of machines.
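
(Not from the talk – just a toy Python sketch of what "idempotent" means here: a resource declares desired state, and applying it a second time does nothing because the state is already correct. The file path and line are made up.)

import os

def ensure_line_in_file(path, line):
    """Idempotent 'resource': make sure `line` is present in the file at `path`."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False  # already converged, nothing to do
    with open(path, "a") as f:
        f.write(line + "\n")
    return True       # we changed something

# Running it repeatedly converges to the same state with no extra work.
ensure_line_in_file("/tmp/example_sshd_config", "PermitRootLogin no")
ensure_line_in_file("/tmp/example_sshd_config", "PermitRootLogin no")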

But it’s not about the tool. But you need one.

You should care about package repositories!

Anyway, she was an infra queen who loved to rewrite things by hand, but finally realized this was a blocker. Needed repeatable process. Started to look at all the CM stuff but it was really overwhelming, there’s so much out there.

Lose the baggage

Need to remove your fear, inflexibility, and arrogance.  You have some, you’re an engineer. You often need to change how you think about things.  CM and automation is not a “threat to your job.”

It’s OK not to know and to admit it.  There’s a lot of crap out there.  What’s a ruby gem?  I don’t know either. It’s OK. You can go learn what you need. Everyone’s “faking it,” that’s how it works.

“I can’t code/I don’t program/I’m not a developer.” It’s OK, you can – you don’t have to be a pro dev. You get most of the concepts from doing good CM.

It’s not “just scripts” – this is all hard work and code.

Also, just because you understand systems doesn’t mean you understand CM – and moving that understanding from the gut to the mind.

Ask why things are the way they are; don’t accept constraints just because they’re there now.

Learn your tool

Resist the urge to “automate the world.” Pick something small but impactful, light on data.

Understand the primitives of your tool. Don't just port your bash scripts or break out to a bash block.

Read the source code.  You don’t have to write it to read it. You will see what really happens.

Test!  Learn to test! vagrant, test-kitchen, bats, jenkins (vagrant book free this week only!)

Own or get pwned

Infrastructure is an ecosystem, and you need to have curators for the tools.

There’s acceptable and unacceptable technical debt.

Curate what you’re installing – if you just rampantly download newest from the Internet you get jacked.

Own your package repos. Have one.  Have base/custom. Don’t just have stuff sitting around.

Own your build tools – artifactory/nexus/jenkins/travis.

Own your version control. Well, more “use” version control.

Own your integrity. Don’t disable CM, don’t do different in dev and prod, don’t deploy differently in different envs… You have control over this. [Ed: We are bad about this. Whether a change is puppet or manual or rundeck is a big ass mystery in our environment.  Angry cat.]

The end!


Amazon CloudFormation: Model Driven Automation For The Cloud

You may have heard about Amazon's newest offering, announced today: CloudFormation. It's the new hotness, but I see a lot of confusion in the Twitterverse about what it is and how it fits into the landscape of IaaS/PaaS/Elastic Beanstalk/etc. Read what Werner Vogels says about CloudFormation and its uses first, but then come back here!

Allow me to break it down for you and explain why this is such a huge leverage point for cloud developers.

What Has Come Before

Up till now on Amazon you could configure a single virtual image the way you wanted it, with an AMI. You could even kind of construct a scalable tier of similar systems using Auto Scaling, by defining Launch Configurations. But if you wanted to construct an entire multi-tier system it was a lot harder. There are automated configuration management tools like chef and puppet out there, but their recipes/models tend to be oriented around getting a software loadout onto an existing system, not the actual system provisioning – in general they come from the older assumption that you have someone doing that on probably-physical systems using bcfg2 or cobbler or vagrant or something.

So what were you to do if you wanted to bring up a simple three tier system, with a Web tier, app server tier, and database tier?  Either you had to set them up and start them manually, or you had to write code against the Amazon APIs to explicitly pull up what you wanted. Or you had to use a third party provisioning provider like RightScale or EngineYard that would let you define that kind of model in their Web consoles but not construct your own model programmatically and upload it. (I’d like my product functionality in my own source control and not your GUI, thanks.)

Now, Amazon recently launched Elastic Beanstalk, which is way over on the PaaS side of things, similar to Google App Engine.  "Just upload your application and we'll run it and scale it, you don't have to worry about the plumbing." Of course this sharply limits what you can do, and doesn't address the question of "what if my overall system consists of more than just one Java app running in Beanstalk?"

If your goal is full model driven automation to achieve “infrastructure as code,” none of these solutions are entirely satisfactory. I understand CloudFormation deeply because we went down that same path and developed our own system model ourselves as a response!

I’ll also note that this is very similar to what Microsoft Azure does.  Azure is a hybrid IaaS/PaaS solution – their marketing tries to say it’s more like Beanstalk or Google App Engine, but in reality it’s more like CloudFormation – you have an XML file that describes the different roles (tiers) in the system, defines what software should go on each, and lets you control the entire system as a unit.

So What Is CloudFormation?

Basically CloudFormation lets you model your Amazon cloud-based system in JSON and then provision and control it as a unit.  So in our use case of a three tier system, you would model it up in their JSON markup and then CloudFormation would understand that the whole thing is a unit.  See their sample template for a WordPress setup. (A mess more sample templates are here.)

Review the WordPress template; it lets you define the AMIs and instance types, what the security group and ELB setups should be, the RDS database back end, and feed in variables that’ll be used in the consuming software (like WordPress username/password in this case).

Once you have your template you can tell Amazon to start your "stack" in the console! It'll even let you hook it up to an SNS notification that'll let you know when it's done. You name the whole stack, so you can distinguish between your "dev" environment and your "prod" environment, for example, as opposed to the current state of the Amazon EC2 console where you get to see a big list of instance IDs – they added tagging that you can try to use for this, but it's kinda wonky.
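
As a concrete (and hedged) illustration of driving a stack programmatically rather than from the console, here's roughly what that looks like with Python and boto3 – which postdates this post – using a made-up stack name, template file, and parameters:

import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

# The template is just a JSON file that lives in source control.
with open("wordpress-template.json") as f:
    template_body = f.read()

# Start the whole multi-tier "stack" as one unit; the parameter names are
# whatever the template declares (these are hypothetical).
cfn.create_stack(
    StackName="dev-wordpress",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "KeyName", "ParameterValue": "my-keypair"},
        {"ParameterKey": "DBPassword", "ParameterValue": "not-a-real-password"},
    ],
)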

Why Do I Want This Again?

Because a system model lets you do a number of clever automation things.

Standard Definition

If you've been doing Amazon yourself, you're used to there being a lot of stuff you have to do manually.  Even you do it a little differently from one system build to the next, and God forbid you have multiple techies working on the same Amazon system. The basic value proposition of "don't do things manually" is huge.  You configure the security groups ONCE and put that into the template, and then you're not going to forget to open port 23 AGAIN next time you start a system. A core part of the value proposition DevOps is realizing is treating system configuration as code that you can source control, fix bugs in and have them stay fixed, etc.

And if you’ve been trying to automate your infrastructure with tools like Chef, Puppet, and ControlTier, you may have been frustrated in that they address single systems well, but do not really model “systems of systems” worth a damn.  Via new cloud support in knife and stuff you can execute raw “start me a cloud server” commands but all that nice recipe stuff stops at the box level and doesn’t really extend up to provisioning and tracking parts of your system.

With the CloudFormation template, you have an actual asset that defines your overall system.  This definition:

  • Can be controlled in source control
  • Can be reviewed by others
  • Is authoritative, not documentation that could differ from the reality
  • Can be automatically parsed/generated by your own tools (this is huge)

It’s also nicely transparent; when you go to the console and look at the stack it shows you the history of events, the template used to start it, the startup parameters it used… Moving away from the “mystery meat” style of system config.

Coordinated Control

With CloudFormation, you can start and stop an entire environment with one operation. You can say "this is the dev environment" and control it as a unit. I assume at some point you'll be able to visualize it as a unit; right now all the bits are still stashed in their own tabs (and I notice they don't make any default use of their own tagging, which makes it annoying to pick out which parts are from that stack).

This is handy for not missing stuff on startup and teardown… A couple weeks ago I spent an hour deleting a couple hundred rogue EBSes we had left over after a load test.

And you get some status eventing – one of the most painful parts of trying to automate against Amazon is the whole "I started an instance, I guess I'll sit around and poll and try to figure out when the damn thing has come up right."  In CloudFormation you get events that tell you when each part, and then the whole, is up and ready for use.
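
Here's a hedged sketch of consuming that eventing from code with boto3 (again, newer than this post; the stack name is made up): you block on a waiter instead of hand-rolling a polling loop, then read back the per-resource event history:

import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

# Block until the whole stack reports CREATE_COMPLETE (the waiter fails if it rolls back).
cfn.get_waiter("stack_create_complete").wait(StackName="dev-wordpress")

# Then review per-resource events: which part came up, when, and with what status.
for event in cfn.describe_stack_events(StackName="dev-wordpress")["StackEvents"]:
    print(event["Timestamp"], event["LogicalResourceId"], event["ResourceStatus"])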

What It Doesn’t Do

It’s not a config management tool like Chef or Puppet. Except for what you bake onto your AMI it has zero software config capabilities, or command dispatch capabilities like Rundeck or mcollective or Fabric. Although it should be a good integration point with those tools.

It’s not a PaaS solution like Beanstalk or GAE; you use those when you just have an app you want to deploy to something that’ll run it.  Now, it does erode some use cases – it makes a middle point between “run it all yourself and love the complexity” and “forget configurable system bits, just use PaaS.”  It allows easy reusability, say having a systems guy develop the template and then a dev use it over and over again to host their app, but with more customization than the pure-play PaaSes provide.

It's not quite like OVF, which is fiddlier and more about defining the guts of a single virtual machine than about defining a set of systems with roles and connections.

Competitive Analysis

It's very similar to Microsoft Azure's approach with their .cscfg and .csdef files, which are an analogous XML model – you really could fairly call this feature "Amazon implements Azure on Amazon" (just as you could fairly call Elastic Beanstalk "Amazon implements Google App Engine on Amazon"). In fact, the Azure Fabric has a lot more functionality than the primitive Amazon events in this first release. Of course, CloudFormation doesn't just work on Windows, so that's a pretty good breadth vs. depth tradeoff.

And it’s similar to something like a RightScale, and ideally will encourage them to let customers actually submit their own definition instead of the current clunky combo of ServerArrays and ServerTemplates (curl or Web console?  Really? Why not a model like this?). RightScale must be in a tizzy right now, though really just integrating with this model should be easy enough.

Where To From Here?

As I alluded, we actually wrote our own tool like this internally called PIE that we’re looking to open source because we were feeling this whole problem space keenly.  XML model of the whole system, Apache Zookeeper-based registry, kinda like CloudFormation and Azure. Does CloudFormation obsolete what we were doing?  No – we built it because we wanted a model that could describe cloud systems on multiple clouds and even on premise systems. The Amazon model will only help you define Amazon bits, but if you are running cross-cloud or hybrid it is of limited value. And I’m sure model visualization tools will come, and a better registry/eventing system will come, but we’re way farther down that path at least at the moment. Also, the differentiation between “provisioning tools” that control and start systems like CloudFormation and bcfg2 and “configuration” tools that control and start software like Chef and Puppet (and some people even differentiate between those and “deploy” tools that control and start applications like Capistrano) is a false dichotomy. I’m all about the “toolchain” approach but at some point you need a toolbelt. This tool differentiation is one of the more harmful “Dev vs Ops” differentiations.

I hope that this move shows the value of system modeling and helps people understand we need an overarching model that can be used to define it all, not just “Amazon” vs “Azure” or “system packages” vs “developed applications” or “UNIX vs Windows…” True system automation will come from a UNIVERSAL model that can be used to reason about and program to your on premise systems, your Amazon systems, your Azure systems, your software, your apps, your data, your images and files…

Conclusion

You need to understand CloudFormation, because it is one of the most foundational, high-leverage changes AWS has come out with in some time. I don't bother to blog about most of the cool new AWS features – they're cool and I enjoy them – but this is part of a more revolutionary change in the way systems are managed: the whole DevOps thing.


Velocity 2010: Infrastructure Automation with Chef

After a lovely lunch of sammiches, we kick into the second half of Workshop Day at Velocity 2010.  Peco and I (and Jeff and Robert, also from NI) went to Infrastructure Automation with Chef, presented by Adam Jacob, Christopher Brown, and Joshua Timberman of Opscode.  My comments in italics.

Chef is a library for configuration management, and a system written on top of it.  It’s also a systems integration platform, as we will see later.  And it’s an API for your infrastructure.

In the beginning there was cfengine.  Then came puppet.  Then came chef.  It’s the latest in open source UNIXey config management automation.

  • Chef is idempotent, which means you can rerun it and get the same result, and it does minimal work to get there.
  • Chef is reasonable, and has sane defaults, which you can easily change.  You can change its mind about anything.
  • Chef is open source and you can hack it easily.  “There’s more than one way to do it” is its mantra.

A lot of the tools out there (meaning HP/IBM/CA kinds of things) are heavy and don’t understand how quickly the world changes, so they end up being artifacts of “how I should have built my system 10 years ago.”

It’s based on Ruby.  You really need a third gen language to do this effectively; if they created their own config structure it would grow into an even less standard third gen language.  If you’re a sysadmin, you do indeed program, and people that say you’re not are lying to you.  Apache config is a programming language. Chef uses small composable primitives.

You manage configuration as idempotent resources, which are put together in recipes, and tracked like source code with the end goal of configuring your servers.

Infrastructure as Code

The devops mantra.  Infrastructure is code and should be managed with the same rigor.  Source control, etc.  Chef enables this approach.  Can you reconstruct your business from source code, data backup, and bare metal?  Well, you can get there.

When you talk about constraints that affect design, one of the largest and almost unstated assumptions nowadays is that it's really hard to recover from failure. Many aspects of technology and the thinking of technologists are built around that. Infrastructure as code makes that not so true, and is extremely disruptive to existing thought in the field.

Your automation can only be measured by the final solution.  No one cares about your tools, they care about what you make with them.

Chef Basics

There is a chef client that runs on each server, using recipes to configure stuff.  There’s a chef server they can talk to – or not, and run standalone.  They call each system a “node.”

They get a bunch of data points, or attributes, off the nodes and you can search them on the server, like “what version of Perl are you running.”  “knife” is the command line tool you use to do that.

Nodes have a “run list.”  That’s what roles or recipes to apply to a node, in order.

Nodes have “roles.”  A role is a description of what a node should be, like “you’re a Web server.”  A role has a run list of its own, and attributes to modify them – like “base, apache2, modssl” and “maxchildren=50”.

Chef manages resources on nodes.  Resources are declarative descriptions of state.  Resources are of type package or service; basically software install and running software.  Install software at a given version; run a service that supports certain commands.  There’s also a template resource.

Resources take action through providers.  A provider is what knows how to actually do the thing (like install a package, it knows to use apt-get or yum or whatever).

Think about it as resources go through a platform to pick a provider.

Recipes apply resources in order.  Order of execution is determined by the order they’re listed, which is pretty intuitive.  Also, systems that fail within a recipe should generally fail in the same state.  Hooray, structured programming!

Recipes can include other recipes.  They’re just Ruby.  (Everything in Chef is Ruby or JSON). No support for asynchronous actions – you can figure out a way to do it (for file transfers, for example) but that’s really bad for system packages etc.

Cookbooks are packages for recipes.  Like “Apache.”  They have recipes, assets (like the software itself), and attributes.  Assets include files, templates (evaluated with a templating language called ERB), and attributes files (config or properties files).  They try to do some sane smart config defaults (like in nginx, workers = number of cores in the box).  Cookbooks also have definitions, libraries, resources, providers…

Cookbooks are sharable – http://cookbooks.opscode.com/ FTW! They want the cookbook repo to be like CPAN – no enforced taxonomy.

Data bags store arbitrary data.  It's kinda like S3, keyed with JSON objects.  "Who all plays D&D?  It's like a Bag of Holding!"  They're searchable.  You can e.g. put a mess of users in one.  Then you can execute stuff on them.  And say use it instead of Active Directory to send users out to all your systems.  "That's bad ass!" yells a guy from the crowd.

Working with Chef

  1. Install it.
  2. Create a chef repo.  Like by git cloning their stock one.
  3. Configure knife with a .chef/knife.rb file.  There’s a Web UI too but it’s for feebs.
  4. Download some cookbooks.  “knife cookbook site vendor rails -d” gets the ruby cookbook and makes a “vendor branch” for it and merges it in.
  5. Read the recipes.  It runs as root, don’t be a fool with your life.
  6. Upload them to the server.
  7. Build a role (knife role create rails).
  8. Add cloud credentials to knife – it knows AWS, Rackspace, Terremark.
  9. Launch a new rails server (knife ec2 server create 'role[rails]') – can also bootstrap
  10. Run it!
  11. Verify it!  knife ssh does parallel ssh and does command, or even screen/tmux/macterm
  12. Change it by altering your recipe and running again.

Live Demo

This was a little confusing.  He started out with a data bag, and it has a bunch of stuff configured in it, but a lot of the stuff in it I thought would be in a recipe or something.  I thought I was staying with the presentation well, but apparently not.

The demo goal is good – configure nagios and put in all the hosts without doing manual config.

Well, this workshop was excellent up to here – though I could have used them taking a little more time on "Working with Chef" – but now he's just flipping from chef file to chef file and they're all full of stuff that I can't identify immediately because I'm, you know, not super familiar with Chef. They really could have used a more "hello world"-y demo, or at least stepped through all the pieces and explained them (ideally in the same order as the "working with chef" spiel).

Chef 0.8 introduced the “chef shell,” shef.    You can run recipes line by line in it.

And then there was a fire alarm!  We all evacuate.  End of session.

Afterwards, in the gaggle, Adam mentioned some interesting bits, like there is Windows support in the new version.  And it does cloud stuff automatically by using the “fog” library.  And unicorn, a server for people that know about 200% more about Rails than me.  That’s the biggest thing about chef – if you don’t do any other Ruby work it’s a pretty  high adoption bar.

One more workshop left for Day 1!


A Case For Images

After speaking with Luke Kanies at OpsCamp, and reading his good and oft-quoted article “Golden Image or Foil Ball?“, I was thinking pretty hard about the use of images in our new automated infrastructure.  He’s pretty against them.  After careful consideration, however, I think judicious use of images is the right thing to do.

My top-level thoughts on why to use images:

  1. Speed – Starting a prebuilt image is faster than reinstalling everything on an empty one.  In the world of dynamic scaling, there’s a meaningful difference between a “couple minute spinup” and a “fifteen minute spinup.”
  2. Reliability – The more work you are doing at runtime, the more there is to go wrong.  I bet I’m not the only person who has run the same compile and install on three allegedly identical Linux boxen and had it go wrong somehow on one of ’em.  And the more stuff you’re pulling to build your image, the more failure points you have.
  3. Flexibility – Dynamically building from stem cell kinda makes sense if you’re using 100% free open source and have everything automated.  What if, however, you have something that you need to install that just hasn’t been scripted – or is very hard to script?  Like an install of some half-baked Windows software that doesn’t have a command line installer and you don’t have a tool that can do it?  In that case, you really need to do the manual install in non-realtime as part of a image build.  And of course many suppliers are providing software as images themselves nowadays.
  4. Traceability – What happens if you need to replicate a past environment?  Having the image is going to be a 100% effective solution to that, even likely to be sufficient for legal reasons.  “I keep a bunch of old software repo versions so I can mostly build a machine like it” – somewhat less so.

In the end, it’s a question of using intermediate deliverables.  Do you recompile all the code and every third party package every time you build a server?  No, you often use binaries – it’s faster and more reliable.  Binaries are the app guys’ equivalent of “images.”

To address Luke’s three concerns from his article specifically:

  1. Image sprawl – if you use images, you eventually have a large library of images you have to manage.  This is very true – but you have to manage a lot of artifacts all up and down the chain anyway.  Given the "manual install" and "vendor supplied image" scenarios noted above, if you can't manage images as part of your CM system then it's just not a complete CM system.
  2. Updating your images – Here, I think Luke makes some not entirely valid assumptions.  He notes that once you’re done building your images, you’re still going to have to make changes in the operational environment (“bootstrapping”).  True.  But he thinks you’re not going to use the same tool to do it.  I’m not sure why not – our approach is to use automated tooling to build the images – you don’t *want* to do it manually for sure – and Puppet/Chef/etc. works just fine to do that.  So if you have to update something at the OS level, you do that and let your CM system blow everything on top – and then burn the image.  Image creation and automated CM aren’t mutually exclusive – the only reason people don’t use automation to build their images is the same reason they don’t always use automation on their live servers, which is “it takes work.”  But to me, since you DO have to have some amount of dynamic CM for the runtime bootstrap as well, it’s a good conservation of work to use the same package for both. (Besides bootstrapping, there’s other stuff like moving content that shouldn’t go on images.)
  3. Image state vs running state – This one puzzles me.  With images, you do need to do restarts to pull in image-based changes.  But with virtually all software and app changes you have to as well – maybe not a “reboot,” but a “service restart,” which is virtually as disruptive.  Whether you “reboot  your database server” or “stop and start your database server, which still takes a couple minutes”, you are planning for downtime or have redundancy in place.  And in general you need to orchestrate the changes (rolling restarts, etc.) in a manner that “oh, pull that change whenever you want to Mr. Application Server” doesn’t really work for.

In closing, I think images are useful.  You shouldn’t treat them as a replacement for automated CM – they should be interim deliverables usually generated by, and always managed by, your automated CM.  If you just use images in an uncoordinated way, you do end up with a foil ball.  With sufficient automation, however, they’re more like Russian nesting dolls, and have advantages over starting from scratch with every box.
