Tag Archives: agile

Scrum for Operations: Order from Chaos

Welcome to the second installment in Scrum for Operations, a series where I talk about (and go through) the process of doing systems work as part of a DevOps team according to the Scrum methodology. Last time, I introduced the basics of Scrum as it generally is used, and its key benefit of frequently delivering useful functionality. But I already hear the objections – “How can that turn out all right?” It is so light on process that one’s initial inclination is to dismiss it as “cowboy coding”, and we already know not to be “cowboy sysadmins,” right? One’s intuition might be (and mine was in the beginning, I’ll be honest) that this would lead to a metastable process that could not be sustainable without fundamental fatal flaws overtaking it.

Well, as I learned after trying to learn more and kicking the tires with our dev team here, there are several core disciplines that are agile’s saving graces.

Testing

We ops guys are used to testing being a neglected afterthought in the development process, often tossed over the wall to a QA team that isn’t well integrated into the product. Therefore we have a hard time trusting code that’s being handed over to us given our experience – we get it handed to us and it doesn’t work!

Well, agile pretty much understands that without pervasive testing, this kind of fast cycle process is doomed. At its extreme, some practitioners use Test Driven, aka Test First, development where failing tests must be written first and then code filled in behind it till the test passes. This creates a large inherent test framework.

Even agile groups that don’t do this almost always have metrics on unit test coverage and a required bar devs must hit.  Here, our desktop software group that’s newly using Scrum has the mandate that “there must be XX% unit test code coverage or you’re not ready to ship.”

Similarly, acceptance testing (automated continuous testing of user stories vs the code) is a common part of agile. Continuous ongoing testing ensures quality through the dev cycle and reduces the need for time-intensive, and mysteriously always insufficient, big-cycle regression testing.

This is a great culture. And there’s all kinds of different tests – unit test, integration/functional/regression testing, performance testing, fault testing… Starting to get interesting to you?  How about monitoring? In reality application monitoring is a special case of testing – it’s a “lightweight integration regression test.” Our initial approach to DevOps includes making test coverage goals for things like monitors and performance testing, because that plugs into the existing agile mindset well.

Bonus new terminology thing – the quick acceptance test you do upon release, which we always called “critical path testing,” is now being called “smoke testing” by the hip. Update your dictionaries!

A side note on formal QA groups. Just as we are working on DevOps, there has been previous work on how QA teams interact with agile dev teams, and there are a variety of different doctrines on how to split the work – often, it’s devs that are responsible for a lot of the testing. It’s a hard balance – you want the devs to be responsible for some of the testing because the best testing is “close to the code,” but just like with Ops, a real QA team has expertise beyond what a developer can just bolt in with 10% of their attention. Here, we have a dev team and also a remote QA team; devs test their own code on the daily build and then there’s a weekly push to a more stable environment where the QA team does acceptance testing and is moving into performance testing and the like.

Anyway, this endemic focus on testing and automation of testing and testing metrics is the pin that makes this agile flywheel actually turn without just flying off. (You are correct, some agile teams don’t do this – we call those “the unsuccessful ones.”)

And this is for you to do as well! There’s a whole post or series of posts in the topic “What does a unit test mean for something infrastructurey” – it is incumbent on you to figure it out and also have high test coverage with your work.

Refactoring

In general, agile dev is the epitome of horizon planning. You know you can’t get all the requirements ahead of time (or if you do get them all ahead of time, what you come out with won’t serve any real human’s need) and similarly preplanned architecture and design often doesn’t survive contact with the scrum. So it’s not “don’t plan or design,” but it’s “plan and design in an ongoing manner.”

This is one of the scariest parts for an ops person – we assume that we get one “bite at the apple”, and once we’ve set up the systems and let in the developers, we’ll never be allowed to change anything without a fight.  But developers have this problem internally all the time – one dev is working on a core library or API that other developers are using, and they don’t wait for core guy to get done before they start. Instead, they have adopted a concept they call refactoring. Refactoring just means that each sprint, you are open to redoing fundamental stuff that needs to change (or that you realize you did kinda ghetto in the first place).

Because this is an accepted part of the iterative approach, you get to leverage this as well.First iteration they get the basic Tomcat and mySQL install out of the repo, and they can get started – and then in the second iteration you front it with Apache, or tune the DB for security, or whatnot and they have to make some changes to fit. I’m not promising no one will ever cry about this, but it’s a part of what makes the culture successful so it’s there for you to leverage.

And for you to adhere to! Be open to refactoring your infrastructure based on the emerging project needs.

Source Control

A developer might not even mention this, and most books on agile don’t, because to them it’s so fundamental a discipline that it’s like breathing air. Sadly the same can not be said of Ops folks, so I’m mentioning it. When code is changed, it is in a shared source control repository – which gives other people on the group visibility into it (a collaboration touchpoint), is a common place to source it from (a deployment touchpoint), and can be used to easily manage multiple versions, even experimental ones, and merge or roll back changes.

This is the most fundamental empowering technology of modern software development (not just agile) and you must uptake it immediately or you have lost.  Fair warning. It is the stepping stone that will allow subsequent Cool DevOps Automation to happen.

Conclusion

These three disciplines convert agile/Scrum from dangerous free-for-all to a new technique that gets your product done both more quickly and with higher quality than a waterfall method. I’ll talk further next time about how Ops slots into all this, and how you can fit your systems admin work into a Scrum mindset.

Also note, there’s some other agile disciplines surrounding agile design and encapsulation and patterns and whatnot, which I don’t understand well enough yet to speak authoritatively on. Feel free and chime in with other core disciplines if you are!

Leave a comment

Filed under DevOps

Scrum for Operations: What Is Scrum

Agile

It’s not a mandatory part of DevOps, but I believe that DevOps works a lot better if operations teams adopt Agile.  But all that most systems teams know about Agile is that “it’s that thing that makes the development teams not plan worth a damn any more.”  Well, though there may be some truth to that, a well run agile process is very effective and not uncontrolled at all. Constructing and maintaining infrastructure in an agile manner is very possible – we’ve done it. In the beginning it seems daunting, just as it did initially to the legions of software developers who were steeped in waterfall based thinking, but once you adopt it I think you’ll see a lot of traditional pain points get a lot better – that was our experience. This is the first in a series of posts about using Scrum for Web operations, and I thought I’d start with explaining what Scrum is from an ops guy’s point of view.

Scrum

Scrum is one of the more popular agile methodologies.  Agile was initially defined by the Agile Manifesto and from there got turned into more specific implementations of its principles, methods, and practices, and Scrum is one of those more specific prescriptions on how to “do” agile. Other methodologies have been used for DevOps – like check out this great presentation on how Stephen Nelson-Smith (@LordCope) did a XP DevOps implementation at a British government agency.

While developing our first two products that we delivered on the cloud using DevOps, we used an “agile-ish” methodology, which is to say not a formal agile approach.  This time, we’ve decided to run Scrum by the book, not just for the feature developers but for our systems engineers, operations staff, security engineers, and system automation developers. Some folks are talking about hybrid models like Scrumban (Scrum + Kanban/Lean) to better incorporate ops work, but we’re going to start with Scrum first and see where we get to.

As a system administrator, that’s scary, because we don’t know Scrum. But Scrum is pretty darn simple.

Here’s two short videos that give a pretty good intro to Scrum.

Scrum in Under 10 Minutes, courtesy Axosoft:

Scrum Basics:

(P. S. Scrum master lady from this video… Call me!)

But if like me you think videos are not an efficient method of conveying information, here’s the short form.

The Team

In Scrum, you form a small crossfunctional team to all work together on a product instead of having to cross organizational boundaries and fill out forms to get any work done.  The roles consist of:

  • product owner – the “business guy” who has say over what features get greenlit and when
  • scrum master – the project manager who keeps everyone on track
  • team – ideally 5 to 7 developers writing code (this is where Ops will plug in, in DevOps)
  • testers, security, other folks – probably should be included too, but adoption of that varies

Of course in a mid to large sized organization there are other stakeholders, like customers and management and legal and whatnot. But you form a team that has everything it needs to complete its work.

The Backlog

All needed features are brainstormed and put into a master list of features called a “product backlog.” This backlog contains everything – including your systems tasks. The backlog is then broken up into smaller chunks, like release backlogs (features targeted for a specific release) and sprint backlogs (specific tasks for a sprint). The sprint backlog is basically your work breakdown structure, if you’re more comfortable with that terminology, except that the tasks are worded more in terms of what feature they provide instead of in terms of what specific things you need to do; this fosters communication with the product owner.

The developers (and, ideally, operations staff)  on the team help generate the backlog and provide time estimates in hours for each task. The product owner owns the prioritization and ordering (as constrained by things that are actual dependencies).

The Sprint

Work is performed in month-long iterations called “sprints.” Requirements are frozen for the sprint and the development is time-boxed – it must end at the termination of the sprint. At the end of the sprint, whatever features were put in that sprint should be complete and “ready to ship.”

The Standup

As the scrum, or sprint, progresses, there are daily “standups” – 15 minute meetings where everyone stands up, reports what they have done since the last standup, what they plan to accomplish by the next standup, and any roadblocks they are encountering. By keeping this meeting short it doesn’t waste time, but by having an actual face to face meeting you get very rapid and effective collaboration that cannot be achieved via managing project plans, sending emails, generating status reports, or the like.  It cuts out all the busywork and keeps the kernel of coordination that lets a team keep up velocity.

The Burndown

The progress of the team against the sprint backlog is tracked by a “burndown chart.” As each team member completes their tasks for the sprint you can easily see whether you are on track for successful completion or not.

And that’s Scrum in a nutshell.  Five things. Small integrated team, backlog, sprints, standups, burndown.

Next Time

I’ll talk about each of these areas more in depth later in the series, as we go through them ourselves as we develop a real product, and explain how a system administrator (aka systems engineer, infrastructure admin, operations ninja) can fit their work into this structure. But first, I will explain why Agile/Scrum is not just “crazy talk.” To a hardened system admin, or really to anyone used to working in a waterfall environment, it is very counterintuitive that this approach doesn’t just degenerate into the IT equivalent of orcs pillaging a city. But agile has several interesting practices that make it work, and should be very interesting to an operations person.

4 Comments

Filed under DevOps

OPSEC + Agile = Security that works

Recently I have been reading on OPSEC (operations security).  OPSEC, among many things, is a process for security critical information and reducing risk.  The 5 steps in the OPSEC process read as follows:

  1. Identify Critical Information
  2. Analyze the Threat
  3. Analyze the Vulnerabilities
  4. Assess the Risk
  5. Apply the countermeasures

It really isn’t rocket science, but it is the sheer simplicity of the process that is alluring.  It has traditionally been applied in the military and has been used as a meta-discipline in security.  It assumes that other parties are watching, sort of like the aircraft watchers that park near the military base to see what is flying in and out, or the Domino’s near the Pentagon that reportedly sees a spike in deliveries to the Pentagon before a big military strike.  Observers are gathering critical information on your organization in new ways that you weren’t able to predict.  This is where OPSEC comes in.

Since there is no way to predict what data will be leaking from your organization in the future and it is equally impossible to enumerate all possible future risk scenarios, then it becomes necessary to perform this assessment regularly.  Instead of using an annual review process with huge overhead and little impact (I am looking at you, Sarbanes-Oxley compliance auditors), you can create a process to continue to identify risks in an ever-changing organization while lessening risk.  This is why you have a security team, right?  Lessening the risk to the organization is the main reason to have a security team.  Achieving PCI or HIPPA compliance is not.

Using OPSEC as a security process poses huge benefits when aligned with Agile software development principles.  The following weekly assessment cycle is promoted by SANS in their security training course.  See if you can see Agile in it.

The weekly OPSEC assessment cycle:

  1. Identify Critical Information
  2. Assess threats and threat sources including: employees, contractors, competitors, prospects…
  3. Assess vulnerabilities of critical information to the threat
  4. Conduct risk vs. benefit analysis
  5. Implement appropriate countermeasures
  6. Do it again next week.

A weekly OPSEC process is a different paradigm from the annual compliance ritual.  The key of the security is just that: lessen risk to the organization.   Iterating through the OPSEC assessment cycle weeklymeans that you are taking frequent and concrete steps to facilitate that end.

Leave a comment

Filed under Security

Specific DevOps Methodologies?

So given the problem of DevOps being somewhat vague on prescriptions at the moment, let’s look at some options.

One is to simply fit operations into an existing agile process used by the development team.  I read a couple things lately about DevOps using XP (eXtreme Programming) – @mmarschall tweeted this example of someone doing that back in 2001, and recently I saw a great webcast by @LordCope on a large XP/DevOps implementation at a British government agency.

To a degree, that’s what we’ve done here – our shop doesn’t use XP or really even Scrum, more of a custom (sometimes loosely) defined agile process.  We decided for ops we’ll use the same processes and tools – follow their iterations, Greenhopper for agile task management, same bug tracker (based off HP), same spec and testing formats and databases.  Although we do have vestiges of our old “Systems Development Framework,” a more waterfally approach to collaboration and incorporation of systems/ops requirements into development projects.  And should DevOps be tied to agile only, or should it also address what collaboration and automation are possible in a trad/waterfall shop?  Or do we let those folks eat ITIL?

Others have made a blended process, often Scrum + Lean/kanban hybrids with the Lean/kanban part being brought in from the ops side and merged with more of a love for Scrum from the dev side. Though some folks say Scrum and kanban are more in opposition, others posit a combined “Scrumban.”

Or, there’s “create a brand new approach,” but that doesn’t seem like a happy approach to me.

What have people done that works?  What should be one of the first proposed roadmaps for people who want to do DevOps but want more of something to follow than handwaving about hugs?

7 Comments

Filed under DevOps

My Take On The DevOps State of the Union

DevOps has been a great success in that all the core people that are engaged with the problem really ‘get’  it and are mainly on the same page (it’s culture shift towards agile collaboration between Ops and Dev), and have gotten the word out pretty well.

But there’s one main problem that I’ve come to understand recently – it’s still too high level for the average person to really get into.

At the last Austin Cloud User Group meeting, Chris Hilton from Thoughtworks delivered a good presentation on DevOps. Afterwards I polled the room – “Who has heard about DevOps before today?” 80% of the 30-40 people there raised their hands.  “Nice!” I thought.  “OK, how many of you are practicing DevOps?”  No hands went up – in fact, even the people I brought with me from our clearly DevOps team at NI were hesitant.

Why is that?  Well, in discussing with the group, they all see the value of DevOps, and kinda get it.  They’re not against DevOps, and they want to get there. But they’re not sure *how* to do it because of how vague it is.  When am I doing it?  If I go have lunch with my sysadmin, am I suddenly DevOps?  The problem is, IMO, that we need to push past the top level definition and get to some specific methodologies people can hang their hats on.

Agile development had this same problem.  And you can tell by the early complaints about agile when it was just in the manifesto stage.  “Well that’s just doing what we’ve always done, if you’re doing it right!”

But agile dev pressed past that quickly.  They put out their twelve principles, which served as some marching orders for people to actually implement. Then, they developed more specific methodologies like Scrum that gave people a more comprehensive plan as to what to do to be agile.  Yes, the success of those depends on buyin at that larger, conceptual, culture level – but just making culture statements is simply projecting wishful thinking.

To get DevOps to spread past the people that have enough experience that they instinctively “get it,” to move it from the architects to the line workers, we need more prescription.  Starting with a twelve principles kind of thing and moving into specific methodologies.  Yes yes, “people over process over tools” – but managers have been telling people “you should collaborate more!” for twenty years.  What is needed is guidance on how to get there.

Agile itself has gotten sufficient definition that if you ask a crowd of developers if they are doing agile, they know if they are or not (or if they’re doing it kinda half-assed and so do the slow half-raise of the hand).  We need to get DevOps to that same place.  I find very few people (and even fewer who are at all informed) that disagree with the goals or results of DevOps – it’s more about confusion about what it is and how someone gets there that causes Joe Dev or Joe Operator to fret.  Aimlessness begets inertia.

You can’t say “we need culture change” and then just kick back and keep saying it and expect people to change their culture.  That’s not the way the world works.  You need specific prescriptive steps. 90% of people are followers, and they’ll do DevOps as happily as they’ve done agile, but they need a map to follow to get there.

3 Comments

Filed under DevOps

F5 On DevOps and WordPress Outages

Lori MacVittie has written a very interesting post on the F5 blog entitled “Devops: Controlling Application Release Cycles to Avoid the WordPress Effect.”

In it, she analyzes a recent WordPress outage and how “feathered” releases can help mitigate impact in multitenant environments.  And specifically talks about how DevOps is one of the keys to accomplishing these kinds of schemes that require apps and systems both to honor them.

Organizations that encourage the development of a devops role and discipline will inevitably realize greater benefits from virtualization and cloud computing because the discipline encourages a broader view of applications, extending the demesne of application architecture to include the application delivery tier.

Nice!  In my previous shop we didn’t use F5s, we used Netscalers, but there was the same interesting divide in that though they were an integral part of the application, they were seen as “Infrastructure’s thing.”  Apps weren’t cognizant of them and whenever functionality needed to be written against them (like cache invalidation when new content was published) it fell to us, the ops team.  And to be honest we discouraged devs from messing with them, because they always wanted some ill-advised new configuration applied when they did. “Can’t we just set all the timeouts to 30 minutes?”

But in the newer friendlier world of DevOps coordination, traditionally “infrastructure” tools like app delivery stuff, monitoring, provisioning, etc. need to be a collaboration area, where code needs to touch them (though in a way Ops can support…)  Anyway, a great article, go check it out.

Leave a comment

Filed under DevOps

Good DevOps Discussions

An interesting point and great discussion on “what is DevOps”, including a critique about it not including other traditional Infrastructure roles well, on Rational Survivability (heh, we’re using the same blog theme.  I feel like a girl in the same dress as another at a party.).  It seems to me that some of the complaints about DevOps – only a little here, but a lot more from Andi Mann, Ubergeek – seem to think DevOps is some kind of developer power play to take over operations.  At least from my point of view (an ops guy driving a devops implementation in a large organization) that is absolutely not the case.  Seems to me to be a case of over-touchiness based on the explicit and implicit critique of existing Infrastructure processes that DevOps represents.  Which is natural; agile development had/has the exact same challenge.

Note that DevOps is starting to get more press; here’s a cnet article talking about DevOps and the cloud (two great tastes that taste great together…).

And here’s a bonus slideshare presentation on “From Agile Development to Agile Operations” that is really good.

2 Comments

Filed under DevOps

Busting the Myths of Agile Development: What People are Really Doing

I just watched a good Webcast from an IBM agile expert about the state of Agile in the industry, and it had some interesting bits that touch upon agile operations.

Webcast – (registration required, sadly)

It’s by Scott Ambler, IBM’s practice leader for agile development.  They surveyed programmers using Dr. Dobbs’ “IT State of the Union” survey, which has wide reach across the world and types of programmers; the data in this presentation comes from that and other surveys.  All their surveys and results and even detailed response data (they’ve done a lot over time) are online for your perusal.

In the webcast, he talks some about “core” agile development extending to “disciplined” agile development that extends to address the full system lifecycle, is both risk and value driven, and has governance and standards.  “Core” begs the question of “where do requirements and architecture come from?” and “how do I get into production?”   Some agile folks who consider themselves purists say these aren’t needed and you should just start coding, he calls this view “phenomenally naive.”

He does mention some times where things like enterprise architecture were slow and introduced huge delays into the process because it took months to do reviews and signoffs.  He calls this “dysfunctional” but really isn’t it the way things are unless there’s a pattern to change it? I think project management and enterprise architecture are suffering from the same problem we operations folks are, which is that we’re just now figuring out “what does it mean to incorporate agile concepts into what we do?”

The meat of the preso is over agile team metrics and busting myths about “it’s only for small colocated teams…”  Here’s my summary notes.

  • Agile teams have a success rate higher than traditional teams.  Agile and iterative about tie and come in with much higher (2-4x) result on quality, functionality, cost efficiency, and timeliness than traditional or ad hoc processes.
  • Agile is for not just coding, but project selection/initiation, transition, ops, and maintenance.  More and more folks are doing that, but some get stuck with doing agile only in the coding and not the surrounding part of the lifecycle.
  • Agile is being used by co-located teams in only 45% of the time; all the rest are distributed in some manner.  But that does affect success some  – “far located” teams have 20% lower success than fully co-located.  Don’t distribute if you don’t have to.
  • Most orgs are using agile with small teams, the vast majority are size 10 or less.  But they see success with even the very large teams.
  • We’re seeing people successful with agile under complaince frameworks liek Sarbox, and governance frameworks like ISO – even ITIL is mentioned.
  • Agile isn’t just for simple projects – the real mix in the wild is actually weighted at medium to very complex projects.
  • Though agile is great for greenfield projects, there’s very large percentages of teams using it on legacy environments.  COTS development is the rarest.
  • 32% of successful agile teams are working with enterprise architecture and operations teams.  Should be more, but that’s a significant inroad.  He says those teams are also most successful when behaving agile (or at least lean).
  • Biggest problems with agile adoption are a waterfall culture (especially one where the overall governance everyone has to plug into is tuned to waterfall) and stakeholder involvement.  Testers say “We need a detailed spec before we can start testing…”  DBAs say “Developers can’t code until we have a complete data model…”  Management resistance is actually the lowest obstacle (14% of respondents)!

A lot of nice stats.  The two biggest takeaways are “agile isn’t just for certain kinds of projects, it’s being used for more than that and is successful in many different areas” and “agile is for the entire lifecycle not just coding.”  As advocates of agile systems, I think that’s a good sign that the larger agile community is wandering our way as we’re building up our conception of DevOps and wandering their way ourselves!

Leave a comment

Filed under DevOps

Our First DevOps Implementation

Although we’re currently engaged in a more radical agile infrastructure implementation, I thought I’d share our previous evolutionary DevOps implementation here (way before the term was coined, but in retrospect I think it hits a lot of the same notes) and what we learned along the way.

Here at NI we did what I’ll call a larval DevOps implementation starting about seven years ago when I came and took over our Web Systems team, essentially an applications administration/operations team for our Web site and other Web-related technologies.  There was zero automation and the model was very much “some developers show up with code and we had to put it in production and somehow deal with the many crashes per day resulting from it.”  We would get 100-200 on-call pages a week from things going wrong in production.  We had especially entertaining weeks where Belgian hackers would replace pages on our site with French translations of the Hacker’s Manifesto.  You know, standard Wild West stuff.  You’ve been there.

Step One: Partner With The Business

First thing I did (remember this is 2002), I partnered with the business leaders to get a “seat at the table” along with the development managers.  It turned out that our director of Web marketing was very open to the message of performance, availability, and security and gave us a lot of support.

This is an area where I think we’re still ahead of even a lot of the DevOps message.  Agile development carries a huge tenet about developers partnering side-by-side with “the business” (end users, domain experts, and whatnot).  DevOps is now talking about Ops partnering with developers, but in reality that’s a stab at the overall more successful model of “biz, dev, and ops all working together at once.” Continue reading

Leave a comment

Filed under DevOps

dev2ops Interview

Want to hear me spout off more about DevOps?  Well, here’s your chance; I did an interview with Damon Edwards of DTO and they’ve posted it on the dev2ops blog!

Killer quote:

“I say this as somebody who about 15 years ago chose system administration over development.  But system administration and system administrators have allowed themselves to lag in maturity behind what the state of the art is. These new technologies are finally causing us to be held to account to modernize the way we do things.  And I think that’s a welcome and healthy challenge.”

Leave a comment

Filed under DevOps