Ok, so this is old but I hadn’t read it before. InfoQ hosted “What is the Role of an Operations Team in Software Development Today?” The premise: You don’t need ops any more! DevOps means your developers can do ops. Ta da.
Well, besides separation of duties problems, this has a number of fundamental flaws. The first is the sheer amount of knowledge required and work to do. One of the greatest difficulties in hiring Web Ops people is finding the wide generalist/specialist skill set that they need – the whole first chapter of Web Operations is Theo Schlossnagle talking about that. The more skill sets you pack into one person, the less good they are at each of them. A good developer needs a huge skill set, and so does a good operations person. If you add “app server administration” to a dev, they’re going to have to “forget” Spring or something to make room, in a virtual sense. Sure, you can take a developer and teach them ops, it’s not totally foreign – but that’s because all these people come out of the same CS/MIS programs in the first place, duh. You can take a Flash developer and teach them embedded development too, but I think everyone understands what a fundamental retooling that is.
So is this just an idea from someone who has no idea what all Operations folks do? Maybe. I know I had one discussion inside our IT department with a development architect who, bridling at our concerns with a portal project, said “What do you people do anyway? Why do we need your team? You just move files around all day!” It’s the classic “I don’t know what all that job entails, so it must be easy” syndrome. But our systems team has a huge amount of institutional knowledge around APM, security, management, etc – heck, we try to spread it into the dev teams as much as we can, but there’s a lot. It’s similar to QA – sure, “developers can do their own testing” – but doing good load testing etc. is a large field of endeavor unto itself. If all testing is left to devs, you don’t get good testing. Doesn’t mean devs shouldn’t test, or write unit tests – they are a necessary but not sufficient part of the testing equation.
But you know, I would argue that from a certain point of view, maybe this is right.
Infrastructure = code, right? And if you are far down the path of automation and system modeling, then you redefine Ops as just a branch of development. One guy on the team knows SQL, another knows .NET, and another knows Apache config and Amazon AMIs. One tester knows how to do functional regression tests, another knows how to do load tests, and another knows how to do performance, security, and reliability testing. Sure, from a certain point of view these are simply all different technical skills in one big bag of skills, so a systems engineer who configures WebLogic is just a developer who knows WebLogic as one of their tools. And I think there’s a lot of truth to this really, and part of our DevOps implementation here has focused on moving ops onto the same agile tracking tool, bug tracking, processes, etc. that our developers use.
However, this misses another huge part of the equation – mixing reactive support and proactive work is a time-killer. It causes context switching and thrash, and both degrade efficiency.
Even in our Web admin team, we separated out into a “systems engineering” group and a “production support” group. The former worked on projects with developers writing new code, and the latter handled pages, requests, etc. around a running system. We did that because the interrupt-driven work from operations absolutely killed people’s ability to execute on projects. There’s a great part in the O’Reilly book Time Management for System Administrators that prescribes swapping off ops/support duty with other admins to reduce that problem.
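If you want to picture the “swapping off” idea, here is a minimal sketch of a rotation where one admin at a time takes the interrupts and everyone else stays on project work. The names, the weekly cadence, and the support_rotation helper are all made up for illustration; this is not how our team or the book actually schedules it.

```python
from datetime import date, timedelta


def support_rotation(admins, start, weeks):
    """Build a rotation: each week one admin handles interrupts, the rest do projects.

    Hypothetical helper; the one-week shift length is an assumption.
    Returns a list of (week_start, on_support, on_projects) tuples.
    """
    schedule = []
    for week in range(weeks):
        # Cycle through the admins in order so support duty is shared evenly.
        on_support = admins[week % len(admins)]
        on_projects = [a for a in admins if a != on_support]
        schedule.append((start + timedelta(weeks=week), on_support, on_projects))
    return schedule


if __name__ == "__main__":
    # Placeholder names and start date, purely for demonstration.
    for week_start, support, projects in support_rotation(
        ["ana", "ben", "cho"], date(2024, 1, 1), 4
    ):
        print(week_start, "support:", support, "projects:", ", ".join(projects))
```

The point is just that in any given week most of the team gets uninterrupted project time, and nobody is stuck on pager duty forever.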
Many developers don’t understand a running system. It’s been interesting being in R&D now at NI, where a lot of the development is driven by desktop software. For a long time we ran these public-facing Web demos where the R&D engineers would say, with a straight face, “You log into Windows, click to run this app, and then lock the screen.” Even the idea of running as a service was weird hoodoo.
Anyway, in IT here the apps teams have split out as well! There are App Ops groups that offload CI and production issues from the “main” App Dev groups; the systems engineers work more with the main App Dev groups, and the production support team works with the App Ops groups. And believe me, there’s enough work to go around.
Now, of course you need developers involved in the operational support of your apps. That’s part of the value of DevOps – it’s not all “Ops needs to learn new stuff,” it’s also “Devs need to be involved in support.” But in the end, development and operations are each huge areas, and “do it all” is not a meaningful plan for either. Developers helping with production support is a necessary but not sufficient part of operations.