All right, the corporatey part of the week (Velocity) is over, and the tech Illuminati have stayed for DevOpsDays Silicon Valley (used to be Mountain View) – with like 500 people!
To open, a funny Point Break DevOps parody! Ah, makes me want to watch that movie again.
DevOps + Agile = Business Transformation
The first talk is from Jesse Robbins (@jesserobbins)! Ex-Velocity co-host and co-founder of Opscode, he has now found a home in the TECH UNDERGROUND which is DevOpsDays. He started Velocity because he couldn’t share all the secret stuff they were doing at Amazon but knew it was so important and crucial to the Web and thus the world… Sometimes frighteningly so.
DevOps, he says, is the ability to consistently create and deploy reliable software to an unreliable platform that scales horizontally. The right tools and culture are critical to doing this successfully.
The Internet is becoming pervasive. Applications became customer service vehicles. Walmart and Amazon both understand this. Email killed the post office. These rips in the social fabric reveal something better. These changes are coming faster and faster and the technology that does this is ours – we build and run it.
Misaligned incentives cause conflict. “Operant conditioning.” People know what the good guys are doing, but they just can’t change themselves to do it – elephants can’t fly just by flapping their ears harder.
You can do your keep-it-small DevOps effort, but eventually you have to say “if we don’t do this everywhere we will fail” – that’s not a business or technology problem, it’s a culture problem. He’s given this speech inside a bunch of organizations and knows how much resistance there is to change because they all wriggle around like itchy bear cubs when he says it.
Circuit City’s downfall and Blockbuster’s downfall due to Netflix are examples of cultures making agility impossible. You can’t “agile out” of that. You can provide tools and culture but the overall foundation has to spread. True story, Blockbuster decided the brilliant way to get out of its death spiral was to buy Circuit City, which was also in a death spiral. And “make it up in volume” I guess. Is “being a meathead” a culture problem? I reckon.
Conway’s Law – you make things that are copies of your org structure.
Fundamental attributes of successful cultures:
- shared mission and incentives
- infrastructure as code
- applications as services
- dev+ops+all as teams
Full stack automation, commodity hw or cloud, reliability in the software, infrastructure APIs, code infra services – infra as product, app as customer.
Service orientation: versioned APIs, resiliency (design for failure), storage abstraction, push complexity up the stack, deep instrumentation.
Agile, trust basis, shared metrics and monitoring, incident management, service owners on call, tight integration (maybe you end up with dedicated network or sec oncall, like SREs, but at the core still collaborative), continuous integration, SRE/SRO to spread concepts, game days.
It takes time – amazon.com didn’t switch to EC2 till Nov 10, 2010.
- Start small, build trust & safety
- Create champions
- Use metrics to build confidence
- Celebrate successes
- Exploit compelling events – cause moments of openness
Continuous Quality: What DevOps Means for QA
By Jeff Sussna, @jeffsussna.
Old definition of quality – “does the software meet the spec?” But agile is about delivering value and cloud is about turning software into services. New definition of quality is “does the service help customers accomplish their jobs-to-be-done.”
A restaurant isn’t just about food delivery, there’s a lot of value creation in the whole chain. A service needs functionality, operability, deliverability, coherency (does it engage me throughout my journey).
New approaches include user-centered design, test-driven development, continuous delivery, MTTR over MTBF; build in testing and learn from failure.
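To make "MTTR over MTBF" concrete, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The formula and the numbers are my illustration, not something from Jeff's slides; the function name is made up for the example.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the service is up.

    mtbf_hours: mean time between failures (illustrative value, not from the talk)
    mttr_hours: mean time to recovery (illustrative value, not from the talk)
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be > 0 and mttr_hours must be >= 0")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    baseline = availability(mtbf_hours=100, mttr_hours=1)
    doubled_mtbf = availability(mtbf_hours=200, mttr_hours=1)
    halved_mttr = availability(mtbf_hours=100, mttr_hours=0.5)
    # Halving recovery time buys the same availability as doubling the
    # time between failures, because only the ratio MTBF/MTTR matters.
    print(f"baseline:     {baseline:.6f}")
    print(f"doubled MTBF: {doubled_mtbf:.6f}")
    print(f"halved MTTR:  {halved_mttr:.6f}")
```

The point of the sketch: on this model, recovering twice as fast is worth exactly as much as failing half as often, and faster recovery is usually the cheaper of the two to get.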
So QA changes. Boundaries blur and automation takes over the manual activities. New and more valuable role: represent the “service not software” perspective, watchdog those 4 attributes.
QA engineers need to lift their gaze above the mechanics of testing, treat tests as code, focus on building quality into the system (quality advocate). New skills include understanding and thinking about service (e.g. outage comms), ops (sec, monitoring), and process/automation.
[Ed: If we add all this devopsy stuff into the definition of done, QA should look at all of it.]
Good testers see systems and their parts (and gaps), ask probing questions, design good tests, engage that proficiency in design and test plan critiques.
So we need this new kind of testing as well as the old kinds; with continuous delivery, how do we keep up? People give a lot of “buts” about automated functional testing, but there are frameworks and DSLs that let you write changeable, encapsulated tests. And it’s a process problem – no one asks what it’ll take to test the system. Write code and tests together, commit them together.
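As a rough sketch of what "encapsulated" testing and "tests as code" can look like, here is a tiny example where the code, a small test helper, and the test all live in one file and would go into the same commit. Everything here (`Cart`, `CartScenario`, the test name) is hypothetical and mine, not from the talk.

```python
class Cart:
    """Code under test: a tiny, hypothetical shopping cart."""

    def __init__(self):
        self._items = {}  # sku -> (unit price, quantity)

    def add(self, sku, price, qty=1):
        if qty <= 0:
            raise ValueError("qty must be positive")
        _, old_qty = self._items.get(sku, (price, 0))
        self._items[sku] = (price, old_qty + qty)

    def total(self):
        return sum(price * qty for price, qty in self._items.values())


class CartScenario:
    """Test helper that hides how a cart is built, so tests state intent.

    If Cart's API changes, only this helper needs updating, not every test.
    """

    def __init__(self):
        self.cart = Cart()

    def with_item(self, sku, price, qty=1):
        self.cart.add(sku, price, qty)
        return self  # return self so calls can be chained

    def should_total(self, expected):
        actual = self.cart.total()
        assert actual == expected, f"expected {expected}, got {actual}"
        return self


def test_total_sums_quantities():
    CartScenario().with_item("book", 10, qty=2).with_item("pen", 1).should_total(21)


if __name__ == "__main__":
    test_total_sums_quantities()
    print("ok")
```

A runner like pytest would pick up the `test_` function on its own; calling it directly works too.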
Operability and deliverability need testing. Design for internal users too…
You still want QA (instead of obsoleting them) as attached to the customer and as an antidote to confirmation bias.
Continuous Quality – everyone is testing all the time, quality infused, QA is a mirror for the organization. There is still specialization.
Is your team instrument rated?
Culture. Is it hugs and beer? No, it’s incentives + human factors.
Why aviation as a DevOps analogy? It progressed from craft to trade to science to industry. DevOps is in the “late trade” phase of that development.
Incident response is good but the house is already on fire by then. In aviation there’s a lot of scale and you want to avoid the incidents in the first place.
Learning to fly – first, you learn visual flight rules. Use your eyeballs. Then you move to instrument flight rules – flying in the system.
Flying by instrument relies on standardization, communication (precision), expectations (responsibilities in a situation), remediation. It is not static, blindly relying on automation or process, or fun-verboten.
How to get there? Define your current process even if it’s weird, focusing on operational requirements, derive primitives, define an operational dictionary, and make sure the nonfunctional requirements are owned.
Formalize roles, responsibilities. There should be clear transfer of control on who “has the ball.” Drill/train and delegate. Priority classes. Fly|navigate|communicate.
Understand your org’s limitations.
Holding patterns/WIP are bad because they add chaos to the system.
Investigate outcomes. Should you have an external team investigate? “No blame” postmortems aren’t about not being a jerk and making people sad, but because it’s very unlikely a failure is “one guy’s fault” and it’s a red herring to think so. [I made a lovely “Root cause is a myth” custom t-shirt at the con! -Ed.]
You should have a day-to-day operational model that accounts for incentives and the human factors that make people able to deliver on them.
Leveling Up a New Engineer in a Devops Culture; Healthy Sustainability
By Gary Foster and Mercedes Coyle from Scripps
You want to hire a new engineer, teach them “our way,” inculcate a devops mindset from the beginning, add good practices and training to the local labor pool, and pay it forward.
Identify needs and outcomes desired and get a mentor. THEN go hire! They go to hackathons and stuff to hire. Incubators, boot camps (e.g. Hackbright).
And now the new engineer! She was looking for a place where she could get up to speed quickly, support and challenge but no hand holding, senior engineers to help a new engineer grow. She had basic skills in coding/testing/deploying and willingness to learn.
What to do on the job as a new person? Question what you don’t understand, avoid perfectionism, and speak up.
The mentor’s responsibility is to be patient, give them responsibility like seasoned engineers, ask them for ideas, teach problem solving rather than syntax, and not just give the answer.
So train ’em, listen to ’em, form a cult around ’em. Take responsibility for bad habits.
Adrian Cockcroft of Netflix (@adrianco) on beer pineapples and bottlenecks. “Cockcroft headroom plot” helps you see when there’s serialization due to a bottleneck.
Peco!!! @bproverb on how we have an incident driven culture and effectively reward failure. “Actionable alerts” are reactive and often we have sparse bad data. Analyze and track close calls, reward for prevention. Spend time with your data. No need to theorize if you have data, you can track close calls and pursue root cause. Find analyst ninjas. Close call focused analysis.
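Here is a minimal sketch of what "track close calls" could mean in practice: count the samples that came near a hard limit without breaching it, per service, so prevention work has data behind it. The 90% threshold, the `Sample` type, and the function names are my assumptions for illustration, not anything Peco showed.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    service: str
    value: float  # e.g. disk usage percent; hypothetical metric


def classify(value: float, limit: float, near_fraction: float = 0.9) -> str:
    """Label one reading relative to a hard limit.

    near_fraction is an assumed tuning knob: readings at or above
    limit * near_fraction, but still below limit, count as close calls.
    """
    if value >= limit:
        return "incident"
    if value >= limit * near_fraction:
        return "close_call"
    return "ok"


def close_call_counts(samples, limit: float, near_fraction: float = 0.9) -> dict:
    """Count close calls per service, ignoring actual incidents."""
    counts = {}
    for s in samples:
        if classify(s.value, limit, near_fraction) == "close_call":
            counts[s.service] = counts.get(s.service, 0) + 1
    return counts


if __name__ == "__main__":
    readings = [
        Sample("db", 95.0),
        Sample("db", 91.0),
        Sample("web", 50.0),
        Sample("web", 100.0),  # a real incident, not a close call
    ]
    print(close_call_counts(readings, limit=100.0))
```

The services with the most close calls and no incidents yet are the ones where somebody is either lucky or quietly doing good prevention work, and either way it's worth a look.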
David Hatten from UrbanCode/IBM. The positive powers of negative thinking. And nihilism. And criticism. Somebody needs a nap. Read Be Nice To Programmers.
Chantell Smith from ITSM Academy (subbing in for Jayne Groll) – what is devops culture? a multicultural society of frameworks and tools and standards and whatnot. There’s evangelists and detractors. Need communication to get over the cultural divide. Use the “git r done” scrum, not just for devs. Pair with a kanban board. ITSM is still a good thing and not lame! Let’s get a common dictionary/vocabulary. [Ed: So Gene and co., stop slacking on the devops cookbook!]
Systems theory for enjoyment of AWS – read John Gall’s Systemantics/The Systems Bible. Systems in general work poorly or not at all. Complex systems are always broken somewhere. Some simple services don’t even really work… Start simple and working and grow to complex and working. @whirlycott from Stackdriver (Philip Jacob)
I attended three openspaces on Day 1 afternoon.
Women in DevOps
The first was on getting more women into DevOps/related tech jobs. There were a lot of people and so we didn’t get too deep into any specific area of that. It was noted that benefits and especially maternity leave were super important and a good thing to stress in your job postings. Also that women are likely to not apply to “you must know all these 20 things” job descriptions. Though I’ve known guy engineers that fall into the same trap. “I only meet 19 of those, I’d best not apply.” Hint as a hiring manager – if you meet like half of the things, you’re well advised to put a resume in!
We churned a bit over the fact that largely, people are hiring by exercising their known-people networks, and since historically more engineers are male, that tends to be self-reinforcing. You can go deliberately look into female-tech boot camps and the like.
In the end, I think the core problem is that we’re working at a fast pace. We put out a job posting (or a call for DevOpsDays presenters, as was brought up as an example) and we look through the responses we get. If there’s no responses from women, then we can’t include them. But to break through that, we have to take the time to deliberately reach out (and figure out where to reach out to).
There was some talk about “wiping identifying information off” but I think that’s a blind alley. I’m personally more likely to interview/etc. a woman or minority for an engineering position to try to level the playing field; if all the resumes are “Candidate 26” then so much for that. I mean, maybe it’s true there’s a lot of old school tech companies out there who are like “wimmen on the front lines with us? Never!” but I have to say I’ve never seen that.
My main takeaway was that we should probably get the female engineers we have on staff, ask them to super-plumb their social networks, and get their views on what aspects of job descriptions/interviews/work environments are or are not attractive to them and double down on it.
Where the Hell are the Product Managers?
The second was one I proposed, entitled “Where the Hell are the Product Managers?” DevOps is nominally about bringing Ops into the agile team that is already a mashup of Product, Development, and QA. But unfortunately, despite 10 years post-agile manifesto, I find that healthy PM embedding into the agile team is honored more in the breach than in the observance. Furthermore, in terms of owning “nonfunctional” requirements, or God forbid, an entire platform-type product, they tend to not want to do that.
We had a good discussion; some people had good PM engagement and others didn’t. Few had success with PMs doing effective prioritization of nonfunctional requirements and most “platform teams” didn’t have a PM, though some did and reported that it was super awesome. In fact, Bryan Dove from here at Bazaarvoice talked about one team he worked on where the designer and marketing person came to colocate with the team as well and it was very effective.
The main takeaway was to continue to try to push the agile practice of crossfunctional, embedded and ideally colocated teams, because the results are so much better. And if one needs to hire more X (PMs, Ops, whatever) so that there can be one per product team, do it.
Running a DevOpsDays
The third was about running a DevOpsDays event. Since I helped run DevOpsDays Austin I went to that to share the love. If you’re looking to run one, we’ve made our budget and planning docs and everything available for others to crib from. My short playbook is:
- Get around 8 people as organizers, from a mix of companies. 2 will punk out and the other 6 will be able to share the load.
- Find a venue, that’s the most important thing. It’ll determine your capacity and whether you’re going to charge. We did a free DevOpsDays and had a large (>30%) no show rate, and then did a $120 DevOpsDays and had a small (<10%) no show rate.
- Don’t get fancy. Nail the hard requirements and then if you get excess sponsor money, add on other goodies. For DOD Austin we added a band and a movie and more swag and more snacks later as our bank account swelled, but we could have cut off after venue/internet/some food and been done with it.
- BMC loves to do the A/V and stream the event! But beware, once they leave after the morning events you’ll be without mikes and stuff.
- Patrick Debois has the usual schedule/format for you to use.
- Don’t worry about sponsor money, they’re lining up to pay you. It’s more important to set expectations – this isn’t a “high traffic sales leads” event and you don’t get the attendees’ emails – it’s better to send engineers than salespeople, you’re trying to affect influencers.
And that’s Day 1, expanded!