Category Archives: DevOps

Pertaining to agile system administration concepts and techniques.

DevOpsDays Austin is Back For 2022!

Your DevOpsDays Austin 2022 Organizers!

Well, we had to skip 2 years in a row due to the pandemic, but we were finally able to have DevOpsDays Austin in person again in 2022! It’s the tenth anniversary of DoD Austin; we had the first one at National Instruments back in 2012, one of the early DevOpsDays in the US.

We had to move to a new venue, and used the beautiful University of Texas Etter-Harbin Alumni Center (our site lead Bill is a UT alum which makes it half price!). The Etter-Harbin Center is right across from the stadium where we had DoDA in the years leading up to the hiatus. It has plenty of great outdoor spaces, which we used for lunches and happy hours, as well as a great main ballroom with views of the outside. It worked great for our target of 350 attendees, and we think we could make it work for more in the future.

We were thinking about themes and came across the perfect one – it’s our tenth anniversary, and we’re just back from the pandemic, and DevOps is also just a little more than ten years old and at a weird inflection point that has people asking “is DevOps dead? Where does DevOps go from here?” So we decided that since we were also in the Alumni Center, the obvious theme was our 10 Year Class Reunion!

We don’t take our themes lightly at DevOpsDays Austin. We settled on a new format for our talks – instead of the normal CFP for whatever technical and culture topics, we required all talks to be in a retrospective format – reflecting on what you’ve learned over the years of DevOps and what you think the future holds. We had lots of great speakers, many of whom are long time parts of the DoD Austin community, both locals like Rob Hirschfeld, Christa Meck, Ross Dickey, and Victor Trac, as well as folks from other parts of the Earth like Patrick Debois, Damon Edwards, J. Paul Reed, Pete Cheslock, and Michael Cote, who all frequently come to Austin to share with us.

And we printed a yearbook, with pics of speakers from all the events, our tshirts over the years, and more! Very snazzy, and we had people sign each other’s yearbooks to add some fun to the hallway track. In fact, you can view the yearbook online and buy a hardcopy here if you want!

The DevOpsDays 2022 Yearbook

We did require COVID protocols – masking inside and (honor system) vaccine or test, and while it is a bummer to not see each other’s faces, it also resulted in only one person I know of getting COVID the week after, so well worth it.

We didn’t have to worry about sponsor interest! We sold out quickly. Here’s the ones I got snaps of!

Everything went great, and it was super to finally get back together and interact with our local DevOps community. J. Paul Reed led a great session where a retro was done on DevOps in general!

And one of the best things is that we managed to carry on our tradition of giving our excess proceeds to charity! I’ll do a separate post on that, but the short form is that we contributed $28,000 to LGBT-supporting charities, half to The Trevor Project and half to Out Youth here in Austin, bringing us to $100,000 given to charity over our 10 years in existence! Stay tuned for more details on that…

Filed under Conferences, DevOps

All Day DevOps Is Coming Up

And James, Karthik, and I are hosting the Modern Infrastructure track again this year! ADDO is a free, 24 hour, multi-track online conference with a lot of great speakers. More info follows…

What: All Day DevOps, Live Online

When: November 12, 2020 (24 hours)

Where: From your desktop, laptop, or mobile device

Free: Click here to register

On November 12th, we will be supporting the 5th Anniversary Celebration for All Day DevOps. This is a 24 hour event with 6 simultaneous tracks, delivering 180+ sessions, live online. Session tracks include CI/CD Continuous Everything, Cloud Native Infrastructure and Monitoring, DevSecOps, Cultural Transformation, Site Reliability Engineering and Government.

Check out the amazing speaker lineup. Registration is free. Full details are located at AllDayDevOps.com.

Virtual Viewing Parties: Hosting a virtual viewing party is free for anyone in the community. You supply the group and the connection while All Day DevOps provides the content. Here are some guidelines that will provide detailed information and tips to assist you with your party planning.

Filed under Conferences, DevOps

Change Management – Share Your Thoughts!

Hey loyal admineers! (I just made that up.) I wanted to toss a question out to our readers. I’m working on a Change Management course for LinkedIn Learning right now to join my other courses, and I was hoping to hear some good new techniques people are using to do it that a) ensure compliance but b) are not super heavy and lame.

My current approach is to mention the ITIL, COBIT, ISO 27000-1, etc. approaches but then come in with a practical approach inspired by Visible Ops and leavened with DevOps innovation.

Chime in in the comments – how do you do change management? What compliance regimes do you have to fulfill? Are you using one of the ITSM frameworks? Are you using a tool (ServiceNow aka “ServiceNo” followed by JIRA were the two most popular when I asked at the latest CloudAustin meeting)? Are you using any techniques that you think are excellent and would like others to hear about? I’d love to hear from you!

Filed under DevOps

Pragmatic Pipeline Security

Check out agile admin James Wickett’s talk from DeliveryConf last month on adding security into your continuous software delivery pipeline!

Filed under Conferences, DevOps, Security

Record and Replay Browser Testing, Take 2

Recently I reported, and I quote, “Bah!” from trying a bunch of record and playback cross browser testing options. To recap, we’re a startup, our devs are writing UI tests but we don’t have cross-browser testing, so I tried to find something where I could record and replay our pretty simple Angular UI flow and get some cross-browser testing on it without needing code. And I didn’t find anything that worked. But I got a bit more traction after that first run at it, so here’s part 2.

EndTest

The EndTest support folks looked into it and fixed my test so it worked. Some were alternate locator schemes.  Some were advanced options I am not sure how I would have figured out (to close those pesky multiselect dropdowns, you can’t just click on the overlay backdrop that comes up, you have to offset by some pixels…).  I am not a front end guy, I just fake it, so this is a little daunting, but their support seems able to help with tests so it’s doable.

I now have a working test, somewhat generated from the recording and somewhat generated by programming. The trial doesn’t allow other browsers by default but they turned them on for me.  With some light fiddling I got them all running, and the only issue was IE not running because it didn’t like that pixel-offset workaround from the above and EndTest fixed it by changing my test to hit “enter” instead – fair enough.

I then also made a test for the second part of our flow, which has a PDF that needs validation; you can do that with a screenshot comparison.

So after help from their support, I have a working test suite – it’s a stretch to say it’s pure record and replay; it’s record, edit the locators a bunch, and replay, but since my last iteration on this was “none of these solutions work and do crossbrowser”, that’s a big win.

Sauce Labs

Last time I had been trying to use the Selenium IDE for record/replay and integrate with Sauce for the crossbrowser testing. They got back to me and extended my trial minutes and gave me some tips on how to get some of the tests working.  They don’t support the Selenium IDE however so they don’t really help with the tests per se.

I had a weird experience: there were intermittent (but frequent) timeouts from running tests against Sauce with selenium-side-runner. At some point 0-4 minutes into the test it’d hang, time out, and give me an “ETIMEDOUT connect ETIMEDOUT 66.85.49.22:443”.
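One way to take some of the sting out of intermittent connect timeouts like this is to wrap the runner in a small retry loop. The following is only a sketch, not something from the test setup described here: the `retry` helper, its arguments, and the `SAUCE_URL` variable in the usage comment are made up for illustration, and retrying hides flakiness rather than fixing it.

```shell
#!/usr/bin/env bash

# Hypothetical helper: retry <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
# Returns 0 on the first success, otherwise the exit status of the last attempt.
retry() {
  local max_attempts=$1 delay_seconds=$2
  shift 2
  local attempt=1 status=0
  while true; do
    "$@" && return 0
    status=$?
    if (( attempt >= max_attempts )); then
      echo "giving up after ${attempt} attempts (exit ${status})" >&2
      return "$status"
    fi
    echo "attempt ${attempt} failed (exit ${status}), retrying in ${delay_seconds}s" >&2
    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done
}

# Example usage (SAUCE_URL is an assumed variable holding your Sauce endpoint):
# retry 3 10 selenium-side-runner --server "$SAUCE_URL" -c "$platform" "$test"
```

Note that this reruns the whole test on any failure, including real assertion failures, so it burns trial minutes quickly.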

And then I got into the lovely rich set of test things that don’t work the same across the browsers.  Oh, in Edge “the element isn’t clickable because it’s obscured.”  In Firefox on Mac “that radio button isn’t clickable because the inner part of the radio button obscures it.” All different elements.  The problem is, fiddling with the locators to get something that browser likes breaks it in the other browsers.  But the app works in those browsers, just not the tests.

Anyway, Sauce support said “we can’t support Selenium IDE questions” so I guess that’s it.  (I don’t know that those timeouts when running tests on their service count as selenium IDE questions but whatever). I had hope when I found Selenium IDE that a record and replay with Sauce was feasible but it seems like it’s a starting point to seed your own Selenium code at best.

mabl

A sales rep from mabl wouldn’t let go of me till I tried their solution. So I did and it has a lot of promise.  It tries a bunch of different methods to find a locator automatically and self-heals the tests, which is great.  On the one hand the tests are slow; a 4 minute selenium test is a 6-8 minute mabl test. On the other hand it works!

It worked the first time in fact (Well… second, but it was because I went into a click frenzy trying to get the mabl trainer window and our Hubspot popup out of my way). I now have a working mabl test, though it is having one-minute timeouts and confusion on that same “click on the backdrop to get out of the multiselect dropdown” issue that’s a pain in all the other solutions as well. It does it, but only after a long timeout, and it doesn’t self heal it for the next test run, so it works but is slow. I hate multiselect dropdowns. Anyway, after a discussion with mabl support it turns out that it tries a bunch of ways to find a locator and then self heals if it found a better one, and then tries a bunch of ways to click/exercise that locator but doesn’t self heal from that, hence the long timeout in my test.  I gave them the feedback “self heal that too, yo!”

OK so it’s working, how about cross-browser?  I add Firefox – it works.  IE and Edge are only on the “enterprise plan” but I ask them to add them to the trial, they do, I run them, and…  They work!  Safari… Well, a problem there, but mabl looks at it and thinks it may be them; they’re still working on a fix as of “press time.” I’m super impressed; of all the stuff I’ve tried only GhostInspector actually recorded and replayed without significant recoding, and that was only Chrome/Firefox. They also do PDF testing, which we need.

So it works great and looks good!  And they have a lot of cool dev integration and stuff to get into, it uses a branching model for the tests, you can run the tests locally…  Great looking ecosystem. You can’t choose like OS platform though, the Chrome and Firefox are just on Linux. At our current level of detail that’s fine.

Next step is discussing pricing (Web site just says “contact us”)… And unfortunately it’s way, way out of reach of a 10-person startup. The cross-browser and PDF testing are part of the “Enterprise plan” but even if I sweet talk them into putting those features in their lowest cost plan it’s still cost prohibitive for us while we’re in seed round.  I mean, it’s totally worth it because it works out of the box without fiddling around, if I were still at AT&T Cybersecurity I’d make someone spring for it for sure, but it’s an extremely significant pricing step above these other solutions and not at the value point for where we are right now.  Dang.

New Bottom Line

OK so my new bottom line is this.

  1. If you just want quickie Chrome/Firefox on Linux record and replay, GhostInspector works out of the box. Nice but I want more browsers, doesn’t quite fit my needs.
  2. If you want record and replay on various OS/browser combinations and are willing to do some re-coding and testing of it to make it work, EndTest does it. Fits my need with a little pain, and is affordable.
  3. If you want cross-browser record and replay without hand coding, without full platform choices but with a bunch of cool dev friendly extras, mabl does a great job – for a lot more money than the other options.  Best for my needs, but most expensive.
  4. The other options basically don’t work on Angular (Crossbrowsertesting), or possibly work but with intensive time investment that makes it not really record and replay IMO (Sauce + Selenium IDE).
  5. Though if you don’t need a bunch of CI-driven executions per day and don’t care about all the platforms you can probably just use the Selenium IDE to do 3 major browsers locally installed on whatever laptop you have yourself for free (Safari/Chrome/Firefox on Mac or Chrome/Firefox/Whatever Microsoft Is Pushing Today on Windows). Free but DIY.

So for us being on a seed round budget, EndTest is probably the best compromise of functionality and price at this point, YMMV of course.

Filed under DevOps

Trying To Record And Replay Browser Tests… FML

I’m working for a startup right now and we don’t have a huge excess of development staff.  Our devs have been implementing UI testing in Cypress, but we also need some wide cross-browser testing of our front end Angular apps – we’d already found a couple blocker bugs on Edge and IE largely by accident.  The devs are all busy devving, so I figured I’d take that on. I said, “Well, there’s products where you can just click to record a UI session and replay it in other browsers without writing a bunch of code, let’s try that out.”  Most everyone has a free trial nowadays so I could see which ones were best. Then the pain began.

Sauce Labs – Part 1

I had used Sauce in a previous life, when we had a bunch of Robot Framework/Selenium tests and I liked it.  So I went there first.  Unfortunately, they have no record/replay capability, verified by their support, so I moved on.

But I came back later because I had found that there was a Selenium IDE that’s a record and playback tester that you can integrate with Sauce by using selenium-side-runner.

Selenium IDE is very cool, its killer feature is that as it records it copies various ways to address an item on the screen – css, xpath, full xpath – and when it replays if the first one doesn’t work it tries the next one and if that works tells you “hey you should update this test.” That’s great because UI testing is shitty and unreliable at best, and once you have Angular generating ever-changing ids for elements it is even worse.  The only bad thing is you have to go add assertions in manually afterwards.
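The fallback idea can be sketched in a few lines of Bash. This is an illustration of the concept only, not Selenium IDE’s actual implementation: `first_working_locator` and the probe command it takes are hypothetical names, and the probe stands in for whatever really checks whether a locator resolves on the page.

```shell
#!/usr/bin/env bash

# Hypothetical helper: first_working_locator <probe-command> <locator>...
# Tries each locator in order with the probe command and prints the first
# one that works. If that is not the first (primary) locator, it warns on
# stderr that the test should be updated.
first_working_locator() {
  local probe=$1
  shift
  local primary=$1 locator
  for locator in "$@"; do
    if "$probe" "$locator"; then
      if [[ "$locator" != "$primary" ]]; then
        echo "primary locator failed; consider updating the test to use: ${locator}" >&2
      fi
      printf '%s\n' "$locator"
      return 0
    fi
  done
  echo "no candidate locator matched" >&2
  return 1
}
```

The ordering of the candidates matters: whatever is listed first is treated as the preferred locator, and anything later is a fallback.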

So in fairly short order, I managed to get a reproducible Selenium IDE script that exercised our Angular app and works.  The app’s just like 7 screens of form fill, it’s not crazy.

Well, then I tried to save it as a “.side” project and feed it through Sauce by using selenium-side-runner, which is just:

npm install -g selenium-side-runner

selenium-side-runner --server <sauce-url> -c "browserName='chrome' version='latest' platform='macOS 10.14'" 'Paul Precision.side'

You get that sauce URL that has credentials embedded under User Settings/Driver Creation in their UI.

Unfortunately once I push it to sauce (starting on the same OS/browser, which you go get the tokens for from their Platform Configurator) – problems. The player is great, it shows the video (even live while testing) synced to the step taking place (unfortunately since I’m piping it in, it’s not showing the steps in the test syntax, but in raw Selenium execution syntax).

I fixed most of them by going and changing selectors away from CSS to xpath, then sitting there iterating with chrome dev tools and the IDE trying new ways to use an item that works in chrome then works in selenium IDE and then… Doesn’t work on Sauce. I have gotten it 90% working but the last 10% is blocking me.

CrossBrowserTesting

Next I tried SmartBear’s CrossBrowserTesting.com. An all-in-browser recorder that worked great!  And then the replays didn’t work.  I messed with it a while and contacted their support, who said “Oh yeah it doesn’t do angular, it’s for static pages.”  So on to the next one. Who uses static pages, this is 2020?

The interface is nice enough, editable steps next to a running video (though not synced up).

Actually looking at it closer I bet I could do the same “edit all the locators” deal and try to get it to work but… My 7 day trial is over (a week shorter than the other options) so I guess I can’t try.  It didn’t do the nice multi-locator guessing Selenium IDE did but it does seem to have several options in a dropdown while I edit the tests, and the recorder is integrated into the offering so that’s nice – the UI was good overall. Unfortunately the super short trial and the presales support saying “Angular? Go away!” prevented me from really seeing if it can work for us.

GhostInspector

Demoralized, I head to Twitter, and someone recommends GhostInspector.  You record it with a Chrome plugin and then replay in the browser. You get a video, but then it shows the editable steps next to a screenshot showing % change from the last screenshot (the steps aren’t synced to the video, which would be better).  You can do assertions while you’re recording.  And the replay works the first time and every time – Hallelujah!

And then I look to set up cross-browser and discover they only support Chrome and Firefox, and to even do that in an automated manner you have to duplicate the entire test suite.  I was so disappointed, it worked perfectly otherwise.

Seriously y’all if you add more browsers I’ll pay you immediately for this.

EndTest

Determined to make this happen I find EndTest and, after verifying they support a full OS/browser matrix, try them. They also use a browser plugin recorder like GhostInspector.

I’ll be honest, the UX is terrible.  Besides the 1990s colored icons, everything is always a click away – you have to watch the replay video separately from looking at the logs from looking at the steps from editing the steps. Everyone, the magic combination is editable steps to the left, running video and logs to the right, highlight the step you’re on as it plays. Anything else harms your usability.  And also while editing steps you can’t add a step in just anywhere, you have to add it at the end of your 100 steps and then drag it up page by page…  And often when you do that you just get “error saving test” messages for no reason. Argh.

But… The recording is quick and then it is semi working.  Tempting.  Now I start the iterative edit-replay-debug cycle.  It is slow. You get to give your steps a name but those names don’t show up in the test output, because why would they.  After an afternoon of fiddling, I’m halfway through a 7 screen flow. Their support was nicely proactive and reached out to me about an error (I was looking for text with a $ in it and you can’t do that, but you can define a variable and then use that…)

It’s at this point I also find the Selenium IDE and bring Sauce back into the mix.

Keep Trying – Sauce and EndTest

Next, what I was doing was fiddling with the steps in the Selenium IDE, then pumping those changes both into Sauce via CLI and manually editing them into EndTest’s UI, desperately hoping to get one to pass (they don’t act the same under the same inputs for whatever reason).

Locator by locator I grind through making the test work.  I have a lot of trouble where we use multiple option mat-selects, because they “stay open” while you select items and I can’t get them to close.  I try sending ESCAPE keys but can’t get that to work, I try double clicking on other things…  One of our devs figured out the magical thing to click on was the overlay backdrop (css=.cdk-overlay-backdrop) to close the damn multiselect box.

This takes several grueling days.  I ask support folks for help but don’t really get any useful traction.  Finally, I get a magic combination in the Selenium IDE that also works in Sauce!  I try the same ones in EndTest and they don’t work.

It’s super frustrating.  The same locator doesn’t work in all 3 tools, often forcing me to choose a less portable option – instead of something resilient to change like xpath=//span[contains(.,'Visual Line of Sight')] – which works in some cases – I end up having to use something like xpath=//mat-option[@id='mat-option-87']/mat-pseudo-checkbox (and sadly in Angular Material those IDs randomize unpredictably). Like, there will literally be two identical-except-for-the-text-and-ids-in-them widgets one after another, and one kind of locator works on the first one and not on the second. No idea why.

Sauce Labs – Part 2

OK, so of all the options the only one that actually works for me and will allegedly do crossbrowser testing is an unsupported combo of Selenium IDE and Sauce run off the command line.  A couple sources I found over the course of this:

Not optimal, but at this point I’m a week in and taking what I can get.  Let’s try an actual crossbrowser matrix now.  Bonus hacky Bash script:

#!/usr/bin/env bash

# Selenium IDE project files to run
tests=("Paul Precision.side")

# One capability string per OS/browser combination
platforms=("browserName='chrome' version='latest' platform='macOS 10.14'"
        "browserName='chrome' version='latest' platform='Windows 10'"
        "browserName='chrome' version='latest' platform='Linux'"
        "browserName='MicrosoftEdge' version='latest' platform='Windows 10'"
        "browserName='safari' version='latest' platform='macOS 10.14'"
        "browserName='firefox' version='latest' platform='macOS 10.14'"
        "browserName='internet explorer' version='latest' platform='Windows 10'")

# Run every test on every platform. Replace <secrets> with your Sauce credentials;
# the URL is quoted so the shell does not treat < and > as redirections.
for test in "${tests[@]}"
do
        for platform in "${platforms[@]}"
        do
                echo Running "${test}" "${platform}"
                echo
                selenium-side-runner --server "https://<secrets>@ondemand.saucelabs.com:443/wd/hub" -c "${platform}" "${test}"
        done
done

Chrome on MacOS – works.  Chrome on Windows – works.  Chrome on Linux – for some reason can’t find a selector early on.  Edge on Windows – weird proxy 400 error, won’t even load the page.  Pretty sure that’s not my fault.  Safari on MacOS – can’t click on the first things it needs to click on.  Firefox on MacOS – same error?  Really?  Now IE… Out of minutes (despite the UI telling me .6 automated hours remain).
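For a matrix like this it helps to collect pass/fail per platform rather than eyeballing the scrollback. A minimal sketch, assuming the same platform strings as the script above; `run_matrix`, `run_one`, and `SAUCE_URL` are hypothetical names for illustration, not part of selenium-side-runner.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run_matrix <runner-command> <platform>...
# Calls the runner once per platform string, sends the runner's own output
# to stderr, and prints a PASS/FAIL line per platform on stdout.
# Returns 0 only if every platform passed.
run_matrix() {
  local runner=$1
  shift
  local platform failures=0
  local -a results=()
  for platform in "$@"; do
    if "$runner" "$platform" >&2; then
      results+=("PASS  ${platform}")
    else
      results+=("FAIL  ${platform}")
      failures=$((failures + 1))
    fi
  done
  printf '%s\n' "${results[@]}"
  (( failures == 0 ))
}

# Example usage (SAUCE_URL is an assumed variable holding your Sauce endpoint):
# run_one() { selenium-side-runner --server "$SAUCE_URL" -c "$1" "Paul Precision.side"; }
# run_matrix run_one "${platforms[@]}"
```

Keeping the summary on stdout and the runner noise on stderr means the summary can be redirected to a file on its own.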

I have tried all these os/browser combos manually and they work.

So my conclusion is all these suck and I guess I just need to pay manual QA people to click on our app.  Great.  Or for Cypress to get off their butts and add cross-browser support, which they have been saying “is coming” for three years now.

We’re a startup and time is money, so in the end cross-browser testing is not worth the hassle in all these solutions.  But it is important and I’d love someone to make a solution that actually works for it.

P.S. Please do not suggest another solution unless it has a) UI record and replay capability and b) is cross browser (Chrome/Firefox/Safari/IE/Edge on Windows/MacOS/Linux). I know there’s a million browser automation testing tools out there, that’s not what I need.

Update

I put some more time into this and got some working options – see Record and Replay Browser Testing, Take 2!

Filed under DevOps

Why Blaming “Human Error” Is Wrong

I’ve been writing a LinkedIn Learning course on postmortems lately and digging into all the fun research on the topic (Dekker, Hollnagel, and so on), and deepening my knowledge on the things I hope most of you know (root cause is a myth, blaming “human error” is wrong…).

I came across an example that really brought it home to me why the continuous blaming of human error is wrong – not “mean,” not “unenlightened,” but just plain logically ineffective.

One of the classic examples of design choices contributing to aviation accidents is the similarity and close placement of landing gear and flap controls in airplane cockpits. Pilots lower the landing gear, then when they land they pull the flaps – but a small miss has them retract the landing gear instead and they pancake in.

In fact, the US Air Force did a study at the close of World War II where they looked back at all kinds of “pilot error” crashes and identified a bunch of design problems that contributed, and the flap/landing gear confusion was #2 on the list, forming 16%

“Analysis of Factors Contributing to 460 ‘Pilot-Error’ Experiences in Operating Aircraft Controls,” by P.M. Fitts and R.E. Jones, USAF Aero Medical Laboratory, Memorandum Report, July 1947.

Then, 20 years later, a major study on the same topic:

Aircraft Design-Induced Pilot Error, National Transportation Safety Board, Department of Transportation, Washington, D. C., PB # 175 629, July 1967.

And then, another 13 years later, suddenly they realized the same thing about small craft.

Well, there were very simple fixes mandated to this problem early on – the FAA now requires, for example, the landing gear control be shaped like a wheel and the flap control be shaped like a flap, so basic visual and tactile feedback is available to distinguish the two (especially at night, under stress, etc.).

There’s other simple tricks that greatly reduce these accidents, like putting a catch on the landing gear retraction. Suddenly the “human error” goes away (leading to the reasonable question of how broadly we define “human error”…).

So why did this “known” problem persist – damaging planes and killing people I might add – for 33 years?  In fact, it still happens, here’s a lovely writeup from 2015.

Basically, because all of the accidents where this happened continued to be declared “pilot error.” When there’s not any significant further inquiry (which to be fair, bothers people like airlines and the government and aircraft manufacturers and people with money), just saying it’s pilot error gets the problem over with by a minor sacrifice (a pilot) instead of doing any harder work.

At my new job, we’re doing risk analysis and commercial insurance for unmanned autonomous vehicles (drones). It’s interesting to now be working in a space that’s actually closer than tech to all this safety research. And you can see the same things happening.

Sure, actual research shows that technical problems, not really human error, are the cause most of the time – see

News article: More drone crashes caused by technical glitches, not human error, study shows.

Study: Exploring Civil Drone Accidents and Incidents to Help Prevent Potential Air Disasters

It cites technical problems as the cause 64% of the time and frankly doesn’t really distinguish human error from design-induced human error.

But of course you can still just blame human error.  I smelled something questionable in the recent news reports of how the crew was to blame for a UK Watchkeeper drone crash. Oh sure, the drone failed to land correctly so the crew intervened, so it’s their fault.  I suspect if they had not intervened and it had continued to malfunction it would also be “their fault.” Here’s the full Ministry of Defence writeup, which goes to the “loss of situational awareness” synonym for human error. And here’s a later Register article with more details, like “The most appropriate [flight reference card] drill …stated: ‘If UA [unmanned aircraft] not maintaining centreline axis: Engine cut……..Command’.” and that they were under supervision of contractors from the drone manufacturer. Apparently following the actual designated drill recommendation of cutting the engine still makes it your fault.

Of course, “Five drones – almost 10% of a 54-strong fleet bought from French firm Thales – have been wrecked in mid Wales crashes.”  Apparently they have a lot of navigation problems.  So when one goes to land in a populated area and is clearly not navigating properly and goes off the runway and the crew cuts the engine…  The crash is ‘their fault.’  Riiiiiight.

It’s actually a fascinating question – when drone operations are more and more autonomous, how long can we just hold the “crew” responsible for anything that goes wrong? It’s cheaper and less embarrassing, so I’m betting a good while.

For us in tech this opens up an interesting discussion, beyond the obvious statement of “if you do an incident postmortem and simply write it off to developer or operator error you aren’t doing your job”.

We can’t always fix all design issues immediately.  At what point, though, does not prioritizing a better-than-bandaid fix become negligence? “Thirty years,” like with the flap/landing gear thing?

There’s a lot of legislation that tries to protect the powerful from lawsuits etc. – but as autonomy becomes more common, how long will that last for us? Technology firms have managed to get out of being held responsible for endemic security flaws (largely thanks to Microsoft) for decades.

You can see this beginning to crumble in aviation with things like the recent Boeing 737 Max crashes. “It’s human error!” declares the Boeing CEO.  But people aren’t that dumb and the Internet helps information get out that was previously inaccessible.  So the next tack is blame the software, but that’s also buck-passing… The software was the band-aid fix on top of the design issues.

When will our lovely band of insulation finally be whittled away in tech? Soon, I’d bet… “Oh sure let’s crank out some self driving cars, I’m sure it’ll be fine and we can just give the standard ‘what me worry’ face when our crappy design ends up killing some soccer mom that we use when we mess up a software patch nowadays.”

In the end, if you are motivated by actual safety, or uptime, or security instead of the CYA game of who to blame, you have to push beyond the nearest human, or the outermost band-aid on your Rube Goldberg system, to improve the system. You’re going to have to consider how your design and interfaces of your software and the tooling you use to operate it contribute. You’re going to have to use facts and numbers and not soothing opinions to say “you know what? That goes wrong more than our other systems – there’s something wrong with it, we have to dig in and figure out what.”

Filed under DevOps

Community First! Village

DoD Organizer Family Tour

DevOpsDays Austin sponsored this great charity this year with our proceeds, and the program is so cool I wanted to do a whole post on it.

Community First! Village “is a 51-acre master planned community that provides affordable, permanent housing and a supportive community for men and women coming out of chronic homelessness.”  It consists of 200+ micro-homes and RVs and supporting infrastructure, they’re at 78% of capacity already, and they are planning for another 300 homes to be built. They’re located in southeast Austin out near the Travis County Expo Center.

Aerial View of Village

And it’s really nice! The primary residences are little mini-houses, 180-200 square feet in size, with electricity but no plumbing.  There are standalone bathroom buildings with individual lockable rooms. There are kitchen buildings for more extensive cooking. There are RVs, more expensive but better for those with medical problems. There’s a community garden (with chickens and bees), a store, a hairdresser, a garage, a forge, and more.  Heck, there’s a bus stop and an Amazon dropbox.

Here’s a series of pictures I took on our tour.

Austin has around 2200 homeless, and the number continues to rise. My parents visited me in Austin a couple months ago, and we went out and ate and they were shocked by how many were on the street, especially as we drove through the “shelter district” downtown. There are many efforts to help, but this is an approach I hadn’t heard of before, and wanted to share with everyone.

How Does It Work?

Donna Emery, the Director of Development for Mobile Loaves & Fishes, gave us a tour and told us all about it. She’d love any of you to come tour the village as well! Mobile Loaves & Fishes as an organization has been serving the homeless for many years, and this is their deeply considered attempt at making a permanent difference.

The village isn’t a shelter; it’s intended to be permanent. They identify candidates for the village via social workers and the array of people trying to help the increasing homeless population (there’s a database they all use to track homeless clients and try to get them services and such).  The person says they want to get into the Village, and there’s a roughly 12-month runway program to get them ready and in.

There are three rules to living in the village.

  1. Have to pay rent. Micro-homes rent for $275-$375/month, the RVs more like $435. They work to ensure they have their social services and encourage “dignified income” working in the village or otherwise. 96% of the residents pay their rent on time, which is better than your average apartment building!
  2. Have to follow civil law. This isn’t “anything goes”, and safety is paramount. They don’t turn you away if you have an alcohol or substance abuse problem – you’re only going to get over that if you have housing – but crime isn’t allowed. It isn’t a major problem for them; homeless are generally the victims, not the perpetrators, of crimes (other than the criminalization of being homeless, of course). Applicants do have criminal background checks – they don’t disqualify you out of hand for having a record, but they don’t allow sex offenders and they evaluate a past of violent crime carefully.
  3. Have to follow the rules of the community (like a strict HOA) – you have to care for your neighborhood. This isn’t a jungle, it’s a community. The place was very clean and well tended. (Pets are welcome, though! We spoke with a man walking his dog at length on our tour.)

Last year, residents earned $650k in “dignified income” – working in the gardens, crafting, doing maintenance, working in the garage and market…  You can make $900/mo from a job cleaning the community bathrooms, for example. Donna stressed that they don’t rely on handouts – it harms the dignity of the people and you don’t take care of things that are free. When a major tech company donated a bunch of tablets, they set up a monthly tablet rental.  “But those are free, we’re giving them to you, don’t make money off them,” they initially complained. But MLF explained that handouts are an unhealthy dynamic, and this way the renters respect the tablets – and themselves – more. They’ve put a lot of thought and experience into creating a place where communities and lives can grow for people that have had nothing.

Of course, they provide a lot of help, from social services to things like teaching them to use Netspend for money management.

Blue ribbon Austin businesses and organizations have donated a lot of the infrastructure to make this work – Alamo Drafthouse, HEB, Charles Maund, the Topfer family, and many more.

Really A Community

But the thing I found the most striking about this is that it’s really a community, and a part of the larger community around it.

40% of the residents are women. There have been two weddings so far among the residents and two residents passed away with their wishes to be interred in the Village. The average age of homeless coming there is around 50 and they’ve been chronically homeless for around 10 years. This isn’t an attempt at “give them a shower and shave and get them a job and send them back out into the wild,” this is a permanent home where they can belong as long as they want. Donna shared with us that what really makes persistent homelessness is some kind of crisis combined with a collapse of a person’s social relationships – no family, no friends to help. Being sent away from a community doesn’t tend to form better social support, does it?

From their FAQ:

It’s all about relationships. Mobile Loaves & Fishes desires to empower the community around us into a lifestyle of service with the homeless. We achieve this vision through Community First! Village by taking a relational approach for connecting with our homeless brothers and sisters, instead of a transactional approach. When we bring an individual into community with others, we truly begin to make a sustainable impact on their lives.

Mobile Loaves & Fishes believes that the single greatest cause of homelessness is a profound, catastrophic loss of family. That’s why our focus at Community First! Village is to do more than just provide adequate housing. We have developed a community with supportive services and amenities to help address an individual’s relational needs at a fraction of the cost of traditional housing initiatives. We seek to empower our residents to build relationships with others, and to experience healing and restoration as part of engaging with a broader community.

The businesses aren’t just for the residents – you can go there to the garage and pay to get your oil changed.  You can go attend their movie nights (the Alamo donated a projector) that are open to the public like any movie night in any park. They do things like a trail of lights during the holidays. There are plenty of reasons for non-residents to go there; it’s not a “camp.” It’s just a subdivision, really, like any other one you’d drive through in Austin.

Heck, you can go live there. 170 of the occupants are formerly homeless, but there are also many “mission families” living there with them to provide help and more strongly tie them into the social fabric of the Austin community.  Or you can rent spare homes on Airbnb!  They have a hall (“Unity Hall”) that can accommodate up to 300 and there’s a commercial kitchen attached (also staffed by residents) so you can host events there – we started seriously looking at it for smaller tech events. (More pics are in the slideshow above).

How Can You Help?

Let’s get real.  If you’re reading this tech blog you’re probably incredibly well off. Working for a company that’s incredibly well off. We have an embarrassment of riches in the tech scene here in Austin, living next to people with nothing. In DevOps we talk continually about collaboration, sharing, and community – one would think that our appetite for helping the less fortunate would go farther than just making sure you get an underrepresented person on your next tech panel.

You can help with funding.  Their Phase II capital campaign is building more homes and supporting buildings, a clinic, and more. Eventually they want things like dental care (an especially hard problem; it’s relatively expensive but dental problems unheeded turn into medical problems quickly). You can give, you can encourage your company to give. DevOpsDays Austin made spare money from sponsors, so we were able to put $25,000 into sponsoring one of the homes in their next phase.

You can help by volunteering. Persons or groups can email them and get set up to come help!  Get your church or other organization involved. They’ve had over 100 Eagle Scouts do their projects out there.

You can help by participating in your local government.  They had a long battle to be able to start the village and had to locate outside the City of Austin because of the never-ending NIMBY-ism of residents not wanting “those people” anywhere near them. Advocate for compassion and the homeless in your city council and other venues.

You can help even by just going there, using the businesses, interacting with the residents to weave them into the fabric of Austin. Go on a tour to see what they’re doing out there. Bring your kids! We all had a great and deeply moving family outing in our visit to the Village.


Filed under Conferences, DevOps

Incident Management Course Coming!

I know we’ve been quiet on the blog; all four agile admins have been busy – several of us moved to new jobs, everyone has a lot going on.

But we’re still doing stuff!  I just went out to Carpinteria to film a LinkedIn Learning course on Incident Management.  The agile admins have a full DevOps curriculum on LinkedIn Learning (formerly lynda.com); most of the courses are in the “Become a DevOps Engineer” learning path!  You can view them as a LIL member or they can be bought individually nowadays too.

We’ve done the 101 level (DevOps Foundation), the 201 level (CI/CD, CM/Infrastructure as Code, SRE, Monitoring and Observability, Lean and Agile) and now we’re hitting more details – Karthik’s done a bunch of Kubernetes and Cloud Native courses, Peco is doing more monitoring courses, James is doing DevSecOps courses…

And I just went and filmed an Incident Management course.  Incident Response, really; I’m hoping for a subsequent course that focuses on retrospectives (each class is only like an hour long and retros are a huge fun topic so I wanted to give them enough time on their own).

Pictured are my producers Adam and Lori and my live action director Julia (who’s also done some of my other courses!) This was a slides course (my first), but they have a program where they can add in a little live action, and since I’ve done it a bunch and Julia’s great we burned through a bunch of scripts in a short time on camera! Thanks to all of them (and my content manager Brian Anderson, not pictured).

The Course

I’ve been creating IM processes and training and leading organizations in them for a while now. A good incident response program removes friction and lets your smart technical staff focus on one thing, solving the problem, without having to worry about what to do otherwise. When I left AlienVault, the #1 thing people came and said to me in my 2 week notice period was “Hey, that incident management process, that’s really made a huge difference,” which is great to hear.

And it was a good opportunity to refresh on the newer developments in the field.  I first got into modern IM, which I define as “derived from the Incident Command System”, in 2008 after I heard Brent Chapman speak at Velocity on Incident Command for IT: What We Can Learn from the Fire Department.  But (aside from retros) while that concept spread, for 5-6 years there wasn’t really a lot more in terms of new developments. Luckily that’s changed, and there’s been a lot lately. John Allspaw and J. Paul Reed have both done master’s theses with Lund University’s Division of Risk Management and Societal Safety; there’s a new O’Reilly book Incident Management for Operations as well as IM being a hot topic in the Google SRE books, and so on. The REdeploy conference and Thai Wood’s Resilience Roundup weekly email newsletter and the Oncall Nightmares podcast are full of late breaking developments. (These sources and more are listed in the course handout!)

Special thanks to J. Paul for giving me guidance on the course content and giving me permission to use his and Kevina Finn-Braun’s Incident Lifecycle Model in it.

Expect video topics like:

  • Why Do I Need Incident Management?
  • The Incident Command System
  • Scoping the Problem
  • Your Incident Toolchain
  • Incident Toolchain Example
  • Detecting and Reporting Incidents
  • First Response and Escalation
  • Incident Communication With Your Users
  • Communicating Inside Your Organization
  • Best Practices for Diagnosis and Repair
  • Cleaning Up After
  • Continuously Improving
  • Training and Game Days
  • Implementation Challenges

Oh, and I got to use props for the first time (like that fire extinguisher in the lead pic), we threw some in for kicks. Fun!

The Experience

Speaking of that, I just wanted to give the LinkedIn Learning team a shout-out.  Making courses with them is a great experience, class all the way.  They are all super skilled at what they do and super friendly. Going to their campus/studio in Carpinteria, CA is always an exceedingly pleasant experience. Everything’s top notch, sound booths, live action studios… It’s not the average webcam tech course when you’re looking down the barrel of a camera with a director, a producer, and a sound/teleprompter person fussing over the fine details! If you are an expert in something (not just tech) and are interested in doing courses, I’m happy to introduce you to someone there; it’s all top quality.

And they treat their people well there!  As best as I can tell they always have, from when they were Lynda to when they were LinkedIn to now being owned by Microsoft. Lori confided in me, “I was a documentary filmmaker with a non-profit for years and I didn’t know jobs like this existed; I’ve never been treated so well.”

While I was there they were doing their monthly “InDay”, and apparently this is the most anticipated one of the year as it’s game themed. They had inflatable human foosball, arcade games, did up the cafeteria with a Stranger Things theme, even had a D&D training session.

 

And of course Carpinteria is beautiful, right on the beach, extremely temperate. It’s between Ventura and Santa Barbara, just north of LA. If you go out there, my hot tips are the nearby Shoals restaurant (a little down the 101) where you can get a table right on the water, and Chocolats du CaliBressan, a French chocolatier at the far north end of the beach side of Carpinteria. Oh and the booze is super cheap in the supermarket, so we always make some gin and juice and hang out in the Holiday Inn’s hot tub while we’re there…

 


Filed under DevOps

DevOpsDays Austin 2019 Retrospective

As mentioned, DevOpsDays Austin 2019 went off great!  And after the event, we sent out extensive surveys to attendees, sponsors, volunteers, speakers, and even the organizers to learn and improve. (Thanks to everyone who gave their feedback, we appreciate it!)

Last year we also did an extensive retrospective to figure out how we wanted this year to go, and this year’s event was driven by that feedback and our vision to make DoD Austin the place for practitioners to come, learn from each other, and build the local community.

Let me share this year’s retro with you – some of the numbers and sentiments are below with my thoughts. If you want the full details, sure, here you go!

Full DevOpsDays Austin 2019 Retrospective (pdf)

If you’re not familiar with an NPS (Net Promoter Score), it measures sentiment on a scale from -100 to +100.  When you get asked “would you recommend” something on a 0-10 scale, generally they’re taking that number and bucketing it into 0-6 being detractors (counted as negative), 7-8 being neutral, and 9-10 being promoters (counted as positive). The score is the percentage of promoters minus the percentage of detractors. Above 0 is “good”, above 50 is “excellent.”  See more about NPS scores here.
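For the curious, here is a minimal sketch of how that bucketing turns a pile of survey answers into one number. It assumes the usual definition of NPS (percentage of promoters minus percentage of detractors); the function name `nps` and the sample ratings are made up for illustration and are not our actual survey data.

```python
def nps(ratings):
    """Return the Net Promoter Score (-100 to +100) for a list of ratings.

    Ratings of 9-10 count as promoters, 6 and below as detractors,
    and 7-8 as neutral (they only dilute the score).
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Percentage of promoters minus percentage of detractors.
    return round(100 * (promoters - detractors) / len(ratings))


# Hypothetical responses: 6 promoters, 2 neutrals, 2 detractors.
sample_ratings = [10, 9, 9, 8, 7, 6, 10, 9, 3, 10]
print(nps(sample_ratings))  # 40
```

Note that with only a handful of responses, one person changing their answer can swing the score by tens of points, so the small-sample scores below are best read loosely.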

Sorry about the quality of the pics, these are basically ones I snapped myself on my iPhone. But hopefully they show some of what happened at the event!

Attendee Feedback (62 NPS, 50 responses)


Damon Edwards

“Informative, laid back, friendly, humorous event. My favorite conference for a couple of years now.” 84% of attendees said they were likely to return.

The things people liked the most as measured by the freeform comments were the openspaces (9 comments), the speakers/talks, especially their diversity (8 votes), the culture/atmosphere of the event (5 votes), and the community and people (5 votes).

This makes me happy. DevOpsDays isn’t just “a conference,” it really focuses on building community – people meeting each other in a friendly and collaborative environment. The content is nice but it’s not the primary value of the event.


Mandy Whaley

The concerns people raised most were “Nothing/great job” (10 votes), difficulty with travel and parking at the venue, including handicap access (6 votes), the talks (6 votes), and wanting better lunches (4 votes).

Read on for more but we’re probably changing venues next year and will keep access in mind.  Now on the lunches – we used to have fancy lunches and they were a significant time and effort sink, with long lines, lots of time spent, and so on.  We moved to box lunches and now lunch goes fast and easy and leaves everyone more time to interact with each other.  We do not plan to ever change back from that, but we will see if we can get a BBQ place or something to do a nice lunch box.

(There were more likes and dislikes and we are evaluating action on all of them, but dang this post is going to be long already so I’m focusing on the top line items.)

Speaker Feedback (90 NPS, 10 responses)


Pete Cheslock

  • “Everyone was really positive; welcoming, low-pressure environment.”
  • Experience – 50% excellent, 50% very good
  • Organization – 40% extremely, 50% very organized
  • Friendliness – 90% extremely, 10% very friendly

Likes: No tech problems/helpful techs/setup organized (x4), Supportive/welcoming (x3), Engaged audience (x3).  Dislikes: Chromebook support problem, schedule slippage, openspaces competing with Conversations talks.

Great overall, some things for us to tweak!  After several years in the same venue and buying a lot of gear, our crack AV team have the tech end of it pretty much down pat.


Jon Loyens

Organizer Feedback (88 NPS, 8 responses)

  • “Just [wanted] to say how much I enjoy working with the crew and watching it all come together to put on a great event for the community. I get a lot out of doing it each year and see my contribution as an important way to give back.”
  • Time spent – 62.5% just right, 12.5% little long, 12.5% little short, 12.5% way too short
  • 93% likely to return (the one that isn’t pleaded a heavy year at work coming up)

Major likes included working together (x3), inclusion (x2), and the opportunity to give back (x2). Dislikes included some stressing out and looking for problems, and speaker notification happening late. There was good discussion about explaining openspaces more especially for the newer folks.

It’s important to me that our organizers have a good time too – my assigned domain on the organizer team is “Organizers” – besides working the master budget and schedule for folks, I facilitate and try to ensure that this volunteer gig is not onerous, and I’m happy we seem to be there.


Deborah Hawkins

Volunteer Feedback (94 NPS, 17 responses)

  • Experience: 72.7% excellent, 27.78% very good
  • How much time you spend – 83% about right, 11% too much, 6% too little
  • 93% likely to return

We have a lot of volunteers from the community that come to slave away working the event for a free ticket and a couple meals, basically.  It’s very important to all of us that they have a good experience – these are the future organizers, and community members going above and beyond to give back to the community.  Boyd and Daria and the other organizers did a great job both organizing the work and making sure the volunteers had time to participate in the event and have a good experience – even given the storm-nightmare loadout at the end of the event. Thanks to all our great volunteers!

Sponsor Feedback (60 NPS, 10 responses)

  • “A++ highly recommend, etc. Y’all did a bang-up job putting this together, and the community is certainly a testament to your hard work and continuous efforts. I’ve told everyone at HQ that we need to learn from you.”
  • Experience – 70% excellent, 20% very good, 10% good
  • Liked: “Always a great event – excellent sessions, great opportunities to meet with customers and prospects.” Vendor area good. Friendly people and networking.
  • Disliked: Platinum sponsors were upstairs. Water bottles ran out. We want badge scanners. No day before setup. Only 1 minute blurb. Schedule off track. When will courtesy shipping be picked up.

So… Sponsors. For a number of years we kept expanding our sponsor offerings.  Then we realized the event had become too much of a traditional conference and we were spending lots of space, time, and effort on sponsors, when to be honest we don’t really need all that much money to put on the event.  Two years ago after a bunch of sponsor problems and everyone working themselves to the bone to provide professional conference services I did away with sponsor tables altogether. We let them back this year but really wanted to make the event not about that.  We also warn the sponsors up front this isn’t a “churn the leads” event, we want sponsors who are going to send technical people to engage with the community.

Did it work out that way?  Kinda. There’s too much expectation set up about what “conferences are like” and “DevOpsDays are like” and between the person purchasing the sponsorship and the people actually sent on site there’s a lot of room for expectations to drift.


Tristan Slominski

I feel like there are plenty of big conferences for that kind of sponsor engagement.  DevOpsDayses didn’t use to be like that, but as time goes on and they all grow it’s tempting to “improve” by making it more sponsor focused. We love sponsors who engage with the community but we consciously balance their participation in the event.

Funny story… Like I said we only let sponsor tables back on a limited basis this year. But there was a run on them, and we sold out of the ones we needed to fund the event quickly and had a bunch of sponsors still wanting to participate, including ones who had participated for years. So we extended the sponsor room, just to let them participate, because we felt bad about excluding them. We always sell out, so that’s probably a sign that we’re doing fine there.

And we got to sponsor a house for the homeless with the spare money, so that’s spiffy.

Recruiter Feedback (-50 NPS, 2 responses)

This is a new addition that didn’t work out so well. We had imagined a big recruiter speed dating thing. But few recruiters and attendees signed up for it so we pivoted into a recruiter fair.  It was during happy hour, but half the attendees leave before that. We had them by the bar, but the DevOps Trivia during the happy hour was also a big draw.

While all the recruiters rated their experience “good” they had low traffic.

So, sorry that didn’t work out. But I stressed to the organizers that this wasn’t a failure – if we don’t try new things that don’t work out sometimes, we’re not trying hard enough.

We’re one of the great grand-daddy DevOps events. We have years of experience, ample funding, and a big community.  Smaller DoDs, especially ones getting off the ground, often need to hew close to the “standard format” for a safe launch and to pay their bills.  We can afford to experiment, so I strongly urge the team every year to try different things.  It’s OK if we appeal to different sets of the community each year.  It’s OK to not do something again (even if it went well) and it’s OK to try new things as stretch goals. I kinda like putting how we run our event where our DevOps mouth is, so to speak.

This lets us try things out first. We were the first DoD with a multi-content track. We created the new “Conversations” talk format this year. We keep innovating, and sometimes there’s just not a fit given the constraints of venue, time, people, and so on. So this one didn’t go off great, but to me that just means we’re legitimately experimenting hard enough.

Ernest’s Retrospective Thoughts

Overall it went great!  Smooth, excellent execution by everyone involved. I feel like the Austin tech community is stronger for our event existing and that’s what I want out of it.

My main challenge personally this year was with the talks.

We really went into this year with an intent to curate the talks to a pretty specific practitioner format. DoD Austin has a bunch of years behind it so we don’t necessarily need the DevOps “talk circuit” talks to fill slots.  We feel like we can be very specific about the experience we want to curate – no repeat talks from other events (go watch them on the Internet, everyone posts videos!), some preference to local speakers, encourage diversity both in speakers and in content…  But we didn’t execute on that well.  We started using Papercall this year and it makes it easy for people to mass submit to multiple events – a great feature but somewhat antithetical to our needs. We had 200 submissions for 20 slots and had a lot of weeding to do and had to turn away a lot of folks. And while we had good talks, they didn’t fit our proposed theme necessarily.

We also just selected talks late, to where it risked people whose talks were declined not being able to attend because we sold out our attendee cap.

The second challenge was with openspaces.  In general the larger the event, the harder it is to make openspaces work. Once there’s more than 25 people in an openspace the format collapses and it’s just “2-3 people talking to each other and everyone else straining to hear,” basically a super crap panel talk. Putting them in the luxury boxes in the stadium worked really well there, because only so many people can fit into one, so it was a forcing function to keep them small enough to work. So they went well overall.

But some folks didn’t like them. Each year we get some feedback from folks more used to traditional content.  “Maybe we should get the openspace topics submitted before the conference so they’re already on the schedule!” No offense, but over my dead body. That’s not what openspaces are about and openspaces are the heart of DevOpsDays. They are for what the actual attendees want to talk about right then; the entire point is that they’re not programmed content. Early DevOpsDays were a couple talks and then pretty much all openspaces.  My general attitude is “if you don’t want to participate in openspaces, this is not the event for you.” We need to explain openspaces more ahead of time though, to seed ideas and get new people to understand the format.  Our experiment with mini-talks and then linked openspaces worked out great, I went to two of them and got high value out of them.

Next Year

A couple big changes are coming next year.

First of all, we’re probably changing venue.  We’ve enjoyed the stadium a lot, and love the staff there, but we’ve probably done as much as we can with the event in that particular form factor.

We’re considering going entirely to the new 20 minute talk format.  They were well received – if you really have more content than 20 minutes, a linked openspace is probably the best venue to explore it with highly engaged attendees!  And it’ll prevent people just submitting their “same talk” as much. We can also get more speakers in!

Also, we know it’s a bummer that we’ve been capping attendance and sponsors and that people who want to attend get turned away. So far we’ve felt like we have had to, both because of venue capacity but also to keep openspaces good and keep the great atmosphere and community and opportunities for engagement that make our event distinct.

Now that we have enough experience, we think we might be able to go bigger and still keep the small group and one-on-one interaction. We’ve all been to a bunch of conferences and seen other things – 1-1 mentoring table signups, for example, and other formats that facilitate it.  We’re also thinking about adding some “working groups” – opportunities to do something, produce position papers, whatnot, give the experts a really neat thing to do at the event.

And maybe even add on a third day, with all unstructured content. On a Saturday so people could bring their kids and stuff.

I wanted to just blaze big next year; the rest of the team loved the vision but reminded me how much burn-in there is on a new venue – getting A/V figured out, all the rough spots of a year one… So we may iterate into it, with getting a new venue and going slightly larger and trying out new engagement ideas next year, and then the year after saying “Big tent!  All are welcome!  Fly in for this one, no attendee or sponsor caps!” and making it a heroically sized event.

There’s no one right format for DevOpsDays – I encourage other organizers to keep experimenting as well.  Your event doesn’t have to be the same year to year; you can target different goals and audiences and sizes and such each time.

If anyone read this far, feel free to comment with your thoughts below! (Obligatory disclaimer, don’t tell me “well this isn’t right for my DevOpsDays” – that’s fine, none of this is to declare the “right” way to do an event, it’s just what is working for us in our community with our particular goals.)


Filed under Conferences, DevOps