Category Archives: Monitoring

Monitoring and Observability

Ah, observability, the new buzzword of the day. Monitoring vendors aplenty are using the word, to basically mean “better monitoring!” You know, #monitoringlove not #monitoringsucks. Because monitoring doesn’t help with debugging and doesn’t have app instrumentation right?

Well, I have to say “bah” to that.  So here’s the thing.  I’m an electrical engineer by education, and I spent a lot of time working at National Instruments, an engineering test and measurement company.  You may be surprised to know these terms have actual definitions that don’t require Twitter arguments to discover.

Monitoring is an activity you perform. It’s simply observing the state of a system over a period of time.

Why do we monitor? For three reasons, in general.

  • Problem Detection – you know, alerting, or seeing issues on dashboards.
  • Problem Resolution – root cause and troubleshooting.
  • Continuous Improvement – capacity planning, financial planning, trending, performance engineering, reporting.

How do we monitor?  Well, that’s called instrumentation. You can instrument your systems and get CPU and stuff, you can use synthetic probes, you can use JavaScript bugs to get end user monitoring, you can emit metrics from applications, you can introspect services and apps via whatever parts are exposed (from JMX to nginx stats to sysdig traces), you can take network traces… (Some folks are similarly trying to redefine “instrumentation” to just mean application instrumentation, which is lame, and in defiance of the fact that application performance management tools that do app instrumentation have existed for decades.)

You can instrument metrics or events; metrics have certain sampling frequency and resolution…

So what is observability?  This isn’t a new term. It comes from system control theory. You know, the stuff that makes your A/C system and electrical plants and your car work.

Observability is a measure of how well the internal states of a system can be inferred from knowledge of its external outputs.

Observability is a property of a system. You can monitor a system using various instrumentation, but if the system doesn’t externalize its state well enough that you can figure out what’s actually going on in there, then you’re stuck.

So is observability hippy bullcrap?  No, of course not. In a DevOps world, it’s very important that the apps and systems concentrate on making themselves both observable and controllable (I leave it to the reader to research controllability, unless I get agitated enough to post about that too). Do you make yourself “easy to monitor”?

Externalizing custom metrics contributes to observability (you know, like with dropwizard metrics).  So does good logging.  So does proper architecture!  Take a system that sticks all kinds of messages into one message queue rather than using separate queues for separate types – the latter is more observable; you can more readily see how many of what is flowing through.  (It’s more controllable too, as you can shut off one queue or another.)

Making your system observable is therefore important, so that if you monitor it with appropriate instrumentation, you understand the state of the system and can make short or long term plans to change it.

While a monitoring tool can definitely contribute to this via its innovation in instrumentation, analysis, and visualization, in large part observability is a battle won or lost before you start sticking tools on top of the system. It’s very important to take it into account when designing and implementing services. No tool is going to “give you” observability and that’s the usual silver bullet fallacy heard from someone who wants to sell you something.

I’m not saying every vendor is using the term wrongly (in fact I just came across this New Relic post that is very well done), but I have to say I am less than impressed when common engineering terms are so widely misused and misunderstood widely in our industry.

Would you like to know more?  Peco and I are working on a new lynda.com course on monitoring and observability!  There’ll be real engineering, a broad canvas of the different kinds of monitoring instrumentation, tips on implementation and use… We’ve both been using and/or building monitoring tools for decades now so we hope to have some useful info for you.

1 Comment

Filed under DevOps, Monitoring

CNCF and K8s 101’s

I never make New Year’s resolutions, but I want to do something different for 2018!

One thing I’m learning a lot about is Kubernetes and the CNCF ecosystem around it over the past couple of years and often find myself having a hard time keeping up with ecosystem sometimes. There are almost weekly releases on the many projects, and getting started content for all the new tools and technology is hard to find.

So! I plan to do quick 101 blogs on different topics under the Container/Kubernetes/CNCF umbrella. My first blog article will be on Prometheus- The monitoring tool that integrates GREAT with k8s! It’ll be based on my GitHub code here: https://github.com/karthequian/prometheus-demo (shhh sneak peak).

But, I need your help! Give me a list of things you are confused about in the container space, or want more info on, and I’ll be happy to do the legwork on it!

So, give me input here, or on twitter!

1 Comment

Filed under Cloud, DevOps, k8s, Monitoring

Ensuring Performance In Complex Architectures

The Agile Admin’s very own Peco Karayanev (@bproverb) gave this talk at Velocity this year.  Learn you some monitoring theory!

Leave a comment

Filed under Conferences, Monitoring

Monitoring Survey

James Turnbull (@kartar) has this year’s monitoring survey up, so am reposting his call for participants…

TL;DR – Please take the 2015 Monitoring Survey at
https://www.surveymonkey.com/s/monitoringsurvey2015.

Last year I ran a monitoring survey, whose data I also reviewed as a
series of posts on [my] blog
(http://kartar.net/2014/11/monitoring-survey—background/). I was
interested in running the survey because I think we’re seeing the
beginnings of a significant change in the maturity of the monitoring
landscape and I’d like to track that change.

I’ve decided to make the survey a yearly event and am coinciding the
launch of this year’s survey with Monitorama in Portland.

The survey takes about 5 minutes to fill out and the results will again
be presented on this blog, in some conference talks and made available
as Creative Commons licensed data. The survey is totally anonymous and
the data won’t be used for any commercial purposes.

You can find the survey here –
https://www.surveymonkey.com/s/monitoringsurvey2015.

In related news, if you can’t be at Monitorama try to watch along at http://monitorama.com/#watch!

Leave a comment

Filed under Monitoring

An article I wrote for InfoWorld’s New Tech Forum on all the various monitoring techniques: Know your options for infrastructure monitoring

Leave a comment

by | July 3, 2014 · 2:43 pm

Monitoring and the State of DevOps

If you haven’t read the new  2014 State of DevOps Report from Puppet Labs and other luminaries, check it out now!

I also pulled out some of their findings on monitoring to inspire a post for the Copperegg blog, Monitoring and the State of DevOps, which I thought I’d mention here too.

Leave a comment

Filed under DevOps, Monitoring