Tag Archives: IaaS

The Cloud Procurement Pecking Order

I was planning to go to this meeting here in town about “Preparing for the post-IaaS phase of cloud adoption” and it brought home to me how backwards many organizations are when they start thinking about cloud options. So now you get Ernest’s Cloud Procurement Pecking Order.

What many people are doing is moving in order of comfort, basically, as they migrate from old-school on-prem into the cloud. "I'll start with private cloud… Then maybe public IaaS… Eventually we'll look at that other whizbang stuff." But here's what your decision path should be instead. It's the logical extension of the basic buy-vs-build strategy decision you're used to making.

Cloud Procurement Flowchart

Look at the functionality you are trying to fulfill. Now ask, in order (a toy sketch of this decision order as code follows the list):

  1. Is it available as a SaaS solution?  If so, use that. You shouldn’t need to host servers or write code for many of your needs – everything from email to ERP is commoditized nowadays. This is the modern equivalent of “buy, don’t build.” You don’t get 100% control over the functionality if you buy it, but unless the function is super core to your business you should simply get over that.
  2. [Optional] Does it fit the functional profile to do it serverless? Serverless is basically “second gen PaaS with less fiddly IaaS in it” so this would be your second step. Amazon has Lambda and Azure and Google have shipped competitors already. Right this moment serverless tech is still pretty bleeding edge, so you’d be forgiven for skipping this step if you don’t have pretty high caliber techies on staff.
  3. Can I do it in a public PaaS?  Then use a public PaaS (Heroku/Beanstalk/Google App Engine/Azure), unless you have some real (not FUD) requirements to do it in house.
  4. Can I do it in a private PaaS? Then use Cloudfoundry or similar. Or do you really (for non-FUD reasons) need access to the hardware?
  5. Can I do it in public IaaS?  Then use Amazon, or Azure. Or do you really (for non-FUD reasons) need it "on premise" (probably not really on premise, but in some datacenter you're leasing – which is different from outsourcing to the cloud how, exactly)?  Even hardcore rendering work is done in the cloud nowadays (you can get GPU-driven instances, SSDs, etc.)
  6. Can I do it in a private cloud? Then use VMware Cloud or OpenStack. This is your final recourse before doing it the old-fashioned way – unless you have extremely unique hardware requirements, you probably can. Also, you can do hybrid cloud – basically private cloud plus public cloud (IaaS only, really). This gets you some of the IaaS benefits while complicating your architecture.
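
To make the ordering concrete, here's a toy sketch of the pecking order as code. It's purely illustrative: the Need flags are names I'm making up for the example, and in real life each answer is a judgment call, not a boolean.

    # Toy sketch of the Cloud Procurement Pecking Order. The flags are
    # hypothetical; each "answer" is really a judgment call about your
    # functionality, your team, and your real (non-FUD) requirements.
    from dataclasses import dataclass

    @dataclass
    class Need:
        has_saas_offering: bool = False
        fits_serverless: bool = False
        has_strong_techies: bool = False
        fits_paas: bool = False
        real_on_prem_requirement: bool = False   # real, not FUD
        truly_unique_hardware: bool = False

    def procurement_choice(need: Need) -> str:
        # Walk the pecking order top to bottom; take the first option that fits.
        if need.has_saas_offering:
            return "SaaS (buy, don't build)"
        if need.fits_serverless and need.has_strong_techies:
            return "Serverless"
        if need.fits_paas and not need.real_on_prem_requirement:
            return "Public PaaS"
        if need.fits_paas:
            return "Private PaaS"
        if not need.real_on_prem_requirement:
            return "Public IaaS"
        if not need.truly_unique_hardware:
            return "Private cloud (maybe hybrid)"
        return "Old-fashioned on-prem"  # the reluctant last resort

    print(procurement_choice(Need(fits_paas=True)))  # -> Public PaaS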

What About Compliance?

Very few compliance requirements exist that cannot be satisfied in the cloud.  There are large financials operating in the cloud, people with SOX and PCI and FISMA and NIST and ISO compliance needs… If your reason for running on prem is “but compliance” there’s a 90% chance you are just plain wrong, and coasting on decade-old received wisdom instead of being well informed about the modern state of cloud technology and security and compliance. I’ve personally helped pure-cloud solutions hit ISO and TUV and various other compliance goals.

What About The Cost?

This ordering seems to be inverted from how people are inching into the cloud. But the lower on this list you go, the less additional value you are getting from the solution (assuming the same price point). You should instead be reluctantly dragged down into the lower levels of this list – they require more effort and often (though not always) more expense. A higher-level option has to be a lot more expensive before the additional complexity and lag of doing more of the work yourself is justified.

"But what about the cost," you say, "doesn't the cloud get more expensive than me running a couple of servers?" It's easy to be penny wise and pound foolish when making cloud cost decisions.

You need to keep in mind the real costs of your infrastructure when you do this – I see a lot of people pouring effort into private cloud that they really shouldn't be. If you simply compare "buying servers" with "cost per month in Amazon," a naive analysis can make it seem like you need to go hybrid or on-prem once a couple hundred thousand dollars appear on your bill. But (there's a back-of-the-envelope sketch of this math after the list):

1. Make sure you are taking into account the fully loaded cost (data center space, power, cooling, etc.) of all the assets (servers, storage, network…) you are using to do this privately. Use the real numbers, not the "funny money" numbers. At a previous company, network and other shared costs were allocated across the entire company, while "our IT budget" only had to pay for servers – so servers became the only number in the comparison, because only our own department's costs were considered. Don't be a goon (technical term for a local optimizer); consider what it's costing your entire company. Storage especially is way cheaper in the cloud than on enterprise SANs.

2. Make sure you are taking into account the cost of the manpower to run it. That's not just the techies' salaries (fully loaded with benefits/bonuses); it's also the proportion of each layer of management going up the chain that has to deal with their concerns (even if the director only spends 30% of his time messing with the data center team, the VP 10%, the CTO 5%, and the CEO 1%, that's a lot of freaking money you need to account for). And it's the opportunity cost of having people (smart technical people) doing your plumbing instead of doing things to move your company forward. I would argue that instead of putting the employee's salary into this calculation, you'd do better to put in your revenue per employee! Why? Because for that same money you could have someone improving the product, making sales, etc. and bringing in additional revenue. If all you are looking at is "cost reduction," you are probably divorced enough from the business goals of your organization that you are not making good decisions. This isn't to say you don't need any of that manpower, but ideally, with more of the plumbing outsourced, you can turn their technical skills to something more productive.

3. Make sure you are taking into account the additional lag time and the cost of that time-to-market delay from DIYing. Some people frame this as only mattering for innovation – "well, if you're a small, quick-moving, innovative firm or startup, then this velocity matters to you; if you're a larger enterprise with yearly budget cycles, not so much." That's not true. Assuming you are implementing all this stuff with some end goal in mind, you are burning value along with time the longer it takes you to deliver – we like to call that cost of delay. Heck, just the plain cost of money over that period is significant – I've seen companies go through quite a set of gyrations to be able to bill 30 days earlier and get that benefit; if you can deliver projects a month earlier by leveraging reusable work (which is all that SaaS/PaaS/IaaS solutions are), then you accelerate your cash flow. If you have to wait 12 months for the IT group to get a private cloud working, you are effectively losing the benefit of your deliverable * 12 months. "We saved $10k/year on hosting costs!" "Great, can we deliver our product that will make us $10k/month now, or do we get to continue to put ourselves out of business with cost cutting?"

4. Account for complexity.  The problem with “hybrid cloud,” in most implementations, is that it’s not seamless from on prem to public, and therefore your app architecture has to be doubly complicated.  In a previous position where I ran a large SaaS service, we were spread across AWS (virtual everything) and Rackspace (vserver, F5 LBs, etc.) and it was a total nightmare – we were trying to migrate all the way out to the cloud just so we could delete half of the cruft in all our code that touched the infrastructure – complexity that caused production issues (frequently) and slowed our rate of delivering new functionality. The KISS principle is wrathful when ignored.
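
Here's the back-of-the-envelope math mentioned above as an actual sketch. Every number in it is invented for illustration – plug in your own fully loaded asset costs, salaries, management percentages, and revenue per employee.

    # Back-of-the-envelope "real cost" sketch for a DIY/private option.
    # All figures are invented for illustration; substitute your own.

    assets_fully_loaded = 200_000    # servers, storage, network, DC space, power/cooling per year
    plumbing_techies = 2
    revenue_per_employee = 250_000   # opportunity-cost proxy for each techie doing plumbing
    techie_cost = plumbing_techies * revenue_per_employee

    # Management attention trickling up the org chart (salaries are illustrative).
    mgmt_overhead = (0.30 * 180_000   # director: 30% of their time on the data center team
                   + 0.10 * 250_000   # VP: 10%
                   + 0.05 * 350_000   # CTO: 5%
                   + 0.01 * 500_000)  # CEO: 1%

    # Cost of delay: a deliverable worth $10k/month shipped 12 months late.
    cost_of_delay = 10_000 * 12

    diy_real_cost = assets_fully_loaded + techie_cost + mgmt_overhead + cost_of_delay
    print(f"DIY 'real' annual cost: ${diy_real_cost:,.0f}")
    # Compare *that* against the cloud bill, not just "buying servers" vs. "$/month on Amazon".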

I'm not saying hybrid cloud, private cloud, etc. are never the answer – but I would say that on average they are usually not the right answer, and if you are using them as your default approach then it's better than even money you're being inefficient. Furthermore, using SaaS and PaaS requires less expertise (and thus money) than IaaS, which requires less than private cloud – yet people justify "starting with private" because you are "leveraging skill sets" or whatever, and then six months later you have a whole team still trying to bake off OpenStack vs. Eucalyptus when you could have had your app (you know, the thing you actually need to fulfill a business goal) already running in a public PaaS. I'm not sure why I need to say out loud "delivering the most value with the least amount of effort, time, and expenditure is good" – but apparently I do. Just because you *can* do something does not mean you *should* do it. You need to carefully shepherd your time to delivery and your costs, and not just let things float in a morass of IT because "these things take time…"


What Is Cloud Computing?

My recent post on how sick I am of people being confused by the basic concept of cloud computing quickly brought out the comments on "what cloud is" and "what cloud is not." And the truth is, it's a little messy; there's not one clear definition, especially across "the three aaSes." So now let's have a post for the advanced students. Chip in with your thoughts!

Here’s my Grand Unified Theory of Cloud Computing. Rather than being a legalistic definition that will always be wrong for some instances of cloud, it attempts to convey the history and related concepts that inform the cloud.

The Grand Unified Theory of Cloud Computing

( ISP -> colo -> MSP ) + virtualization + HPC + (AJAX + SOAP -> REST APIs) = IaaS
( web site -> web app -> ASP ) + virtualization + fast ubiquitous Internet + [ RIA browsers & mobile ] = SaaS
( IDEs & 4GLs ) + ( EAI -> SOA ) +  SaaS + IaaS = PaaS
[ IaaS | PaaS | SaaS ] + [ devops | open source | noSQL ] = cloud

* Note: I don't agree with all those Wikipedia definitions; they are only linked to clue in people unsure about a given term.

Sure, that's where the cloud comes from, but "what is the cloud?" Well, here are my thoughts, the Seven Pillars of Cloud Computing. Having more of these makes something "more cloudy" and having fewer makes something "less cloudy." Arguments over whether some specific offering "is cloud" or not, however, are for people without sufficiently challenging jobs.

The Seven Pillars of Cloud Computing

“The Cloud” may be characterized as:

  • An outsourced managed service
  • providing hosted computing or functionality
  • delivered over the Internet
  • offering extreme scalability
  • by using dynamically provisioned, multitenant, virtualized systems, storage, and applications
  • controlled via REST APIs
  • and billed in a utility manner.

You can remove one or more of these pillars to form most of the things people sell you as “private cloud,” for example, losing specific cloud benefits in exchange for other concerns.

Now there's also the new vs. old argument. There are the technohipsters who say "Cloud is nothing new, I was doing that back in the '90s." And some of that is true, but only in the most uninteresting way. The old and the new have, via alchemy, begun to help users realize benefits beyond what they did before.

Benefits of Cloud – What and How

Not New:

  • Virtualization
  • Outsourcing
  • Integration
  • Intertubes

Pretty New:

  • Multitenancy
  • Massively scalable
  • Elastic self provisioning
  • Pay as you go

Resulting Benefits:

  • agility
  • economy of scale
  • low initial investment
  • scalable cost
  • resilience
  • improved service delivery
  • universal access

Okay, Clouderati – what do you think?


Amazon CloudFormation: Model Driven Automation For The Cloud

You may have heard about Amazon’s newest offering they announced today, CloudFormation.  It’s the new hotness, but I see a lot of confusion in the Twitterverse about what it is and how it fits into the landscape of IaaS/PaaS/Elastic Beanstalk/etc. Read what Werner Vogels says about CloudFormation and its uses first, but then come back here!

Allow me to break it down for you and explain why this is such a huge leverage point for cloud developers.

What Has Come Before

Up till now on Amazon you could configure a single virtual image the way you wanted it, with an AMI. You could even kind of construct a scalable tier of similar systems using Auto Scaling, by defining Launch Configurations. But if you wanted to construct an entire multitier system it was a lot harder.  There are automated configuration management tools like Chef and Puppet out there, but their recipes/models tend to be oriented around getting a software loadout onto an existing system, not the actual system provisioning – in general they come from the older assumption that you have someone doing that part on probably-physical systems using bcfg2 or Cobbler or Vagrant or something.

So what were you to do if you wanted to bring up a simple three tier system, with a Web tier, app server tier, and database tier?  Either you had to set them up and start them manually, or you had to write code against the Amazon APIs to explicitly pull up what you wanted. Or you had to use a third party provisioning provider like RightScale or EngineYard that would let you define that kind of model in their Web consoles but not construct your own model programmatically and upload it. (I’d like my product functionality in my own source control and not your GUI, thanks.)
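
For flavor, here's roughly what "write code against the Amazon APIs to explicitly pull up what you want" looks like. This is my own illustrative sketch in Python with the boto3 SDK (my tooling choice for the example); the AMI IDs and security group names are placeholders.

    # Hand-rolled three tier provisioning against the raw EC2 API.
    # Illustrative only: AMI IDs and security group names are placeholders.
    import boto3

    ec2 = boto3.client("ec2")

    def launch_tier(ami, instance_type, count, group):
        resp = ec2.run_instances(
            ImageId=ami,
            InstanceType=instance_type,
            MinCount=count,
            MaxCount=count,
            SecurityGroups=[group],
        )
        return [i["InstanceId"] for i in resp["Instances"]]

    web = launch_tier("ami-11111111", "m1.small", 2, "web")
    app = launch_tier("ami-22222222", "m1.large", 2, "app")
    db = launch_tier("ami-33333333", "m1.large", 1, "db")
    # ...and now you get to wire the tiers together, poll until they're up,
    # remember what you launched, and do it all again (identically, you hope)
    # for the next environment.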

Now, recently Amazon launched Elastic Beanstalk, which is way over on the PaaS side of things, similar to Google App Engine.  "Just upload your application and we'll run it and scale it, you don't have to worry about the plumbing." Of course this sharply limits what you can do, and doesn't address the question of "what if my overall system consists of more than just one Java app running in Beanstalk?"

If your goal is full model driven automation to achieve “infrastructure as code,” none of these solutions are entirely satisfactory. I understand CloudFormation deeply because we went down that same path and developed our own system model ourselves as a response!

I’ll also note that this is very similar to what Microsoft Azure does.  Azure is a hybrid IaaS/PaaS solution – their marketing tries to say it’s more like Beanstalk or Google App Engine, but in reality it’s more like CloudFormation – you have an XML file that describes the different roles (tiers) in the system, defines what software should go on each, and lets you control the entire system as a unit.

So What Is CloudFormation?

Basically CloudFormation lets you model your Amazon cloud-based system in JSON and then provision and control it as a unit.  So in our use case of a three tier system, you would model it up in their JSON markup and then CloudFormation would understand that the whole thing is a unit.  See their sample template for a WordPress setup. (A mess more sample templates are here.)

Review the WordPress template; it lets you define the AMIs and instance types, what the security group and ELB setups should be, the RDS database back end, and feed in variables that’ll be used in the consuming software (like WordPress username/password in this case).
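
To give a feel for the markup without reproducing Amazon's sample, here's a heavily stripped-down template of that general shape – a minimal made-up example of mine, written as a Python dict that json.dump turns into the JSON CloudFormation consumes. The AMI, instance sizes, and names are placeholders.

    # Minimal, made-up CloudFormation-style template: one parameter, a security
    # group, a web instance, and an RDS database. Not the real WordPress sample.
    import json

    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "DBPassword": {"Type": "String", "NoEcho": "true"},
        },
        "Resources": {
            "WebSecurityGroup": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": "Allow HTTP in",
                    "SecurityGroupIngress": [
                        {"IpProtocol": "tcp", "FromPort": "80",
                         "ToPort": "80", "CidrIp": "0.0.0.0/0"}
                    ],
                },
            },
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": "ami-11111111",   # placeholder AMI
                    "InstanceType": "m1.small",
                    "SecurityGroups": [{"Ref": "WebSecurityGroup"}],
                },
            },
            "Database": {
                "Type": "AWS::RDS::DBInstance",
                "Properties": {
                    "Engine": "MySQL",
                    "DBInstanceClass": "db.m1.small",
                    "AllocatedStorage": "5",
                    "MasterUsername": "admin",
                    "MasterUserPassword": {"Ref": "DBPassword"},
                },
            },
        },
    }

    with open("template.json", "w") as f:
        json.dump(template, f, indent=2)   # this JSON is what you hand to CloudFormation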

Once you have your template you can tell Amazon to start your "stack" in the console! It'll even let you hook it up to an SNS notification that'll let you know when it's done. You name the whole stack, so you can distinguish between your "dev" environment and your "prod" environment, for example, as opposed to the current state of the Amazon EC2 console, where you get to see a big list of instance IDs – they added tagging that you can try to use for this, but it's kinda wonky.
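
The same "start my stack" step can be driven from code as easily as from the console. A sketch with boto3 again (my tooling choice; the template file, SNS topic ARN, and parameter values are placeholders):

    # Launch the stack as a named unit, with SNS progress notifications.
    # "template.json" stands in for a template like the sketch above.
    import boto3

    cfn = boto3.client("cloudformation")

    cfn.create_stack(
        StackName="myapp-dev",   # name the whole stack: "myapp-dev" vs. "myapp-prod"
        TemplateBody=open("template.json").read(),
        Parameters=[{"ParameterKey": "DBPassword", "ParameterValue": "changeme"}],
        NotificationARNs=["arn:aws:sns:us-east-1:123456789012:stack-events"],
    )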

Why Do I Want This Again?

Because a system model lets you do a number of clever automation things.

Standard Definition

If you've been doing Amazon yourself, you're used to there being a lot of stuff you have to do manually. Even you do it differently from one system build to the next, and God forbid you have multiple techies working on the same Amazon system. The basic value proposition of "don't do things manually" is huge. You configure the security groups ONCE and put them into the template, and then you're not going to forget to open port 23 AGAIN next time you start a system. A core part of the DevOps value proposition is treating system configuration as code that you can source control, fix bugs in, and have them stay fixed, etc.

And if you’ve been trying to automate your infrastructure with tools like Chef, Puppet, and ControlTier, you may have been frustrated in that they address single systems well, but do not really model “systems of systems” worth a damn.  Via new cloud support in knife and stuff you can execute raw “start me a cloud server” commands but all that nice recipe stuff stops at the box level and doesn’t really extend up to provisioning and tracking parts of your system.

With the CloudFormation template, you have an actual asset that defines your overall system.  This definition:

  • Can be controlled in source control
  • Can be reviewed by others
  • Is authoritative, not documentation that could differ from the reality
  • Can be automatically parsed/generated by your own tools (this is huge)
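
That last bullet deserves a quick example. Because the template is just JSON, your own tooling can generate or rewrite the model instead of a human clicking around – a toy sketch of mine (placeholder AMI and resource shape):

    # Toy template generator: stamp out N identically configured web instances.
    # The point is that the model is just data your tools can build, diff, and
    # keep in source control alongside the code that produced it.
    import json

    def web_tier(count, ami="ami-11111111", instance_type="m1.small"):
        resources = {}
        for i in range(count):
            resources[f"Web{i}"] = {
                "Type": "AWS::EC2::Instance",
                "Properties": {"ImageId": ami, "InstanceType": instance_type},
            }
        return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}

    with open("web-tier.json", "w") as f:
        json.dump(web_tier(3), f, indent=2)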

It’s also nicely transparent; when you go to the console and look at the stack it shows you the history of events, the template used to start it, the startup parameters it used… Moving away from the “mystery meat” style of system config.

Coordinated Control

With CloudFormation, you can start and stop an entire environment with one operation. You can say "this is the dev environment" and control it as a unit. I assume at some point you'll be able to visualize it as a unit too; right now all the bits are still stashed in their own tabs (and I notice they don't make any default use of their own tagging, which makes it annoying to pick out which parts belong to that stack).

This is handy for not missing stuff on startup and teardown… A couple weeks ago I spent an hour deleting a couple hundred rogue EBSes we had left over after a load test.

And you get some status eventing – one of the most painful parts of trying to automate against Amazon is the whole "I started an instance, I guess I'll sit around and poll and try to figure out when the damn thing has come up right." In CloudFormation you get events that tell you when each part, and then the whole, is up and ready for use.
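
In code, that turns the poll-and-pray loop into something like this – boto3 once more, purely as an illustration, with a placeholder stack name:

    # Wait on the stack as a unit and read its event history, instead of
    # polling each instance yourself.
    import boto3

    cfn = boto3.client("cloudformation")

    cfn.get_waiter("stack_create_complete").wait(StackName="myapp-dev")

    for event in cfn.describe_stack_events(StackName="myapp-dev")["StackEvents"]:
        print(event["Timestamp"], event["LogicalResourceId"], event["ResourceStatus"])

    # And teardown is one call, not an hour of hunting down rogue EBS volumes:
    # cfn.delete_stack(StackName="myapp-dev")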

What It Doesn’t Do

It's not a config management tool like Chef or Puppet. Except for what you bake onto your AMI, it has zero software config capabilities, and no command dispatch capabilities like Rundeck or mcollective or Fabric – though it should be a good integration point with those tools.

It’s not a PaaS solution like Beanstalk or GAE; you use those when you just have an app you want to deploy to something that’ll run it.  Now, it does erode some use cases – it makes a middle point between “run it all yourself and love the complexity” and “forget configurable system bits, just use PaaS.”  It allows easy reusability, say having a systems guy develop the template and then a dev use it over and over again to host their app, but with more customization than the pure-play PaaSes provide.

It's not quite like OVF, which is fiddlier and more about virtually defining the guts of a single machine than about defining a set of systems with roles and connections.

Competitive Analysis

It’s very similar to Microsoft Azure’s approach with their .cscfg and .csdef files which are an analogous XML model – you really could fairly call this feature “Amazon implements Azure on Amazon” (just as you could fairly call Elastic Beanstalk “Amazon implements Google App Engine on Amazon”.) In fact, the Azure Fabric has a lot more functionality than the primitive Amazon events in this first release. Of course, CloudFormation doesn’t just work on Windows, so that’s a pretty good width vs depth tradeoff.

And it’s similar to something like a RightScale, and ideally will encourage them to let customers actually submit their own definition instead of the current clunky combo of ServerArrays and ServerTemplates (curl or Web console?  Really? Why not a model like this?). RightScale must be in a tizzy right now, though really just integrating with this model should be easy enough.

Where To From Here?

As I alluded to, we actually wrote our own tool like this internally, called PIE, that we're looking to open source, because we were feeling this whole problem space keenly. XML model of the whole system, Apache ZooKeeper-based registry, kinda like CloudFormation and Azure. Does CloudFormation obsolete what we were doing? No – we built it because we wanted a model that could describe cloud systems on multiple clouds and even on-premise systems. The Amazon model will only help you define Amazon bits; if you are running cross-cloud or hybrid it is of limited value. And I'm sure model visualization tools will come, and a better registry/eventing system will come, but we're way farther down that path, at least at the moment.

Also, the differentiation between "provisioning tools" that control and start systems, like CloudFormation and bcfg2, and "configuration" tools that control and start software, like Chef and Puppet (and some people even differentiate between those and "deploy" tools that control and start applications, like Capistrano), is a false dichotomy. I'm all about the "toolchain" approach, but at some point you need a toolbelt. This tool differentiation is one of the more harmful "Dev vs. Ops" differentiations.

I hope that this move shows the value of system modeling and helps people understand we need an overarching model that can be used to define it all, not just “Amazon” vs “Azure” or “system packages” vs “developed applications” or “UNIX vs Windows…” True system automation will come from a UNIVERSAL model that can be used to reason about and program to your on premise systems, your Amazon systems, your Azure systems, your software, your apps, your data, your images and files…

Conclusion

You need to understand CloudFormation, because it is one of the most foundational, high-leverage changes AWS has come out with in some time. I don't bother to blog about most of the cool new AWS features – they're cool and I enjoy them – but this is part of a more revolutionary change in the way systems are managed: the whole DevOps thing.


Microsoft Azure for Dummies – or for Smarties?

What Is Microsoft Azure?

I’m going to attempt to explain Microsoft Azure in “normal Web person” language.  Like many of you, I am more familiar with Linux/open source type solutions, and like many of you, my first forays into cloud computing have been with Amazon Web Services.  It can often be hard for people not steeped in Redmondese to understand exactly what the heck they’re talking about when Microsoft people try to explain their offerings.  (I remember a time some years ago I was trying to get a guy to explain some new Microsoft data access thing with the usual three letter acronym name.  I asked, “Is it a library?  A language?  A protocol?  A daemon?  Branding?  What exactly is this thing you’re trying to get me to uptake?”  The reply was invariably “It’s an innovative new way to access data!”  Sigh.  I never did get an answer and concluded “Never mind.”)

Microsoft has released their new cloud offering, Azure. Our company is a close Microsoft partner, since we use a lot of their technologies in developing our desktop software products, so as "cloud guy" I've gotten some in-depth briefings and even went to PDC this year to learn more (some of my friends who have known me over the course of my 15 years of UNIX administration were horrified). "Cloud computing" is an overloaded enough term that it's not highly descriptive, and it took a while to cut through the explanations to understand what Azure really is. Let me break it down for you and explain the deal.

Point of Comparison: Amazon (IaaS)

In Amazon EC2, as hopefully everyone knows by now, you are basically given entire dynamically-provisioned, hourly-billed virtual machines that you load OSes on and install software and all that.  “Like servers, but somewhere out in the ether.”  Those kinds of cloud offerings (e.g. Amazon, Rackspace, most of them really) are called Infrastructure As A Service (IaaS).  You’re responsible for everything you normally would be, except for the data center work.  Azure is not an IaaS offering but still bears a lot of similarities to Amazon; I’ll get into details later.

Point of Comparison: Google App Engine (PaaS)

Take Google’s App Engine as another point of comparison.  There, you just upload your Python or Java application to their portal and “it runs on the Web.”  You don’t have access to the server or OS or disk or anything.  And it “magically” scales for you.  This approach is called Platform as a Service (PaaS).   They provide the full platform stack, you only provide the end application.  On the one hand, you don’t have to mess with OS level stuff – if you are just a Java programmer, you don’t have to know a single UNIX (or Windows) command to transition your app from “But it works in Eclipse!” to running on a Web server on the Internet.  On the other hand, that comes with a lot of limitations that the PaaS providers have to establish to make everything play together nicely.  One of our early App Engine experiences was sad – one of our developers wrote a Java app that used a free XML library to parse some XML.  Well, that library had functionality in it (that we weren’t using) that could write XML to disk.  You can’t write to disk in App Engine, so its response was to disallow the entire library.  The app didn’t work and had to be heavily rewritten.  So it’s pretty good for code that you are writing EVERY SINGLE LINE OF YOURSELF.  Azure isn’t quite as restrictive as App Engine, but it has some of that flavor.

Azure’s Model

Windows Azure falls between the two.  First of all, Azure is a real “hosted cloud” like Amazon Web Services, like most of us really think about when we think cloud computing; it’s not one of these on premise things that companies are branding as “cloud” just for kicks. That’s important to say because it seems like nowadays the larger the company, the more they are deliberately diluting the term “cloud” to stick their products under its aegis.  Microsoft isn’t doing that, this is a “cloud offering” in the classical (where classical means 2008, I guess) sense.

However, in a number of important ways it’s not like Amazon.  I’d definitely classify it as a PaaS offering.  You upload your code to “Roles” which are basically containers that run your application in a Windows 2008(ish) environment.  (There are two types – a “Web role” has a stripped down IIS provided on it, a “Worker role” doesn’t – the only real difference between the two.)  You do not have raw OS access, and cannot do things like write to the registry.  But, it is less restrictive than App Engine.  You can bundle up other stuff to run in Azure – even run Java apps using Apache Tomcat.  You have to be able to install whatever you want to run “xcopy only” – in other words, no fancy installers, it needs to be something you could just copy the files to a Windows PC, without administrative privilege, and run a command from the command line and have it work.  Luckily, Tomcat/Java fits that description. They have helper packs to facilitate doing this with Tomcat, memcached, and Apache/PHP/MediaWiki.  At PDC they demoed Domino’s Pizza running their Java order app on it and a WordPress blog running on it.  So it’s not only for .NET programmers.  Managed code is easier to deploy, but you can deploy and run about anything that fits the “copy and run command line” model.

I find this approach a little ironic actually.  It’s been a lot easier for us to get the Java and open source (well, the ones with Windows ports) parts of our infrastructure running on Azure than Windows parts!  Everybody provides Windows stuff with an installer, of course, and you can’t run installers on Azure.  Anyway, in its core computing model it’s like Google App Engine – it’s more flexible than that (good) but it doesn’t do automatic scaling (bad).  If it did autoscaling I’d be willing to say “It’s better than App Engine in every way.”

In other ways, it’s a lot like Amazon.  They offer a variety of storage options – blobs (like S3), tables (like SimpleDB), queues (like SQS), drives (like EBS), SQL Azure (like RDS).  They have an integral CDN.  They do hourly billing.  Pricing is pretty similar to Amazon – it’s hard to totally equate apples to apples, but Azure compute is $0.12/hr and an Amazon small Windows image compute is $0.12/hr (Coincidence?  I think not.).  And you have to figure out scaling and provisioning yourself on Amazon too – or pay a lot of scratch to one of the provisioning companies like RightScale.

What’s Unique and Different

Well, the largest thing that I’ve already mentioned is the PaaS approach.  If you need OS level access, you’re out of luck;  if you don’t want to have to mess with OS management, you’re in luck!  So to the first order of magnitude, you can think of Azure as “like Amazon Web Services, but the compute uses more of a Google App Engine model.”

But wait, there’s more!

One of the biggest things that Azure brings to the table is that, using Visual Studio, you can run a local Azure “fabric” on your PC, which means you can develop, test, and run cloud apps locally without having to upload to the cloud and incur usage charges.  This is HUGE.  One of the biggest pains about programming for Amazon, for instance, is that if you want to exercise any of their APIs, you have to do it “up there.”  Also, you can’t move images back and forth between Amazon and on premise.  Now, there are efforts like EUCALYPTUS that try to overcome some of this problem but in the end you pretty much just have to throw in the towel and do all dev and test up in the cloud.  Amazon and Eclipse (and maybe Xen) – get together and make it happen!!!!

Here’s something else interesting.  In a move that seems more like a decision from a typical cranky cult-of-personality open source project, they have decided that proper Web apps need to be asynchronous and message-driven, and by God that’s what you’re going to do.  Their load balancers won’t do sticky sessions (only round robin) and time out all connections between all tiers after 60 seconds without exception.  If you need more than that, tough – rewrite your app to use a multi-tier message queue/event listener model.  Now on the one hand, it’s hard for me to disagree with that – I’ve been sweating our developers, telling them that’s the correct best-practice model for scalability on the Web.  But again you’re faced with the “Well what if I’m using some preexisting software and that’s not how it’s architected?” problem.  This is the typical PaaS pattern of “it’s great, if you’re writing every line of code yourself.”
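
If "rewrite your app to use a multi-tier message queue/event listener model" sounds abstract, here's a deliberately generic sketch of the pattern. Python's in-process queue stands in for Azure Queues or the service bus, and process_order is a made-up job: the front end enqueues a message and returns immediately, and a worker picks it up on its own schedule, so no tier holds a connection open long enough to trip a 60 second timeout.

    # Generic async worker pattern; queue.Queue stands in for Azure Queues or
    # the service bus, and process_order is a made-up example job.
    import queue
    import threading
    import time

    work = queue.Queue()

    def front_end(order_id):
        # The web tier just enqueues and returns; no long-lived connection to the back end.
        work.put({"order": order_id, "submitted": time.time()})
        return "202 Accepted"

    def process_order(msg):
        time.sleep(2)   # pretend this is the slow, real work
        print("processed order", msg["order"])

    def worker():
        # The worker tier listens for messages and does the slow part whenever it can.
        while True:
            msg = work.get()
            process_order(msg)   # could take minutes; nobody is blocked waiting on it
            work.task_done()

    threading.Thread(target=worker, daemon=True).start()
    print(front_end(42))
    work.join()   # in this sketch, just wait so the demo prints before exiting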

In many ways, Azure is meant to be very developer friendly.  In a lot of ways that’s good.  As a system admin, however, I wince every time they go on about “You can deploy your app to Azure just by right clicking in Visual Studio!!!”  Of course, that’s not how anyone with a responsibly controlled production environment would do it, but it certainly does make for fast easy adoption in development.   The curve for a developer who is “just” a C++/Java/.NET/whatever wrangler to get up and going on an IaaS solution like Amazon is pretty large comparatively; here, it’s “go sign up for an account and then click to deploy from your IDE, and voila it’s running on the Intertubes.”  So it’s a qualified good – it puts more pressure on you as an ops person to go get the developers to understand why they need to utilize your services.  (In a traditional server environment, they have to go through you to get their code deployed.)  Often, for good or ill, we use the release process as a touchstone to also engage developers on other aspects of their code that need to be systems engineered better.

Now, that’s my view of the major differences.  I think the usual Azure sales pitch would say something different – I’ve forgotten two of their huge differentiators, their service bus and access control components.  They are branded under the name “AppFabric,” which as usual is a name Microsoft is also using for something else completely different (a new true app server for Windows Server, including projects formerly code named Dublin and Velocity – think of it as a real WebLogic/WebSphere type app server plus memcache.)

Their service bus is an ESB.  As alluded to above, you’re going to want to use it to do messaging.   You can also use Azure Queues, which is a little confusing because the ESB is also a message queue – I’m not clear on their intended differentiation really.  You can of course just load up an ESB yourself in any other IaaS cloud solution too, so if you really want one you could do e.g. Apache ServiceMix hosted on Amazon.  But, they are managing this one for you which is a plus.  You will need to use it to do many of the common things you’d want to do.

Their access control – is a mess.  Sorry, Microsoft guys.  The whole rest of the thing, I’ve managed to cut through the “Microsoft acronyms versus the rest of the world’s terms and definitions” factor, but not here.   “You see, you use ACS’s WIF STS to generate a SWT,” says our Microsoft rep with a straight face.   They seem to be excited that it will use people’s Microsoft Live IDs, so if you want people to have logins to your site and you don’t want to manage any of that, it is probably nice.  It takes SAML tokens too, I think, though I’m not sure if the caveats around that end up equating to “Well, not really.”  Anyway, their explanations have been incoherent so far and I’m not smelling anything I’m really interested in behind it.  But there’s nothing to prevent you from just using LDAP and your own Internet SSO/federation solution.  I don’t count this against Microsoft because no one else provides anything like this, so even if I ignore the Azure one it doesn’t put it behind any other solution.

The Future

Microsoft has said they plan to add on some kind of VM/IaaS offering eventually because of the demand.  For us, the PaaS approach is a bit of a drawback – we want to do all kinds of things like “virus scan uploaded files,” “run a good load balancer,” “run an LDAP server”, and other things that basically require more full OS access.  I think we may have an LDAP direction with the all-Java OpenDS, but it’s a pain point in general.

I think a lot of their decisions that are a short term pain in the ass (no installs, no synchronous) are actually good in the long term.  If all developers knew how to develop async and did it by default, and if all software vendors, even Windows based ones, provided their product in a form that could just be “copy and run without admin privs” to install, the world would be a better place.  That’s interesting in that “Sure it’s hard to use now but it’ll make the world better eventually” is usually heard from the other side of the aisle.

Conclusion

Azure’s a pretty legit offering!  And I’m very impressed by their velocity.  I think it’s fair to say that overall Azure isn’t quite as good as Amazon except for specific use cases (you’re writing it all in .NET by hand in Visual Studio) – but no one else is as good as Amazon either (believe me, I evaluated them) and Amazon has years of head start; Azure is brand new but already at about 80%! That puts them into the top 5 out of the gate.

Without an IaaS component, you still can’t do everything under the sun in Azure.  But if you’re not depending on much in the way of big third party software chunks, it’s feasible; if you’re doing .NET programming, it’s very compelling.

Do note that I haven’t focused too much on the attributes and limitations of cloud computing in general here – that’s another topic – this article is meant to compare and contrast Azure to other cloud offerings so that people can understand its architecture.

I hope that was clear.  Feel free to ask questions in the comments and I'll try to clarify!
