Tag Archives: Cloud Computing

What Is Cloud Computing?

My recent post on how sick I am of people being confused by the basic concept of cloud computing quickly brought out the comments on “what cloud is” and “what cloud is not.” And the truth is, it’s a little messy, there’s not a clear definition, especially across “the three aaSes“. So now let’s have a post for the advanced students. Chip in with your thoughts!

Here’s my Grand Unified Theory of Cloud Computing. Rather than being a legalistic definition that will always be wrong for some instances of cloud, it attempts to convey the history and related concepts that inform the cloud.

The Grand Unified Theory of Cloud Computing

( ISP -> colo -> MSP ) + virtualization + HPC + (AJAX + SOAP -> REST APIs) = IaaS
(( web site -> web app -> ASP ) + virtualization + fast ubiquitous Internet + [ RIA browsers & mobile ] = SaaS
( IDEs & 4GLs ) + ( EAI -> SOA ) +  SaaS + IaaS = PaaS
[ IaaS | PaaS | SaaS ] + [ devops | open source | noSQL ] = cloud

* Note, I don’t agree with all those Wikipedia definitions, they are only linked to clue in people unsure about a given term

Sure, that’s where the cloud comes from, but “what is the cloud?” Well, here’s my thoughts, the Seven Pillars of Cloud Computing.  Having more of these makes something “more cloudy” and having fewer makes something “less cloudy.” Arguments over whether some specific offering “is cloud” or not, however, is for people without sufficiently challenging jobs.

The Seven Pillars of Cloud Computing

“The Cloud” may be characterized as:

  • An outsourced managed service
  • providing hosted computing or functionality
  • delivered over the Internet
  • offering extreme scalability
  • by using dynamically provisioned, multitenant, virtualized systems, storage, and applications
  • controlled via REST APIs
  • and billed in a utility manner.

You can remove one or more of these pillars to form most of the things people sell you as “private cloud,” for example, losing specific cloud benefits in exchange for other concerns.

Now there’s also the new vs old argument. There’s the technohipsters that say “Cloud is nothing new, I was doing that back in the ’90’s.” And some of that is true, but only in the most uninteresting way. The old and the new have, via alchemy, begun to help users realize benefits beyond what they did before.

Benefits of Cloud – What and How

Not New:

  • Virtualization
  • Outsourcing
  • Integration
  • Intertubes
Pretty New:

  • Multitenancy
  • Massively scalable
  • Elastic self provisioning
  • Pay as you go
Resulting Benefits:

  • agility
  • economy of scale
  • low initial investment
  • scalable cost
  • resilience
  • improved service delivery
  • universal access

Okay, Clouderati – what do you think?


Filed under Cloud

Inside Microsoft Azure

Recently, I delivered a presentation at the Austin Cloud User Group introducing them to Microsoft Azure.  I’m a UNIX bigot and have been doing the Amazon Cloud and open source thing, but we are delivering a product via Azure next so our team is learning it.  It’s actually quite interesting and has a number of good points; it’s mainly hindered by the Microsoft marketing message trying to pretend it’s all magical fairy dust instead of clearly explaining what it is and what it can do. So if you want to hear what Azure is in straight shooting UNIX admin speak, check it out!

1 Comment

Filed under Cloud

Austin Cloud User Group Nov 17 Meeting Notes

This month’s ACUG meeting was cooler than usual – instead of having one speaker talk on a cloud-related topic, we had multiple group members do short presentation on what they’re actually doing in the cloud.  I love talks like that, it’s where you get real rubber meets the road takeaways.

I thought I’d share my notes on the presentations.  I’ll write up the one we did separately, but I got a lot out of these:

  1. OData to the Cloud, by Craig Vyal from Pervasive Software
  2. Moving your SaaS from Colo to Cloud, by Josh Arnold from Arnold Marziani (previously of PeopleAdmin)
  3. DevOps and the Cloud, by Chris Hilton from Thoughtworks
  4. Moving Software from On Premise to SaaS, by John Mikula from Pervasive Software
  5. The Programmable Infrastructure Environment, by Peco Karayanev and Ernest Mueller from National Instruments (see next post!)

My editorial comments are in italics.  Slides are linked into the headers where available.

OData to the Cloud

OData was started by Microsoft (“but don’t hold that against it”) under the Open Specification Promise.  Craig did an implementation of it at Pervasive.

It’s a RESTful protocol for CRUDdy GET/POST/DELETE of data.  Uses AtomPub-based feeds and returns XML or JSON.  You get the schema and the data in the result.

You can create an OData producer of a data source, consume OData from places that support it, and view it via stuff like iPhone/Android apps.

Current producers – Sharepoint, SQL Azure, Netflix, eBay, twitpic, Open Gov’t Data Initiative, Stack Overflow

Current consumers – PowerPivot in Excel, Sesame, Tableau.  Libraries for Java (OData4J), .NET 4.0/Silverlight 4, OData SDK for PHP

It is easier for “business user” to consume than SOAP or REST.  Craig used OData4J to create a producer for the Pervasive product.

Questions from the crowd:

Compression/caching?  Nothing built in.  Though normal HTTP level compression would work I’d think. It does “page” long lists of results and can send a section of n results at a time.

Auth? Your problem.  Some people use Oauth.  He wrote a custom glassfish basic HTTP auth portal.

Competition?  Gdata is kinda like this.

Seems to me it’s one part REST, one part “making you have a DTD for your XML”.  Which is good!  We’re very interested in OData for our data centric services coming up.

Moving your SaaS from Colo to Cloud

Josh Arnold was from PeopleAdmin, now he’s a tech recruiter, but can speak to what they did before he left.  PeopleAdmin was a Sungard type colo setup.  Had a “rotting” out of country DR site.

They were rewriting their stack from java/mssql to ruby/linux.

At the time they were spending $15k/mo on the colo (not including the cost of their HW).  Amazon estimated cost was 1/3 that but really they found out after moving it’s 1/2.  What was the surprise cost?  Lower than expected perf (disk io) forced more instances than physical boxes of equivalent “size.”

Flexible provisioning and autoscaling was great, the colo couldn’t scale fast enough.  How do you scale?

The cloud made having an out of country DR site easy, and not have it rot and get old.

Question: What did you lose in the move?  We were prepared for mental “control issues” so didn’t have those.  There’s definitely advanced functionality (e.g. with firewalls) and native hardware performance you lose, but that wasn’t much.

They evalled Rackspace and Amazon (cursory eval).  They had some F5s they wanted to use and the ability to mix in real hardware was tempting but they mainly went straight to Amazon.  Drivers were the community around it and their leadership in the space.

Timeline was 2 years (rewrite app, slowly migrate customers).  It’ll be more like 3-4 before it’s done.  There were issues where they were glad they didn’t mass migrate everyone at once.

Technical challenges:

Performance was a little lax (disk performance, they think) and they ended up needing more servers.  Used tricks like RAIDed EBSes to try to get the most io they could (mainly for the databases).

Every customer had a SSL cert, and they had 600 of them to mess with.  That was a problem because of the 5 Elastic IP limit.  Went to certs that allow subsidiary domains – Digicert allowed 100 per cert (other CAs limit to much less) so they could get 100 per IP.

App servers did outbound LDAP conns to customer premise for auth integration and they usually tried to allow those in via IP rules in their corporate firewalls, but now on Amazon outbound IPs are dynamic.  They set up a proxy with a static (elastic) Ip to route all that through.

Rightscale – they used it.  They like it.

They used nginx for the load balancing, SSL termination.  Was a single point of failure though.

Remember that many of the implementations you are hearing about now were started back before Rackspace had an API, before Amazon had load balancers, etc.

In discussion about hybrid clouds, the point was brought up a lot of providers talk about it – gogrid, opsource, rackspace – but often there are gotchas.

DevOps and the Cloud

Chris Hilton from Thoughtworks is all about the DevOps, and works on stuff like continuous deployment for a living.

DevOps is:

  • collaboration between devs and operations staff
  • agile sysadmin, using agile dev tools
  • dev/ops/qa integration to achieve business goals

Why DevOps?

Silos.  agile dev broke down the wall between dev/qa (and biz).

devs are usually incentivized for change, and ops are incentivized for stability, which creates an innate conflict.

but if both are incentivized to deliver business value instead…

DevOps Practices

  • version control!
  • automated provisioning and deployment (Puppet/chef/rPath)
  • self healing
  • monitoring infra and apps
  • identical environments dev/test/prod
  • automated db mgmt

Why DevOps In The Cloud?

cloud requires automation, devops provides automation


  • “Continuous Delivery” Humble and Farley
  • Rapid and Reliable Releases InfoQ
  • Refactoring Databases by Ambler and Sadalage

Another tidbit: they’re writing puppet lite in powershell to fill the tool gap – some tool suppliers are starting, but the general degree of tool support for people who use both Windows and Linux is shameful.

Moving Software from On Premise to SaaS

John Mikula of Pervasive tells us about the Pervasive Data Cloud.  They wanted to take their on premise “Data Integrator” product, basically a command line tool ($, devs needed to implement), to a wider audience.

Started 4 years ago.  They realized that the data sources they’re connecting to and pumping to, like Quickbooks Online, Salesforce, etc are all SaaS from the get go.   “Well, let’s make our middle part the same!”

They wrote a Java EE wrapper, put it on Rackspace colo initally.

It gets a customer’s metadata, puts it on a queue, another system takes it off and process it.  A very scaling-friendly architecture.  And Rackspace (colo) wasn’t scaling fast enough, so they moved it to Amazon.

Their initial system had 2 glassfish front ends, 25 workers

For queuing, they tried Amazon SQS but it was limited, then went to Apache Zookeeper

First effort was about “deploy a single app” – namely salesforce/quickbooks integration.  Then they made a domain specific model and refactored and made an API to manage the domain specific entities so new apps could be created easily.

Recommended approach – solve easy problems and work from there.  That’s more than enough for people to buy in.

Their core engine’s not designed for multitenancy – have batches of workers for one guy’s code – so their code can be unsafe but it’s in its own bucket and doesn’t mess up anyone else.

Changing internal business processes in a mature company was a challenge – moving from perm license model to per month just with accounting and whatnot was a big long hairy deal.

Making the API was rough.  His estimate of a couple months grew to 6.  Requirements gathering was a problem, very iterative.  They weren’t agile enough – they only had one interim release and it wasn’t really usable; if they did it again they’d do the agile ‘right thing’ of putting out usable milestones more frequently to see what worked and what people really needed.

In Closing

Whew!  I found all the presentations really engaging and thank everyone for sharing the nuts and bolts of how they did it!

Leave a comment

Filed under Cloud

Velocity 2010 – Dueling Cloud Management Suppliers

Two cloud systems management suppliers talk about their bidness!  My comments in italics.

Cloud Autoscaling in Enterprise Computing by George Reese (enStratus Networks LLC)

How the Top Social Games Scale on the Cloud by Michael Crandell (RightScale, Inc)

I am more familiar with RightScale, but just read Reese’s great Cloud Application Architectures book on the plane here.  Whose cuisine will reign supreme?


Reese starts talking about “naive autoscaling” being a problem.  The cloud isn’t magic; you have to be careful.  He defines “enterprise” autoscaling as scaling that is cognizant of financial constraints and not this hippy VC-funded twitter type nonsense.

Reactive autoscaling is done when the system’s resource requirements exceed demand.  Proactive autoscaling is done in response to capacity planning – “run more during the day.”

Proactive requires planning.  And automation needs strict governors in place.

In our PIE autoscaling, we have built limits like that into the model – kinda like any connection pool.  Min, max, rate of increase, etc.

He says your controls shouldn’t be all “number of servers,” but be “budget” based.  Hmmm.  That’s ideal but is it too ideal?  And so what do you do, shut down all your servers if you get to the 28th of the month and you run out of cash?

CPU is not a scaling metric. Have better metrics tied to things that matter like TPS/response time.  Completely agree there; scaling just based on CPU/memory/disk is primitive in the extreme.

Efficiency is a key cloud metric.  Get your utilization high.

Here’s where I kinda disagree – it can often be penny wise and pound foolish.  In the name of “efficiency” I’ve seen people put a bunch of unrelated apps on one server and cause severe availability problems.  Screw utilization.  Or use a cloud provider that uses a different charging model – I forget which one it was, but we had a conf call with one cloud provider that only charged on CPU used, not “servers provisioned.”

Of course you don’t have to take it to an extreme, just roll down to your minimum safe redundancy number on a given tier when you can.

Security – well, you tend not to do some centralized management things (like add to Active Directory) in the cloud.  It makes user management hard.  Or just makes you use LDAP, like God intended.

Cloud bursting – scaling from on premise into the cloud.

Case study – a diaper company.  Had a loyalty program.  It exceeded capacity within an hour of launch.  Humans made a scaling decision to scale at the load balancing tier, and enStratus executed the auto-scale change.  They checked it was valid traffic and all first.

But is this too fiddly for many cases?  If you are working with a “larger than 5 boxes” kind of scale don’t you really want some more active automation?


The RightScale blog is full of good info!

They run 1.2 million cloud servers!  hey see things like 600k concurrent users, 100x scaling in 4 days, 15k instances, 1:2000 management ratio…

Now about gaming and social apps.  They power the top 10 Facebook apps.  They are an open management environment that lives atop the cloud suppliers’ APIs.

Games have a natural lifecycle where they start small, maybe take off, get big, eventually taper off.  It’s not a flat demand curve, so flat supply is ‘tarded.

During the early phase, game publishers need a cheap, fast solution that can scale.  They use Chef and other stuff in server templates for dynamic boot-time configuration.

Typically, game server side tech looks like normal Web stuff!  Apache+HAproxy LB, app servers, db cache (memcached), db (sharded mySQL master/slave pairs).  Plus search, queues, admin, logs.

Instance types – you start to see a lot of larger instances – large and extra large.  Is this because of legacy comfort issues?  Is it RAM needs?

CentOS5 dominates!  Generic images, configured at boot.  One company rebundles for faster autoscale.  Not much ubuntu or Windows.  To be agile you need to do that realtime config.

A lot of the boxes are used for databases.  Web/app and load balancing significant too.  There’s a RightScale paper showing a 100k packets per second LB limit with Amazon.

People use autoscaling a lot, but mainly for web app tier.  Not LBs because the DNS changing is a pain.  And people don’t autoscale their DBs.

They claim a lot lower human need on average for management on RightScale vs using the APIs “or the consoles.”  That’s a big or.  One of our biggest gripes with RightScale is that they consume all those lovely cloud APIs and then just give you a GUI and not an API.  That’s lame.  It does a lot of good stuff but then it “terminates” the programmatic relationship. [Edit: Apparently they have a beta API now, added since we looked at them.]

He disagrees with Reese – the problem isn’t that there is too much autoscaling, it’s that it has never existed.  I tend to agree. Dynamic elasticity is key to these kind of business models.

If your whole DB fits into memcache, what is mySQL for?  Writes sometimes?  NoSQL sounds cool but in the meantime use memcache!!!

The cloud has enabled things to exist that wouldn’t have been able to before.  Higher agility, lower cost, improved performance with control, anew levels of resiliency and automation, and full lifecycle support.

1 Comment

Filed under Cloud, Conferences, DevOps

CloudCamp Austin Is Soon!

Mark your calendars; Thursday of next week (June 10) is CloudCamp here in Austin!  It’s in North Austin at Pervasive’s offices (Riata Trace) from 5:30-10:00 PM.  Get details and sign up here.

Leave a comment

Filed under Cloud

Amazon Web Services – Convert To/From VMs?

In the recent Amazon AWS Newsletter, they asked the following:

Some customers have asked us about ways to easily convert virtual machines from VMware vSphere, Citrix Xen Server, and Microsoft Hyper-V to Amazon EC2 instances – and vice versa. If this is something that you’re interested in, we would like to hear from you. Please send an email to aws-vm@amazon.com describing your needs and use case.

I’ll share my reply here for comment!

This is a killer feature that allows a number of important activities.

1.  Product VMs.  Many suppliers are starting to provide third-party products in the form of VMs instead of software to ease install complexity, or in an attempt to move from a hardware appliance approach to a more-software approach.  This pretty much prevents their use in EC2.  <cue sad music>  As opposed to “Hey, if you can VM-ize your stuff then you’re pretty close to being able to offer it as an Amazon AMI or even SaaS offering.”  <schwing!>

2.  Leveraging VM Investments.  For any organization that already has a VM infrastructure, it allows for reduction of cost and complexity to be able to manage images in the same way.  It also allows for the much promised but under-delivered “cloud bursting” theory where you can run the same systems locally and use Amazon for excess capacity.  In the current scheme I could make some AMIs “mostly” like my local VMs – but “close” is not good enough to use in production.

3.  Local testing.  I’d love to be able to bring my AMIs “down to me” for rapid redeploy.  I often find myself having to transfer 2.5 gigs of software up to the cloud, install it, find a problem, have our devs fix it and cut another release, transfer it up again (2 hour wait time again, plus paying $$ for the transfer)…

4.  Local troubleshooting. We get an app installed up in the cloud and it’s not acting quite right and we need to instrument it somehow to debug.  This process is much easier on a local LAN with the developers’ PCs with all their stuff installed.

5.  Local development. A lot of our development exercises the Amazon APIs.  This is one area where Azure has a distinct advantage and can be a threat; in Visual Studio there is a “local Azure fabric” and a dev can write their app and have it running “in Azure” but on their machine, and then when they’re ready deploy it up.  This is slightly more than VM consumption, it’s VMs plus Eucalyptus or similar porting of the Amazon API to the client side, but it’s a killer feature.

Xen or VMWare would be fine – frankly this would be big enough for us I’d change virtualization solutions to the one that worked with EC2.

I just asked one of our developers for his take on value for being able to transition between VMs and EC2 to include in this email, and his response is “Well, it’s just a no-brainer, right?”  Right.

1 Comment

Filed under Cloud

Come to CloudCamp Austin 2!

The second CloudCamp in Austin is happening June 10.  It’s an unconference about, of course, cloud computing.  Read about it and sign up here!

I missed the first one but loved OpsCamp so I’m going!

Leave a comment

Filed under Cloud