Category Archives: Cloud

Cloud computing and all its permutations.

Velocity 2010: Cloud Security: It Ain’t All Fluffy and Blue Sky Out There!

Cloud security, bugbear of the masses.  For my last workshop of Velocity Day 1 I went to a talk on that topic.  I read some good stuff on it in Cloud Application Architectures on the plane in and could stand some more.  I “minor” in security, being involved in OWASP and all, and if there’s one area full of more FUD right now than cloud computing, it is cloud security.  Let’s see if they can dispel confusion!  (I hope it’s not a fluffy presentation that’s nothing but cloud pictures and puns; so many of these devolve into that.)

Anyway, Ward Spangenberg is Director of Security Operations for Zynga Game Networks, which does Farmville and Mafia Wars.  He gets to handle things like death threats.  He is a founding member of the Cloud Security Alliance ™.

Gratuitous Definition of Cloud Computing time!  If you don’t know it, then you don’t need to worry about it, and should not be reading this right now.

Cloud security is “a nightmare,” says a Cisco guy who wants to sell you network gear.  Why?  Well, it’s so complicated.  Security, performance, and availability are the top 3 rated challenges (read: fears) about the cloud model.

In general the main security fuss is because it’s something new.  Whenever there is anything new and uncharted all the risk averse types flip out.

With the lower level stuff (like IaaS), you can build in security, but with SaaS you have to “RFP” it in because you don’t have direct control.

Top threats to cloud computing:

  • Abuse/nefarious use
  • Insecure APIs
  • And more but the slide is gone.  We’ll go over it later, I hope.  Oh, here’s the list.

Multitenancy

The “process next door” may be acting badly, and with IPs being passed around and reused you can get blacklisted ones or get DoSsed from traffic headed to one.  No one likes to share.  You could get germs.  Anyway, they have to manage 13,000 IPs and whitelisting them is arduous.
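To make the recycled-IP problem concrete, here is a minimal sketch of screening a newly assigned address against a blocklist before putting it in service.  The blocklist contents and function names are hypothetical (the networks are reserved documentation ranges, not real data), and nothing here reflects the tooling the speaker actually uses.

```python
import ipaddress

# Hypothetical blocklist; these are reserved documentation ranges, not real data.
BLOCKLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocklisted(address, networks=BLOCKLISTED_NETWORKS):
    """Return True if the address falls inside any blocklisted network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


def usable_addresses(candidates):
    """Filter a pool of freshly assigned IPs down to the ones that look clean."""
    return [addr for addr in candidates if not is_blocklisted(addr)]
```

Calling usable_addresses() on each batch of addresses a provider hands you would flag the tainted ones up front instead of after the mail starts bouncing.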

Not Hosted Here Syndrome

You don’t have insight into locations and other “data center level” stuff.  Even if they have something good, like a SAS 70 certification, you still don’t have insight into who exactly is touching your stuff.  Azure is nice, but have you tried to get your logs?  You can’t see them.  Sad.

Management tools and development frameworks don’t have all the security features they should.  Toolsets are immature and stuff like forensics is nonexistent.  And PaaS environments that don’t upgrade quickly end up being a large attack surface for “known vulnerabilities.”  You can reprovision “quickly” but it’s not instantaneous.

DoS

Stuff like DDoS and botnets are classic abuse.  He says there’s “always something behind it” – people don’t just DoS you for no profit!  And only IaaS and PaaS should be concerned about it!  I think that’s quite an overstatement, especially for those of us who don’t run 13,000 servers – people do DoS for kicks and for someone with 100 or fewer servers, they can be effective at it.

Note “Clobbering the Cloud” from DefCon 17.

Insecure Coding

XSS, injection, CSRF, all the usual… Use the tools.  Validate input.  Review code.  And insecure crypto, because doing real crypto is hard.
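As an illustration of “validate input” (not something shown in the talk), here is a minimal sketch that combines an allowlist check with a parameterized query, using Python’s built-in sqlite3 module.  The table, the username rule, and the function names are all assumptions made up for the example.

```python
import re
import sqlite3

# Assumed rule for the example: 3-32 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,32}")


def validate_username(username):
    """Reject anything that does not match the allowlist pattern."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("invalid username")
    return username


def find_user(conn, username):
    """Look a user up with a bound parameter instead of string concatenation."""
    username = validate_username(username)
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchone()


# Throwaway in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("farmer_bob",))
```

The validation and the bound parameter are two separate defenses; either one alone would stop the classic quote-injection string, but doing both means one mistake doesn’t sink you.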

Malicious insiders/Pissy outsiders

Devs, consultants, and the cloud company.  You need redundant checks.  Need transparent review.

Shared Technology Issues

With a virtualized level, you can always potentially attack through it.  Check out Cloudburst and Red Pill/Blue Pill.

Data Loss and Leakage

Can happen.  Do what you would normally do to control it.  Encrypt some stuff.

Account or Service Hijacking

Users aren’t getting brighter.  Phishing etc. works great.  There’s companies like Damballa that work against this.  Malware is very smart in lots of cases, using metrics, self-improving.

Public deployment security impacts

Advantages – anonymizing effect, large security investments, pre-certification, multisite redundancy, fault tolerance.

Disadvantages – collateral damage, data & AAA security requirements, regulatory, multi-jurisdictional data stores, known vulnerabilities are global.

Going hybrid public/private helps some but increases complexity and adds data and credential exchange issues.

IaaS issues

Advantages: Control of encryption, minimized privileged user attacks, familiar AAA mechanisms, standardized and cross-vendor deployment, full control at VM level.

Disadvantages: Account hijacking, credential management, API security risks, lack of role based auth, full responsibility for ops, and dependence on the security of the virtualization layer.

PaaS Issues

Advantages: Less operational responsibility, multi-site business continuity, massive scale and resiliency, simpler compliance analysis, framework security features.

Disadvantages: Less operational control, vendor lockin, lack of security tools, increased likelihood of privileged user attack, cloud provider viability.

SaaS Issues

Advantages: Clearly defined access controls, vendor’s responsible for data center and app security, predictable scope of account compromise, integration with directory services, simplified user ACD.

Disadvantages: Inflexible reporting and features, lack of version control, inability to layer security controls, increased vulnerability to privileged user attacks, no control over legal discovery.

Q&A

If you are using something like Flash that goes in the client, how do you protect your IP?  You don’t.  Can’t.  It’ll get reverse engineered.  You can do some mitigations.  Try to detect it.  Sic lawyers on them.  Fingerprint code.

Yes, he plays all their games.

In the end, it’s about risk management.  You can encrypt all the data you put in the cloud, but what if they compromise the boxes you do the encryption on, or what if they try to crack your encryption with a whole wad of cloud boxes?  Yep.  It brings the real nature of security into clearer relief – it’s a continuum between stopping attacks by goons and being vulnerable to attacks by Chinese government and organized crime funded ninja Illuminati.

Can you make a cloud PCI compliant?  Sure.  Especially if you know how to “work” your QSA, because in the end there’s a lot of judgment calls in the audit process.  Lots of encryption even on top of SSL; public key crypt it from browser up using JS or something, then recrypt with an internal only key.  Use your payment provider’s facilities for hashing or 30-day authorizations and re-auth.  Throw the card number away ASAP and you’re good!  Protecting your keys is the main problem in the all-public cloud.  (Could you ssh-agent it, inject it right into memory of the cloud boxes from on premise?)
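To illustrate the “throw the card number away ASAP” idea, here is a rough sketch that keeps only a masked display value and a keyed hash of the number.  This is an assumption-laden illustration, not a PCI recipe from the talk: the function names are made up, the key handling is left out entirely, and whether a scheme like this satisfies an auditor is exactly the kind of judgment call mentioned above.

```python
import hashlib
import hmac


def mask_card_number(pan):
    """Return the card number with everything but the last four digits starred out."""
    digits = [c for c in pan if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def card_fingerprint(pan, secret_key):
    """Return a keyed SHA-256 hash so repeat cards can be matched without storing the number."""
    digits = "".join(c for c in pan if c.isdigit())
    return hmac.new(secret_key, digits.encode("ascii"), hashlib.sha256).hexdigest()
```

The secret key is the weak point, which is the same “protecting your keys” problem raised above; the sketch does nothing to solve it.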

Private cloud vs public cloud?  Well, with private you own the infrastructure.

This session was OK; I suspect most Velocity people expect something a little more technical.  There weren’t a lot of takeaways for an ops person – it was more of an ISSA or OWASP “technology decisionmaker”  focused presentation.  If he had just put in a couple hardcore techie things it would have helped.  As it was, it was a long list of security threats that are all existing system security threats too.  How’s this different?  What are some specific mitigations; many of these were offered as “be careful!”  Towards the end with the specific IaaS/PaaS/SaaS implications it got better though.

Filed under Cloud, Conferences, DevOps, Security

Another CloudCamp Austin Wrapup

James already posted, but I took notes too so here’s my thoughts!

CloudCamp was a great time.  Dave Nielsen did a great job facilitating it.  Pervasive Software hosted the shindig.  It started with Mike Hoskins, Pervasive CTO, telling us about how they started an “innovation lab” to reinvigorate Pervasive after being in business for 25 years, and that led to their DataCloud2 product hosted on EC2.

Then there were three lightning talks.

Barton James, Dell cloud evangelist, talked about the continuum from traditional compute to private cloud to public cloud, and how the midsection of that curve will shift over time to solidly center over private cloud.  I think that’s accurate; all the data center nonsense of the last number of years is certainly starting to convince us that you only want to manage hardware if there’s no other choice…  He talked about paths to the cloud: either starting with virtualization and then adding on capabilities until something is really cloud-ready, or just greenfielding something new (that’s what we’re doing!).  It was good; apparently Dell has thought more about the cloud since their original ill-conceived attempt to trademark it as a server name.

Oscar Padilla, a senior engineer with Pervasive, spoke about their path moving their existing software to the cloud (very interesting to us, since we’re doing the same) and the duality in being both a cloud consumer (Amazon IaaS) and a cloud provider (Pervasive’s SaaS product).  This is an increasingly common pattern; I’d say that being a SaaS provider and not using IaaS (unless you’re really huge) is likely a mistake on your part.  He also talked about the importance of adding an API so others can leverage your software – this is a huge point and it’s bizarre to me other people still aren’t getting this.

Finally, Walter Falk of IBM spoke about how the hybrid cloud is the bomb.  Hybrid cloud, or “cloud bursting,” is where you run your own nice and cheap local hardware for minimum loads and scale into the cloud for extra capacity.  He also showed a diagram indicating what kinds of workloads are low hanging fruit for cloudification (information intensive, isolated workloads, mature processes…  You’ve probably all seen the slide by now).  And he talked about how ecosystem is very important even for IBM – other people doing good stuff in the space.  “Go to ibm.com/cloud!”
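The cloud bursting idea boils down to a simple split: fill the local hardware first, then rent the overflow.  Here is a minimal sketch of that arithmetic; the function name, the requests-per-second capacity model, and the numbers are assumptions for illustration, not anything from the talk.

```python
import math


def plan_capacity(load_rps, rps_per_server, local_servers):
    """Split the servers needed for a load between local hardware and cloud burst."""
    needed = math.ceil(load_rps / rps_per_server)
    local = min(needed, local_servers)        # use the cheap local boxes first
    cloud = max(0, needed - local_servers)    # burst only for the overflow
    return {"needed": needed, "local": local, "cloud": cloud}
```

With eight local servers that each handle 100 requests per second, a load of 450 stays entirely local, while 1250 needs thirteen servers and bursts five into the cloud.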

Then we did a little impromptu panel thing, where I and some other folks were drafted up to answer questions.  This revealed something interesting, which is that a LOT of the people there were apparently coming from the cloud provider point of view, and had questions about power consumption and what hypervisor options there are.  As an IaaS consumer/SaaS provider, my main input there is “I don’t want to care about all that nonsense, thus I use IaaS!”   I answered a question about “how to define PaaS,” but my response was not thrilling enough to relate here.

Next came the conference sessions – we did the normal unconference thing of random people writing down topics and doing shows of hands on who cares about that.  The ones that got the largest response were Application Architecture for the Cloud and Systems Management for Cloud Consumers (the latter was mine; the panel gave me the heads up that I’d best add “consumers” to the end of that to not get stuck in storage-container-datacenter hell).

I didn’t go to Application Architecture for the Cloud but spoke to our guys that did and they did something that IMO should have been done in the larger group – did some quick demographics voting!  Bill, one of our devs, tells me that the responses were:

  • What language are you using?  2/3 Java, 1/3 .NET.
  • What cloud are you using?  Vast majority Amazon (even among the .NETters), notable minority Azure, trace amounts of others.
  • Are you internal IT or product focused?  50/50 split.
  • Are you using noSQL stuff?  A small number.
  • Are you using Rails?  No.
  • Are you using SOA/SOAP stuff?  No.
  • Are you using memcache?  A couple are, but more are doing app level caching with JPA or whatnot.

James covered the goings-on in Systems Management for the Cloud well; besides the specific tool takeaways I enjoyed the quote from one of the ServiceMesh guys about the practice of taking your traditional static infrastructure and just implementing it on the cloud without rearchitecting to take advantage of its dynamic nature is called “moving shit to shit.”  I was very impressed with the guys from ServiceMesh and from Pervasive that we met there; we’ve all already hooked up and done lunch to talk more.  All great guys doing some cutting edge stuff.

The last session was on Software to SaaS – taking existing software you sell for on premise use and turning it into a cloud offering.  Phil Fritz from IBM broke a lot of it down very accurately – there are some challenges from the customer side (trust, opex vs capex) but the vast majority of problems you face are internal.  And only a few of those internal issues are really technical in the “make it work in the cloud” sense; the rest are about metering, billing, the sales force not selling it because they don’t understand it or it’s against their usual commission model, forking of code and testing inefficiency (IBM has a strict rule that there’s not a separate SaaS branch of the software, you have to fold fixes into trunk, which is extremely wise).  This is all very good stuff – our main issues with bringing SaaS to market similarly haven’t been on the technical side; they’ve been the product marketers’ doubt, the “it’s not supported” in our ERP/billing system, sales and support staff education…

Then there was a wrapup, but it was like 10 at night on a weeknight so most of the norms had cleared out already.

In closing, it was an awesome event and we made some great contacts for further discussion.  Thanks to Dave and Pervasive for bringing CloudCamp to Austin, and I hope to see another soon!

Filed under Cloud

Austin Cloud Camp Wrap-up

Austin recently had a CloudCamp and my guess is that it drew in close to 100 attendees.

Before I get into the actual event, let me start this post with a brief story.

During the networking time, I committed one of the worst faux pas that one can make when networking: I tried a lame joke upon meeting someone new. One of the other attendees asked me why my company was interested in CloudCamp. I sarcastically replied to his inquisition by explaining that we were really excited about CloudCamp because we do a lot of work with weather instrumentation. Anything to do with clouds, we are so there… Silence.

Blink.

Another blink…. Fail.

At this point I explain that I am an idiot who makes sarcastic jokes that fail all the time, and I duck out to a different conversation. So, forgetting about my awkward sense of humor, let’s move on. Learn from me: don’t make weather jokes at a CloudCamp.

Notes from CloudCamp Austin

At any event, one of the best things that can happen is meeting people in your field. I was able to meet some cool guys in Austin with ServiceMesh and Pervasive. There are also beginning plans to start an AWS User Group in Austin which will be really awesome. Ping me if you want the scoop and I will let you know as I find anything out about it.

The talk I attended was led by the agile admin’s very own: Ernest Mueller. The notes from it are below.

Systems Management in the Cloud

One of the discussion points was how people were implementing dynamic scaling and what infrastructure they are wrapping around that.

Tools people are using in the cloud to achieve dynamic scaling in Amazon Web Services (AWS):

  • OSSEC for change control and security
  • Ganglia for reporting
  • Collectd for monitoring
  • Cron tasks for other reporting and metric gathering
  • Pentaho and Jasper for metrics
  • RESTful interface for the managed services layer; reporting also gets done via a RESTful service
  • Quartz scheduler to do scaling with metrics around what collectd is monitoring
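The scheduler-plus-metrics approach in the list above can be sketched as a small decision function that a scheduled job calls with recent samples.  This is only an illustration: the thresholds, limits, and function name are assumptions, and it does not show how anyone in the session actually wired Quartz to collectd.

```python
def desired_instance_count(current, cpu_samples, scale_up_at=75.0,
                           scale_down_at=25.0, minimum=2, maximum=20):
    """Decide how many instances to run based on average CPU percent."""
    if not cpu_samples:
        return current  # no data, so leave things alone
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_up_at:
        target = current + 1
    elif average < scale_down_at:
        target = current - 1
    else:
        target = current
    # Never scale outside the configured floor and ceiling.
    return max(minimum, min(maximum, target))
```

Stepping by one instance per run keeps the behavior boring and predictable; a real setup would probably also want a cooldown between changes.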

When monitoring, we have to start by understanding the perspective of the customers and then try to wrap monitors around that. Are we focused on user or provider? Infrastructure monitoring or application monitoring? The creator of the application that is deployed to the cloud and the environment can provide hooks for the monitoring platform. That means developers need to keep ops on their horizon early in the development phase.

This is a summary of what I saw at CloudCamp Austin, but I would love to hear what other sessions people went to and what the big takeaways were for them.

Filed under Cloud, DevOps

CloudCamp Austin Is Soon!

Mark your calendars; Thursday of next week (June 10) is CloudCamp here in Austin!  It’s in North Austin at Pervasive’s offices (Riata Trace) from 5:30-10:00 PM.  Get details and sign up here.

Filed under Cloud

Amazon Web Services – Convert To/From VMs?

In the recent Amazon AWS Newsletter, they asked the following:

Some customers have asked us about ways to easily convert virtual machines from VMware vSphere, Citrix Xen Server, and Microsoft Hyper-V to Amazon EC2 instances – and vice versa. If this is something that you’re interested in, we would like to hear from you. Please send an email to aws-vm@amazon.com describing your needs and use case.

I’ll share my reply here for comment!

This is a killer feature that allows a number of important activities.

1.  Product VMs.  Many suppliers are starting to provide third-party products in the form of VMs instead of software to ease install complexity, or in an attempt to move from a hardware appliance approach to a more-software approach.  This pretty much prevents their use in EC2.  <cue sad music>  As opposed to “Hey, if you can VM-ize your stuff then you’re pretty close to being able to offer it as an Amazon AMI or even SaaS offering.”  <schwing!>

2.  Leveraging VM Investments.  For any organization that already has a VM infrastructure, it allows for reduction of cost and complexity to be able to manage images in the same way.  It also allows for the much promised but under-delivered “cloud bursting” theory where you can run the same systems locally and use Amazon for excess capacity.  In the current scheme I could make some AMIs “mostly” like my local VMs – but “close” is not good enough to use in production.

3.  Local testing.  I’d love to be able to bring my AMIs “down to me” for rapid redeploy.  I often find myself having to transfer 2.5 gigs of software up to the cloud, install it, find a problem, have our devs fix it and cut another release, transfer it up again (2 hour wait time again, plus paying $$ for the transfer)…

4.  Local troubleshooting. We get an app installed up in the cloud and it’s not acting quite right and we need to instrument it somehow to debug.  This process is much easier on a local LAN with the developers’ PCs with all their stuff installed.

5.  Local development. A lot of our development exercises the Amazon APIs.  This is one area where Azure has a distinct advantage and can be a threat; in Visual Studio there is a “local Azure fabric” and a dev can write their app and have it running “in Azure” but on their machine, and then when they’re ready deploy it up.  This is slightly more than VM consumption, it’s VMs plus Eucalyptus or similar porting of the Amazon API to the client side, but it’s a killer feature.

Xen or VMWare would be fine – frankly this would be big enough for us I’d change virtualization solutions to the one that worked with EC2.

I just asked one of our developers for his take on value for being able to transition between VMs and EC2 to include in this email, and his response is “Well, it’s just a no-brainer, right?”  Right.

Filed under Cloud

Come to CloudCamp Austin 2!

The second CloudCamp in Austin is happening June 10.  It’s an unconference about, of course, cloud computing.  Read about it and sign up here!

I missed the first one but loved OpsCamp so I’m going!

Filed under Cloud

Amazon EC2 EBS Instances and Ephemeral Storage

Here’s a couple tidbits I’ve gleaned that are useful.

When you start an “instance-store” Amazon EC2 instance, you get a certain amount of ephemeral storage allocated and mounted automatically.  The amount of space varies by instance size and is defined here.  The storage location and format also vary by instance size and are defined here.

The upshot is that if you start an “instance-store” small Linux EC2 instance, it automagically has a free 150 GB /mnt disk and a 1 GB swap partition up and runnin’ for ya.  (mount points vary by image, but that’s where they are in the Amazon Fedora starter.)

[root@domU-12-31-39-00-B2-01 ~]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   1636668   8160252  17% /
/dev/sda2            153899044    192072 145889348   1% /mnt
none                    873828         0    873828   0% /dev/shm
[root@domU-12-31-39-00-B2-01 ~]# free
             total       used       free     shared    buffers     cached
Mem:       1747660      84560    1663100          0       4552      37356
-/+ buffers/cache:      42652    1705008
Swap:       917496          0     917496
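If you want to sanity-check those sizes, df -k reports 1K blocks, so dividing by 1024 × 1024 gives GiB.  Here is a small sketch that parses the output above; the function name is made up for the example, and the sample text is just the df listing from this post.

```python
def parse_df_k(output):
    """Parse `df -k` output into {mount_point: size_in_gib}."""
    sizes = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore wrapped or malformed lines
        blocks_1k = int(fields[1])
        sizes[fields[5]] = blocks_1k / (1024 * 1024)
    return sizes


SAMPLE = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             10321208   1636668   8160252  17% /
/dev/sda2            153899044    192072 145889348   1% /mnt
none                    873828         0    873828   0% /dev/shm
"""

sizes = parse_df_k(SAMPLE)
```

Here sizes["/mnt"] comes out to roughly 147 GiB, which is the “150 GB” disk in round numbers.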

But, you say, I am not old or insane!  I use EBS-backed images, just as God intended.  Well, that’s a good point.  But when you pull up an EBS image, these ephemeral disk areas are not available to you.  The good news is, that’s just by default.

The ephemeral storage is still available and can be used (for free!) by an EBS-backed image.  You just have to set the block devices up either explicitly when you run the instance or bake them into the image.

Runtime:

You refer to the ephemeral chunks as “ephemeral0”, “ephemeral1”, etc. – they don’t tell you explicitly which is which but basically you just count up based on your instance type (review the doc).  For a small image, it has an ephemeral0 (ext3, 150 GB) and an ephemeral1 (swap, 1 GB).  To add them to an EBS instance and mount them in the “normal” places, you do:

ec2-run-instances <ami id> -k <your key> --block-device-mapping '/dev/sda2=ephemeral0' \
  --block-device-mapping '/dev/sda3=ephemeral1'

On the instance you have to mount them – add these to /etc/fstab and mount -a or do whatever else it is you like to do:

/dev/sda3                 swap                    swap    defaults 0 0
/dev/sda2                 /mnt                    ext3    defaults 0 0

And if you want to turn the swap on immediately, “swapon /dev/sda3”.
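Since the command-line flags and the fstab entries have to agree with each other, one option is to generate both from a single table.  This is just a sketch: the layout table and function names are hypothetical, and the devices and mount points are the ones used in this post for a small instance.

```python
# (device, ephemeral name, mount point, filesystem type), as used in this post.
EPHEMERAL_LAYOUT = [
    ("/dev/sda2", "ephemeral0", "/mnt", "ext3"),
    ("/dev/sda3", "ephemeral1", "swap", "swap"),
]


def block_device_flags(layout):
    """Build the --block-device-mapping flags for the run command."""
    return ["--block-device-mapping '{}={}'".format(device, name)
            for device, name, _, _ in layout]


def fstab_lines(layout):
    """Build the matching /etc/fstab entries."""
    return ["{}\t{}\t{}\tdefaults 0 0".format(device, mount, fstype)
            for device, _, mount, fstype in layout]
```

Joining block_device_flags(EPHEMERAL_LAYOUT) with spaces gives the flags to paste after the AMI id, and fstab_lines(EPHEMERAL_LAYOUT) gives the lines to append to /etc/fstab.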

Image:

You can also bake them into an image.  Add a fstab like the one above and when you create the image, do it like this, using the exact same --block-device-mapping flag:

ec2-register -n <ami name> -d "AMI Description" --block-device-mapping '/dev/sda2=ephemeral0' \
  --block-device-mapping '/dev/sda3=ephemeral1' --snapshot your-snapname --architecture i386 \
  --kernel <aki id> --ramdisk <ari id>

Ta da. Free storage that doesn’t persist.  Very useful as /tmp space.  Opinion is split among the Linuxerati about whether you want swap space nowadays or not; some people say some mix of  “if you’re using more than 1.8 GB of RAM you’re doing it wrong” and “swapping is horrid, just let bad procs die due to lack of memory and fix them.”  YMMV.

Ephemeral EBS?

As another helpful tip, let’s say you’re adding an EBS to an image that you don’t want to be persistent when the instance dies.  By default, all EBSes are persistent and stick around muddying up your account till you clean them up.   If you don’t want certain EBS-backed drives to persist, what you do is of the form:

ec2-modify-instance-attribute --block-device-mapping "/dev/sdb=vol-f64c8e9f:true" i-e2a0b08a

Where ‘true’ means “yes, please, delete me when I’m done.”  This command throws a stack trace to the tune of

Unexpected error: java.lang.ClassCastException: com.amazon.aes.webservices.client.InstanceBlockDeviceMappingDescription
cannot be cast to com.amazon.aes.webservices.client.InstanceBlockDeviceMappingResponseDescription

But it works, that’s just a lame API tools bug.

Filed under Cloud, Uncategorized

Microsoft Azure for Dummies – or for Smarties?

What Is Microsoft Azure?

I’m going to attempt to explain Microsoft Azure in “normal Web person” language.  Like many of you, I am more familiar with Linux/open source type solutions, and like many of you, my first forays into cloud computing have been with Amazon Web Services.  It can often be hard for people not steeped in Redmondese to understand exactly what the heck they’re talking about when Microsoft people try to explain their offerings.  (I remember a time some years ago I was trying to get a guy to explain some new Microsoft data access thing with the usual three letter acronym name.  I asked, “Is it a library?  A language?  A protocol?  A daemon?  Branding?  What exactly is this thing you’re trying to get me to uptake?”  The reply was invariably “It’s an innovative new way to access data!”  Sigh.  I never did get an answer and concluded “Never mind.”)

Microsoft has released their new cloud offering, Azure.  Our company is a close Microsoft partner since we use a lot of their technologies in developing our company’s desktop software products, so as “cloud guy” I’ve gotten some in depth briefings and even went to PDC this year to learn more (some of my friends who have known me over the course of my 15 years of UNIX administration were horrified).  “Cloud computing” is an overloaded enough term that it’s not highly descriptive and it took a while to cut through the explanations to understand what Azure really is.  Let me break it down for you and explain the deal.

Point of Comparison: Amazon (IaaS)

In Amazon EC2, as hopefully everyone knows by now, you are basically given entire dynamically-provisioned, hourly-billed virtual machines that you load OSes on and install software and all that.  “Like servers, but somewhere out in the ether.”  Those kinds of cloud offerings (e.g. Amazon, Rackspace, most of them really) are called Infrastructure As A Service (IaaS).  You’re responsible for everything you normally would be, except for the data center work.  Azure is not an IaaS offering but still bears a lot of similarities to Amazon; I’ll get into details later.

Point of Comparison: Google App Engine (PaaS)

Take Google’s App Engine as another point of comparison.  There, you just upload your Python or Java application to their portal and “it runs on the Web.”  You don’t have access to the server or OS or disk or anything.  And it “magically” scales for you.  This approach is called Platform as a Service (PaaS).   They provide the full platform stack, you only provide the end application.  On the one hand, you don’t have to mess with OS level stuff – if you are just a Java programmer, you don’t have to know a single UNIX (or Windows) command to transition your app from “But it works in Eclipse!” to running on a Web server on the Internet.  On the other hand, that comes with a lot of limitations that the PaaS providers have to establish to make everything play together nicely.  One of our early App Engine experiences was sad – one of our developers wrote a Java app that used a free XML library to parse some XML.  Well, that library had functionality in it (that we weren’t using) that could write XML to disk.  You can’t write to disk in App Engine, so its response was to disallow the entire library.  The app didn’t work and had to be heavily rewritten.  So it’s pretty good for code that you are writing EVERY SINGLE LINE OF YOURSELF.  Azure isn’t quite as restrictive as App Engine, but it has some of that flavor.

Azure’s Model

Windows Azure falls between the two.  First of all, Azure is a real “hosted cloud” like Amazon Web Services, like most of us really think about when we think cloud computing; it’s not one of these on premise things that companies are branding as “cloud” just for kicks. That’s important to say because it seems like nowadays the larger the company, the more they are deliberately diluting the term “cloud” to stick their products under its aegis.  Microsoft isn’t doing that, this is a “cloud offering” in the classical (where classical means 2008, I guess) sense.

However, in a number of important ways it’s not like Amazon.  I’d definitely classify it as a PaaS offering.  You upload your code to “Roles” which are basically containers that run your application in a Windows 2008(ish) environment.  (There are two types – a “Web role” has a stripped down IIS provided on it, a “Worker role” doesn’t – the only real difference between the two.)  You do not have raw OS access, and cannot do things like write to the registry.  But, it is less restrictive than App Engine.  You can bundle up other stuff to run in Azure – even run Java apps using Apache Tomcat.  You have to be able to install whatever you want to run “xcopy only” – in other words, no fancy installers, it needs to be something you could just copy the files to a Windows PC, without administrative privilege, and run a command from the command line and have it work.  Luckily, Tomcat/Java fits that description. They have helper packs to facilitate doing this with Tomcat, memcached, and Apache/PHP/MediaWiki.  At PDC they demoed Domino’s Pizza running their Java order app on it and a WordPress blog running on it.  So it’s not only for .NET programmers.  Managed code is easier to deploy, but you can deploy and run about anything that fits the “copy and run command line” model.

I find this approach a little ironic actually.  It’s been a lot easier for us to get the Java and open source (well, the ones with Windows ports) parts of our infrastructure running on Azure than the Windows parts!  Everybody provides Windows stuff with an installer, of course, and you can’t run installers on Azure.  Anyway, in its core computing model it’s like Google App Engine – it’s more flexible than that (good) but it doesn’t do automatic scaling (bad).  If it did autoscaling I’d be willing to say “It’s better than App Engine in every way.”

In other ways, it’s a lot like Amazon.  They offer a variety of storage options – blobs (like S3), tables (like SimpleDB), queues (like SQS), drives (like EBS), SQL Azure (like RDS).  They have an integral CDN.  They do hourly billing.  Pricing is pretty similar to Amazon – it’s hard to totally equate apples to apples, but Azure compute is $0.12/hr and an Amazon small Windows image compute is $0.12/hr (Coincidence?  I think not.).  And you have to figure out scaling and provisioning yourself on Amazon too – or pay a lot of scratch to one of the provisioning companies like RightScale.
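To put the hourly rate in monthly terms, here is a quick back-of-the-envelope sketch.  The $0.12/hr figure is the one quoted above; the 30-day month, the always-on assumption, and the function name are mine, and storage and bandwidth charges are ignored.

```python
def monthly_compute_cost(hourly_rate, instances=1, hours_per_day=24, days=30):
    """Estimate compute cost in dollars for always-on instances over one month."""
    return round(hourly_rate * instances * hours_per_day * days, 2)
```

At $0.12/hr that is $86.40 a month for a single always-on instance on either platform, so a dozen of them runs just over a thousand dollars before storage and bandwidth.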

What’s Unique and Different

Well, the largest thing that I’ve already mentioned is the PaaS approach.  If you need OS level access, you’re out of luck;  if you don’t want to have to mess with OS management, you’re in luck!  So to the first order of magnitude, you can think of Azure as “like Amazon Web Services, but the compute uses more of a Google App Engine model.”

But wait, there’s more!

One of the biggest things that Azure brings to the table is that, using Visual Studio, you can run a local Azure “fabric” on your PC, which means you can develop, test, and run cloud apps locally without having to upload to the cloud and incur usage charges.  This is HUGE.  One of the biggest pains about programming for Amazon, for instance, is that if you want to exercise any of their APIs, you have to do it “up there.”  Also, you can’t move images back and forth between Amazon and on premise.  Now, there are efforts like EUCALYPTUS that try to overcome some of this problem but in the end you pretty much just have to throw in the towel and do all dev and test up in the cloud.  Amazon and Eclipse (and maybe Xen) – get together and make it happen!!!!

Here’s something else interesting.  In a move that seems more like a decision from a typical cranky cult-of-personality open source project, they have decided that proper Web apps need to be asynchronous and message-driven, and by God that’s what you’re going to do.  Their load balancers won’t do sticky sessions (only round robin) and time out all connections between all tiers after 60 seconds without exception.  If you need more than that, tough – rewrite your app to use a multi-tier message queue/event listener model.  Now on the one hand, it’s hard for me to disagree with that – I’ve been sweating our developers, telling them that’s the correct best-practice model for scalability on the Web.  But again you’re faced with the “Well what if I’m using some preexisting software and that’s not how it’s architected?” problem.  This is the typical PaaS pattern of “it’s great, if you’re writing every line of code yourself.”
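For anyone who hasn’t built one, the “message queue/event listener” shape looks roughly like the sketch below: the front tier drops jobs on a queue and returns right away, and workers pick them up on their own schedule.  This uses Python’s in-process queue and threads purely to show the pattern; it is not Azure’s queue API, and the function names are made up.

```python
import queue
import threading


def run_workers(jobs, handler, worker_count=3):
    """Push jobs onto a queue and let a pool of workers process them."""
    work_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = work_queue.get()
            try:
                if job is None:  # sentinel: no more work for this worker
                    return
                outcome = handler(job)
                with results_lock:
                    results.append(outcome)
            finally:
                work_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work_queue.put(job)
    for _ in threads:
        work_queue.put(None)  # one sentinel per worker
    work_queue.join()
    for thread in threads:
        thread.join()
    return results
```

Results come back in whatever order the workers finish, which is the point: nothing holds a connection open waiting on a slow job, so a 60 second timeout stops mattering.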

In many ways, Azure is meant to be very developer friendly.  In a lot of ways that’s good.  As a system admin, however, I wince every time they go on about “You can deploy your app to Azure just by right clicking in Visual Studio!!!”  Of course, that’s not how anyone with a responsibly controlled production environment would do it, but it certainly does make for fast, easy adoption in development.  The learning curve for a developer who is “just” a C++/Java/.NET/whatever wrangler to get up and going on an IaaS solution like Amazon is pretty steep by comparison; here, it’s “go sign up for an account and then click to deploy from your IDE, and voila it’s running on the Intertubes.”  So it’s a qualified good – it puts more pressure on you as an ops person to go get the developers to understand why they need to utilize your services.  (In a traditional server environment, they have to go through you to get their code deployed.)  Often, for good or ill, we use the release process as a touchstone to also engage developers on other aspects of their code that need to be systems engineered better.

Now, that’s my view of the major differences.  I think the usual Azure sales pitch would say something different – I’ve forgotten two of their huge differentiators, their service bus and access control components.  They are branded under the name “AppFabric,” which as usual is a name Microsoft is also using for something else completely different (a new true app server for Windows Server, including projects formerly code named Dublin and Velocity – think of it as a real WebLogic/WebSphere type app server plus memcache.)

Their service bus is an ESB.  As alluded to above, you’re going to want to use it to do messaging.   You can also use Azure Queues, which is a little confusing because the ESB is also a message queue – I’m not clear on their intended differentiation really.  You can of course just load up an ESB yourself in any other IaaS cloud solution too, so if you really want one you could do e.g. Apache ServiceMix hosted on Amazon.  But, they are managing this one for you which is a plus.  You will need to use it to do many of the common things you’d want to do.

Their access control – is a mess.  Sorry, Microsoft guys.  The whole rest of the thing, I’ve managed to cut through the “Microsoft acronyms versus the rest of the world’s terms and definitions” factor, but not here.   “You see, you use ACS’s WIF STS to generate a SWT,” says our Microsoft rep with a straight face.   They seem to be excited that it will use people’s Microsoft Live IDs, so if you want people to have logins to your site and you don’t want to manage any of that, it is probably nice.  It takes SAML tokens too, I think, though I’m not sure if the caveats around that end up equating to “Well, not really.”  Anyway, their explanations have been incoherent so far and I’m not smelling anything I’m really interested in behind it.  But there’s nothing to prevent you from just using LDAP and your own Internet SSO/federation solution.  I don’t count this against Microsoft because no one else provides anything like this, so even if I ignore the Azure one it doesn’t put it behind any other solution.

The Future

Microsoft has said they plan to add on some kind of VM/IaaS offering eventually because of the demand.  For us, the PaaS approach is a bit of a drawback – we want to do all kinds of things like “virus scan uploaded files,” “run a good load balancer,” “run an LDAP server”, and other things that basically require more full OS access.  I think we may have an LDAP direction with the all-Java OpenDS, but it’s a pain point in general.

I think a lot of their decisions that are a short term pain in the ass (no installs, no synchronous) are actually good in the long term.  If all developers knew how to develop async and did it by default, and if all software vendors, even Windows based ones, provided their product in a form that could just be “copy and run without admin privs” to install, the world would be a better place.  That’s interesting in that “Sure it’s hard to use now but it’ll make the world better eventually” is usually heard from the other side of the aisle.

Conclusion

Azure’s a pretty legit offering!  And I’m very impressed by their velocity.  I think it’s fair to say that overall Azure isn’t quite as good as Amazon except for specific use cases (you’re writing it all in .NET by hand in Visual Studio) – but no one else is as good as Amazon either (believe me, I evaluated them) and Amazon has years of head start; Azure is brand new but already at about 80%! That puts them into the top 5 out of the gate.

Without an IaaS component, you still can’t do everything under the sun in Azure.  But if you’re not depending on much in the way of big third party software chunks, it’s feasible; if you’re doing .NET programming, it’s very compelling.

Do note that I haven’t focused too much on the attributes and limitations of cloud computing in general here – that’s another topic – this article is meant to compare and contrast Azure to other cloud offerings so that people can understand its architecture.

I hope that was clear.  Feel free to ask questions in the comments and I’ll try to clarify!

Filed under Cloud

Beware the Deceptive SLA, My Friend

We’re trying to come to an agreement with a SaaS vendor about performance and availability service level agreements (SLAs).  I discussed this topic some in my previous “SaaS Headaches” post.  I thought it would be instructive to show people the standard kind of “defense in depth” that suppliers can have to protect against being held responsible for what they host for you.

We’ve been working on a deal with one specific supplier.  As part of it, they’ll be hosting some images for our site.  There’s a business team primarily responsible for evaluating their functionality etc.; we’re just in the mix as the faithful watchdogs of performance and availability for our site.

Round 1 – “What are these SLAs you speak of?”  The vendor offers no SLA.  “Unacceptable,” we tell the project team.  They fret about having to worry about that along with the 100 other details of coming to an agreement with the supplier, but duly go back and squeeze them.  It takes a couple of squeezes because the supplier likes to forget about this topic – send a list of five questions with one of them being “SLA,” and you get four answers back, ignoring the SLA question.

Round 2 – “Oh, you said ‘SLA’!  Oh, sure, we have one of those.”  We read the SLA and it only commits to their main host being pingable.  Our service could be completely down, and it doesn’t speak to that.  Back to our project team, who now between the business users, procurement agent, and legal guy need more urging to lean on the supplier.  The supplier plays dumb for a while, and then…

Filed under Cloud, General

Cloud Headaches?

The industry is abuzz with people who are freaked out about the outages that Amazon and other cloud vendors have had.  “Amazon S3 Crash Raises Doubts Among Cloud Customers,” says InformationWeek!

This is because people are going into cloud computing with absurdly high expectations.  This year at Velocity, Interop, etc. I’ve seen people just totally in love with cloud computing – Amazon’s specifically but in general as well.  And it’s a good concept for certain applications.  However, it is a computing system just like every other computing system ever devised.  And it has, and will have, problems.

Whether you are using in house systems, or a SaaS vendor, or building “in the cloud,” you have the same general concerns.  Am I monitoring my systems?  What is my SLA?  What is my recourse if my system is not hitting it?  What’s my DR plan?

SaaS is a special case of cloud computing in general.  And if you’re a company relying on it, when you contract with a SaaS vendor you get SLAs established and figure out what the remedy is if they breach it.  If you are going into a relationship where you are just paying money for a cloud VM, storage, etc. and there is no enforceable SLA in the relationship, then you need to build the risk of likely and unremediable outages into your business plan.
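If it helps to put numbers on “what is my SLA, and am I hitting it,” here is a small sketch of the arithmetic.  The function names, the 30 day period, and the sample figures are all assumptions for illustration, not anything out of a real vendor contract; a real SLA will define its own measurement window and what counts as an outage.

```python
def downtime_budget_minutes(sla_percent, period_days=30):
    """Minutes of outage allowed in the period before the SLA is missed."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


def sla_breached(sla_percent, outage_minutes, period_days=30):
    """True if the observed outage time exceeds the allowed budget."""
    return outage_minutes > downtime_budget_minutes(sla_percent, period_days)


if __name__ == "__main__":
    # Hypothetical example: a 99.9% monthly target and one hour-long outage.
    print(round(downtime_budget_minutes(99.9), 1))
    print(sla_breached(99.9, outage_minutes=60))
```

A 99.9% target over 30 days leaves roughly 43 minutes of slack, so a single hour-long outage blows through it.  If there is no enforceable SLA at all, that budget is effectively whatever the vendor feels like, which is the risk you need to plan for.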

I hate to break it to you, but the IT people working at Amazon, Google, etc. are not all that much smarter than the IT people working with you.  So an unjustified faith in a SaaS or cloud vendor – “Oh, it’s Amazon, I’m sure they’ll never have an outage of any sort – either across their entire system or localized to my part of it – and if they do I’m sure the $100/month I’m paying them will cause them to give a damn about me” – is an unreasonable expectation on its face.

Clouds and cloud vendors are a good innovation.  But they’re like every other computing innovation and vendor selling it to you.  They’ll have bugs and failures.  But treating them as if they won’t is a failure on your part, not theirs.

Filed under Cloud, Uncategorized