Velocity 2010: Ignite!

Welcome to the bonus round.  Tonight there is an Ignite! round, which is where a bunch of people (14 in this case) do super quick presentations, 20 slides at 15 seconds per slide, on interesting things.

Alex Polvi from Cloudkick on Cloud Data Porn!

James just talked to these guys before we came out here and we’re supposed to talk with him.  The service looks boss (it’s cloud based cloud monitoring).  And he has the “best hair in cloud computing!”

Amazon vs Rackspace vs Slicehost stats.  Machines turned off – on Slicehost people keep them up, and on Amazon they all eventually get turned off.  Amazon has larger RAM values and Slicehost has larger disk, and Amazon has larger % disk used.

Based on that – he guesses AWS has 230k hosts online; the others have way way fewer.  700 TB of memory; Rackspace and Slicehost are more in the 150 range.  But they have more like 5 PB of disk and Amazon has less than half.

And he calcu-guesses EC2 (us-east) has about 11,000 physical hosts, and they only get about 10 VMs per host while the others get more.

Justin Huff on How to do a Triathlon!

No, really.  Swim, bike, run.  And you get to follow some hot buns.

Brent Chapman on Netomata Automates Network Configuration

If you do something right 95% of the time, when you do it 6 times it’s only 74% correct.  So automated networks are better.  More reliable from consistent configs.  Doesn’t rely on personal consistency.  Easier to maintain and troubleshoot.  Easier to scale.
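The 74% figure is just compounding: if each manual step is independently right 95% of the time, all six are right 0.95^6 of the time.  A quick sketch of that arithmetic, assuming the steps are independent (the function name is made up for illustration):

```python
def all_correct_probability(per_step_success: float, steps: int) -> float:
    """Probability that every one of `steps` independent attempts succeeds."""
    return per_step_success ** steps


p = all_correct_probability(0.95, 6)
print(f"{p:.1%}")  # prints 73.5%, which rounds to the 74% quoted in the talk
```

Push it to 20 steps and the odds of a fully correct run drop to about 36%, which is the whole argument for automating.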

Automation cycle – design, generate configs, deploy, control via change control, feedback loop.  Netomata does the generation – you put in a network description and config templates (in Ruby) and it generates configs.  And it’s open source and community sharing based.  “Chef for Networks!”
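To make the "description plus templates in, configs out" idea concrete, here is a minimal sketch in Python.  It is not Netomata's template syntax or data model (Netomata uses Ruby); the template text, the port list, and generate_config() are all hypothetical stand-ins.

```python
from string import Template

# Hypothetical template for one switch port; not Netomata's template syntax.
PORT_TEMPLATE = Template(
    "interface $name\n"
    " description $description\n"
    " switchport access vlan $vlan\n"
)

# Hypothetical network description: one dict per port.
NETWORK = [
    {"name": "Gi0/1", "description": "web01", "vlan": 10},
    {"name": "Gi0/2", "description": "web02", "vlan": 10},
    {"name": "Gi0/3", "description": "db01", "vlan": 20},
]


def generate_config(ports):
    """Render every port through the same template so the output is consistent."""
    return "!\n".join(PORT_TEMPLATE.substitute(port) for port in ports)


config = generate_config(NETWORK)
print(config)
```

Every port comes out of the same template, so consistency no longer depends on whoever is typing that day.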

Amy Lightholder on How to Network like a Ninja When You’re Nobody

Aka “the glad-handing dandy HOWTO”.  Scope out the event in advance – people, venue, etc.  Check in early.  Be friendly, real, yourself, etc.  Volunteer for things.  Hell, just pitch in.  And twitter your uterus off – read and contribute to the hashtag stream (hint: use tweetchat).  Be funny, useful, smart, etc.  Give attention – listen.  Get attention by organizing impromptu events or taking pictures and posting them.  Use “foursquare”, it’s like Yelp checkins but more generic.  Get or give rides (even in your trunk).  Find workout buddies.  Don’t eat alone – get people.  Take breaks to make notes and do followup.  Send invitations to LinkedIn and whatnot now, not later.  And have fun.  Be fun, not a snob.

Mark Lin, on Metrics Simplified

Making sense of all those metrics is a challenge.  You have to know your system – you can only improve to “five nines” if you do.  Sending/collecting metrics is complicated.  Doing a poll based collection server isn’t fun.  They did it using graphite, rabbitmq, a graphite local proxy, and something internal called RockSteady that uses the Esper CEP engine.  They stick events to a port through netcat into a RabbitMQ server.  Then they graph it in graphite.  The graphs have lines for version changes vs latencies and metrics.  Graph = post event forensics.  Rocksteady treats each metric as an event.  Has SQL-like syntax.  Auto thresholding and prediction.  Correlation and determine dependencies.  Capture metrics when something crosses a threshold.  Assemble timing info per request.  Actual time spent in each component in an app.

Make metric sending simple, have a nice UI to make sense of the data, and real time processing of  the metrics rocks.
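For a sense of how simple "metric sending" can be, here is a sketch using Graphite's plaintext line format (metric path, value, Unix timestamp, newline).  The talk did not show the actual message format they push through netcat and RabbitMQ, so this is an assumption; the metric name, host, and port below are placeholders.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line in Graphite's plaintext format: '<path> <value> <timestamp>'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    """Send one line over TCP, roughly what piping it into netcat would do.

    The host and port are placeholders for wherever your collector listens.
    """
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name and fixed timestamp, for illustration only.
line = format_metric("app.checkout.latency_ms", 42.5, timestamp=1277251200)
print(line, end="")
```

send_metric() is defined but not called here, since it needs a listener on the other end.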

That all sounded cool and I wish I knew what the hell he was talking about.  I’ll sic Peco on hunting him down and talking metrics!

Petascale Storage by Tim Lossen from Germany

Data is growing, and it’s expensive to store.  Could we use open source hardware to get a good “return on byte?”  In the Open Storage Pod, he used laptop disks.  Denser, cooler, less power.  Use port multipliers.  Made 20 TB nodes and put them into 6-node pods in a single rackmount enclosure (4U).  A rack has 10 pods = 1.2 PB.  A container has 8 racks.  diaspora! (?) using PCMCIA.  5 TB cube.  openstoragepod.org.
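The capacity numbers stack up like this, using only the figures from the talk and assuming decimal units (1 PB = 1000 TB); the container total is my own multiplication, not a number from the slides:

```python
NODE_TB = 20             # capacity of one node, from the talk
NODES_PER_POD = 6        # nodes in one 4U pod
PODS_PER_RACK = 10
RACKS_PER_CONTAINER = 8

pod_tb = NODE_TB * NODES_PER_POD               # 120 TB per pod
rack_pb = pod_tb * PODS_PER_RACK / 1000        # 1.2 PB per rack
container_pb = rack_pb * RACKS_PER_CONTAINER   # 9.6 PB per container

print(pod_tb, rack_pb, container_pb)
```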

When Social Edges Go Black by Jesper Andersen

Social software messes you up when things go wrong.  Info has emotional content, but social software just passes it on through.  He made a site called Avoidr that uses Foursquare to avoid people.  Facebook similarly, keeps bugging you to refriend people who hurt you.

Matching systems work well.  What if we match complementary emotions and make “The Forgiveness Engine,” based on the confessional idea. certaintylabs.com

Anderson Lin on Ten Things You Have To Do Differently In China

It’s a huge emerging Internet market.

10.  You need a license (ICP) to Web in China.  It used to be a rubber stamp and now not so much.

9.   You have to get in country.  Bandwidth going in is all jacked up.  Internal is 5x faster.

8.  Not all IDC (data centers) are created equal.  Some are very ghetto.

7.  You need multiple carriers to cover China – north and south and they don’t talk.  China Telecom and China Unicom, you have to buy from both.

6.  The “China Price” of bandwidth is high.

5.  No Visa, no Mastercard, no checks.  Debit cards only – UnionPay, Alipay, COD.

4.  You have to reach a young audience – 30% under 18.

3.  Do you qq?  Massive IM protocol in China.

2.  You have to get used to regulations.

1.  Now for the really bad news… IE6 is 60-70% of the market.  Everyone caters to it.

libcloud: A Unified Interface to the Cloud by Paul Querna, Cloudkick

libcloud is a Python library that brokers to cloud APIs – Amazon, Rackspace, Softlayer – they all use different API technologies and that’s obnoxious.  It does compute, not storage and stuff.  Cloud API standards haven’t worked, so it’s translation time.  It supports 16 cloud providers.  It has a simple API – list_nodes(), reboot, destroy, create.

Can do neat tricks like get locations, price per hour, and boot a box in the currently cheapest location!
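The "cheapest location" trick boils down to a min() over whatever price data the providers hand back.  This sketch shows only that selection step with made-up offers; it does not call libcloud, and the Offer class, provider names, and prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str          # hypothetical provider label
    location: str          # hypothetical location name
    price_per_hour: float  # made-up price in USD


def cheapest(offers):
    """Return the offer with the lowest hourly price."""
    return min(offers, key=lambda offer: offer.price_per_hour)


# Made-up numbers for illustration; real prices would come from the providers.
OFFERS = [
    Offer("provider-a", "us-east", 0.085),
    Offer("provider-b", "dallas", 0.060),
    Offer("provider-c", "london", 0.095),
]

best = cheapest(OFFERS)
print(f"boot in {best.location} on {best.provider} at ${best.price_per_hour:.3f}/hour")
```

With a unified library in front of the providers, the list would be built from live data and the winner handed straight to the node-creation call.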

Don’t update a wiki of hosts; run pssh on the output of list_nodes().

Fabric + libcloud to pull data.

Silver Lining: python Django deployment on the cloud built on top of libcloud.

Mercury: Drupal deployment on the cloud, same deal.

Image formats – there is no standard yet – AMIs and dd’ing.

Experimenting with Java version in progress.

It’s open source (Apache).

  • JClouds – Java-world equivalent
  • Apache Deltacloud – Ruby-ish equivalent
  • Fog (Ruby)

libcloud.org

Web Performance Across the HTTP to HTTPS Transition by Sean Walbran, Digital River

HTTP is all great, but when you go HTTPS for encryption things go awry.  Only use it for sharing secrets.  Performance at the transition is critical.  It’s slow by default.  Not just the encryption overhead, but interaction with the CDN and browser cache and new network connection on port 443.  You can try to connect ahead on mouseover.  Do SSL offload.  And HTTPS gets LRUed out of cache.  Use different domains and prefetch.  Try to leverage the browser cache, but the browser doesn’t believe your previously cached stuff.  Set Cache-Control: public.  IE is even worse.  Use prefetching while they’re browsing insecurely.  With Firefox+jQuery you can try to prefetch, but you get zero-byte responses and it’s hinky.

Chuck Kindred on The ABCs of Gratitude

Things he’s grateful for, to purge his soul.

Mandi Walls on Lies, Damn Lies, and Statistics

Turn your intuition into data.  Means and medians you know.  Standard deviation, significance, and regression are the next level.

Mean is a way of addressing a metric across a large group.  But it hides outliers if it’s the only metric.  Mean vs median difference shows outlier pull.  A normal distribution is defined by the mean and standard deviation.  Standard deviation is a measurement of the spread of the distribution.  In Excel, the Data Analysis add-on has a “Descriptive Statistics” option that can pull all of that.

If your data is pretty normal, 68% of it is within one standard deviation of the mean.  The NORMDIST function tells you how far out you are.
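If you'd rather poke at this outside Excel, Python's standard statistics module covers the same ground.  A small sketch with made-up latency numbers (one deliberate outlier) showing the mean/median gap and the 68% rule; NormalDist needs Python 3.8 or later.

```python
from statistics import NormalDist, mean, median, stdev

# Hypothetical response times in ms; the 900 is a deliberate outlier.
latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 900]

avg = mean(latencies_ms)
mid = median(latencies_ms)
spread = stdev(latencies_ms)
print(f"mean={avg:.1f} median={mid} stdev={spread:.1f}")

# Share of a normal distribution within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sigma = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"{within_one_sigma:.1%}")  # prints 68.3%
```

The mean lands near 189 ms while the median stays at 100 ms, which is the "outlier pull" from the talk.  NormalDist(...).cdf(x) plays roughly the role of Excel's cumulative NORMDIST.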

Regression shows relationships between sets of data.  CPU vs hits per minute for example.  “R square” – close to 1 means relationship!
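For a simple one-variable linear fit, R squared is the square of the correlation coefficient, so it can be computed by hand.  A sketch with made-up hits-per-minute and CPU numbers (the data and the r_squared() helper are illustrative, not from the talk):

```python
def r_squared(xs, ys):
    """R squared for a simple linear fit of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical measurements: hits per minute vs CPU percent.
hits_per_minute = [100, 200, 300, 400, 500]
cpu_percent = [12, 21, 33, 39, 52]

r2 = r_squared(hits_per_minute, cpu_percent)
print(f"R squared = {r2:.3f}")
```

Here R squared comes out around 0.99, so CPU tracks traffic closely; a value near 0 would mean hits per minute tells you little about CPU.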

Billy Hoffman on The 2010 state of web performance.  Zoompf!

Evaluated the Alexa top 1000 to see what they’re doing in terms of perf optimization.  There’s a lot of bloat.  78% of sites aren’t compressing right.  80% of sites are not optimizing images (and pngcrush/jpegtran get you 15% improvement).
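To see why skipping compression is such an easy miss, here is a tiny sketch that gzips a chunk of repetitive markup.  The HTML is made up, and real pages won't compress this well, so treat the ratio as an illustration rather than a benchmark.

```python
import gzip

# Made-up, highly repetitive markup; real pages compress less dramatically.
html = ('<li class="item">Hello, Velocity!</li>\n' * 200).encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
print(f"{len(html)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")
```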

Too many requests!!  30% of all static resources aren’t being cached.  Don’t forget there’s a cache somewhere in the middle usually, but query strings and stuff bust them.  Combine!

Silly requests – like for grey images instead of defining colors.  JS with no executable.  CSS with no rules.  1 in 5 sites are giving PHP without executing, or other 404s, 500s, etc.

But, the companies were clearly doing something.  But we suck at it.  We’re failing the basics.  Why can’t we do crap that’s a decade old?  Manage it, apply it uniformly?  You can go to zoompf.com (/2010report) now and check your shit!

Sam Ramji (Apigee) on Open APIs, Darwin’s Finches, and Success vs Extinction

Competitive pressures force adaptation, like Darwin’s finches.  Your API is your DNA now.  Web 2010 is about using glue APIs.  Siri is some thing that replicates open APIs.  They were bought by Apple and now all those APIs will be in Apple stuff.  Most traffic is coming on APIs now not pages.  It’s what has made Twitter mutate like finches.

Whew!  Superfast knowledge transfer.  After, we got beers with the Cloudkick guys, they are some pretty cool cats.  Next time… Day 2!
