Tag Archives: performance

Velocity 2010: Day 2 Keynotes

The Huddled Masses

Day 2 of Velocity 2010 starts off with a bang!  (It’s Day 1 for a lot of people, since Day 1 was just the optional workshops.)  Anyway, Steve Souders (repping performance) and Jesse Robbins (repping ops) took the stage and told us that there are more than 1000 people at the show this year, more than both previous years combined!  And sure enough, we’re stuffed past capacity into the ballroom and they have satellite rooms set up; the show is sold out and there’s a long waiting list.  The fire marshal is thrilled.  Peco and I are long-term Velocity alumni and have been to it every year – you can check out all the blog writeups from Velocity 2008 and 2009!  As always, my comments are in italics.

Jesse stripped down to show us the Velocity T-shirt, with the “fast by default” tagline on it.  I think we should get Ops representation on there too, and propose “easy by design.”   “fast by default/easy by design.”  Who’s with me?

Note that all these keynotes are being Webcast live and on demand so you can follow along!

Datacenter Revolution

The first speaker is James Hamilton of AWS on Datacenter Infrastructure Innovation.  There’s been a big surge in datacenter innovation, driven by the huge cloud providers (AWS, Google, Microsoft, Baidu, Yahoo!), green initiatives, etc. A fundamental change in the networking world is coming, and he’s going to talk about it.

Cloud computing will be affecting the picture fundamentally; driving DC stuff to the big guys and therefore driving professionalization and innovation.

Where does the money go in high scale infrastructure?  54% servers, 21% power distribution/cooling, 13% power, 8% networking, 5% other infrastructure.  Power stuff is basically 34% of the total and trending up.  Also, networking is high at 8% (19% of your total server cost).

So should you virtualize, crush the load onto fewer servers, and turn the others off?  The power draw is only 13% of the cost, so you’re still paying the other 87% for jack crap at that point.  Or should you find some kind of workload that pays more than the fractional cost of power?  Yes.  Hence, Amazon spot instances.  The closer you can get to a flat workload, the better it is for you, everyone, and the environment.

Also, keeping your utilization up is critical to making $ return.

In North America, 11% of all power is lost in distribution.  Each step is 1-2% inefficient (substation, transformer, UPS, to rack) and it all adds up.  And that’s not counting the actual server power supply (80% efficient) and on board voltage regulators (80% efficient though you can buy 95% efficient ones for a couple extra dollars).
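Rough back-of-envelope math on that chain (my own numbers, picking values in his 1-2% range for the distribution steps):

losses = {
    "substation": 0.99,
    "transformer": 0.985,
    "UPS": 0.985,
    "distribution to rack": 0.99,
    "server power supply": 0.80,
    "on-board voltage regulators": 0.80,
}
efficiency = 1.0
for step, eff in losses.items():
    efficiency *= eff
print(f"~{efficiency:.0%} of utility power actually reaches the silicon")  # ~61%

So roughly 40% of the power you buy never does any computing, and most of that is lost in the last two (cheap to fix) steps.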

Game consoles are more efficient computing resources than many data centers – they’ve solved the problem of operating in hot and suboptimal conditions.  There’s also potential for using more efficient cooling than traditional HVAC – like, you know, “open a window”.

Sea Change in Net Gear!  Network gear is oversubscribed.  ASIC vendors are becoming more competitive and innovating more – and Moore’s Law is starting to kick in there (as opposed to the ‘slow ass law’ that’s ruled for a while). Networking gear is one of the major hindrances to agility right now – you need to be able to put servers wherever you want in the datacenter, and that’s coming.

Speed Matters

The second keynote is Urs Hölzle from Google on “Speed Matters.”

Google, as they know and see all, know that the average load time of a Web page is 4.9 seconds and the average size is 320 kB.  Average user bandwidth is 1.8 Mbps.  Math says that load should take about 1.4 seconds – so what up?

Well, webpagetest.org shows you – it’s not raw data transfer, it’s page composition and render.  Besides being 320 kB, the average page has 44 resources, makes 7 DNS lookups, and doesn’t compress 1/3 of its content.

Google wants to make the Web faster.  First, the browser – Chrome!  It’s speed 20 vs Firefox at 10 vs IE8 at like 2.  They’re doing it as open source to spur innovation and competition (HTML5, DNS prefetch, VP8 codec, V8 JS engine).  So “here’s some nice open source you could adopt to make it better for your users, and if you don’t, here’s a reference browser using it that will beat your pants off.  Enjoy!”

TCP needs improvements too.  It was built for slow networks and nuke-level resiliency, not speed.  They have a tuning paper that shows some benefits – fast start, quick loss recovery makes Google stuff 12% faster (on real sites!).  And no handshake delay (app payload in SYN packets).

DNS needs to propagate the client IP in DNS requests to allow servers to better map to closest servers – when DNS requests go up the chain that info is lost.  Of course it’s a little Big Brother, too.

SSL is slow.  False start (cutting one round trip from the handshake) makes Android 10% faster.  Snap start and OCSP stapling are proposed improvements that avoid further round trips to the server and the CA.

HTTP itself, they promote SPDY.  Does header compression and other stuff, reduces packets by 40%.  It’s the round trips that kill you.

DNS needs to be faster too.  Enter Google’s Public DNS.  It’s not really that much data, so for them to load it into memory is no big deal.

And 1 Gbps broadband to everyone’s home!  Who’s with me?  100x improvement!

This is a good alignment of interests.  Everyone wants the Web to be faster when they use it, and obviously Google and others want it to be faster so you can consume their stuff/read their ads/give them your data faster and faster.

They are hosting for popular cross-Web files like jQuery, fonts, etc.  This improves caching on the client and frees up your servers.

For devs, they are trying to make tools for you.  Page Speed, Closure Compiler, Speed Tracer, Auto Spriter, Browserscope, Site Performance data.

“Speed is product feature #1.”  Speed affects search ranking now.  Go to code.google.com/speed and get your speed on!  Google can’t do it alone…  And slow performance is reducing your revenue (they have studies on that).

Are You Experienced?

Keynote 3 is From Browsers to Mobile Devices: The End User Experience Still Matters, by Vik Chaudhary (Keynote Systems, Inc.).  As usual, this is a Keynote product pitch.  I guess you have to pay the piper for that sponsorship. Anyway, they’re announcing a new version of MITE, their mobile version of KITE, the Keynote Internet Testing Environment.

Mobile!  It’s big!  Shakira says so! Mee mee mee mee!

Use MITE to do mobile testing; it’ll test and feed it into the MyKeynote portal.  You can see performance and availability for a site on  iPhone, Blackberry, Palm Pre, etc.  You can see waterfalls and screenshots!  That is, if you pay a bunch extra, at least that’s what we learned from our time as a Keynote customer…

MITE is pretty sexy though.  You can record a transaction on an emulated iPhone.  And analyze it.  Maybe I’m biased because I already know all this, and because we moved off Keynote to Gomez despite their frankly higher prices because their service and technology were better.  KITE was clever but always seemed to be more of a freebie lure-them-in gimmick than a usable part of a real Keynote customer’s work.

Now it’s break time!  I’ll be back with more coverage from Velocity 2010 in a bit!

Leave a comment

Filed under Conferences, DevOps

Velocity 2010: Scalable Internet Architectures

My first workshop is  Scalable Internet Architectures by Theo Schlossnagle, CEO of OmniTI.  He gave a nearly identical talk last year but I missed some of it, and it was really good, so I went!  (Robert from our Web Admin team attended as well.)

There aren’t many good books on scalability.  Mainly there are three – The Art of Scalability, Cal Henderson’s Building Scalable Web Sites, and his own Scalable Internet Architectures.  So any tips you can get a hold of are welcome.

Following are my notes from the talk; my own thoughts are in italics.

Architecture

What is architecture?  It encompasses everything from power up to the client touchpoint and everything in between.

Of necessity, people are specialized into specific disciplines but you have to overcome that to make a whole system make sense.

The new push towards devops (development/operations collaboration) tries to address this kind of problem.

Operations

Operations is a serious part of this, and it takes knowledge, tools, experience, and discipline.

Knowledge – Easy to get; the Internet, conferences (Velocity, Structure, Surge), user groups.

Tools – All tools are good; understand the tools you have.  Some of operations encourages hackiness because when there is a disruption, the goal is “make it stop as fast as possible.”

You have to know how to use tools like truss, strace, dtrace through previous practice before the outage comes.  Tools (and automation) can help you maintain discipline.

Experience comes from messing up and owning up.

Discipline is hardest.  It’s the single most lacking thing in our field. You have to become a craftsman. To learn discipline through experience, and through practice achieve excellence. You can’t be too timid and not take risks, or take risks you don’t understand.

It’s like my old “Web Admin Standing Orders” that tried to delineate this approach for  my ops guys – “1.  Make it happen.  2.  Don’t f*ck it up.  3.  There’s the right way, the wrong way, and the standard way.”  Take risks, but not dumb risks, and have discipline and tools.

He recommends the classic Zen and the Art of Motorcycle Maintenance for operations folks.  Cowboys and heroes burn out.  Embrace a Zen attitude.

Best Practices

  1. Version Control everything.  All tools are fine, but mainly it’s about knowing how to use it and using it correctly, whether it’s CVS or Subversion or git.
  2. Know Your Systems – Know what things look like normally so you have a point of comparison.  “Hey, there’s 100 database connections open!  That must be the problem!”  Maybe that’s normal.  Have a baseline (also helps you practice using the tools).  Your brain is the best pattern matcher.
    Don’t say “I don’t know” twice.  They wrote an open source tool called Reconnoiter that looks at data and graphs regressions and alerts on it (instead of cacti, nagios, and other time-consuming stuff).  Now available as SaaS!
  3. Management – Package rollout, machine management, provisioning. “You should use puppet or chef!  Get with the times and use declarative definition!”  Use the tools you like.  He uses kickstart and cfengine and he likes it just fine.

Dynamic Content

Our job is all about the dynamic content.  Static content – bah, use Akamai or CacheFly or Panther or whatever.  It’s a solved problem.

Premature optimization is the root of all evil – well, 97% of it.  It’s the other 3% that’s a bitch.  And you’re not smart enough to know where that 3% is.

Optimization means “don’t do work you don’t have to.”  Computational reuse and caching,  but don’t do it in the first place when possible.
He puts comments for things he decides not to optimize explaining the assumptions and why not.

Sometimes naive business decisions force insane implementations down the line; you need to re-check them.

Your content is not as dynamic as you think it is.  Use caching.

Technique – Static Element Caching

Applied YSlow optimizations – it’s all about the JavaScript, CSS, images.  Consolidate and optimize.  Make it all publicly cacheable with 10 year expiry.

RewriteRule (.*)\.([0-9]+)\.css $1.css maps /s/app.23412.css to /s/app.css – you get unique names, and bumping up the number in the template makes clients fetch a new cached copy.  Use “cat” to consolidate files, freaks!
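The template side of that trick is trivial – here’s a sketch of the kind of helper I’d use (Python-ish, names made up); the RewriteRule above strips the version back off on the server:

ASSET_VERSION = 23412  # bump this on deploy (or when the file changes) to bust client caches

def asset_url(filename):
    # "app.css" -> "/s/app.23412.css"; the rewrite rule maps it back to /s/app.css on disk
    base, ext = filename.rsplit(".", 1)
    return f"/s/{base}.{ASSET_VERSION}.{ext}"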

For images, put a changed image at a new URI – you can’t trust caches to really refresh.

Technique – Cookie Caching

Announcing a distributed database cache that always is near the user and is totally resilient!  It’s called cookies.  Sign it if you don’t want tampering.  Encrypt if you don’t want them to see its contents.  Done.  Put user preferences there and quit with the database lookups.
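A minimal sketch of the signing part (Python, with a hypothetical key – the idea is the same in any language):

import base64, hashlib, hmac, json

SECRET = b"not-this-in-production"  # hypothetical signing key

def encode_prefs(prefs):
    payload = base64.urlsafe_b64encode(json.dumps(prefs).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig  # drop this into the Set-Cookie value

def decode_prefs(cookie_value):
    payload, sig = cookie_value.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("cookie was tampered with")
    return json.loads(base64.urlsafe_b64decode(payload.encode()))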

Technique – Data Caching

Data caching.  Caching happens at a lot of layers.  Cache if you don’t have to be accurate, use a materialized view if you do.    Figuring out the state breakdown of your users?  Put it in a separate table at signup or state change time, don’t query all the time.  Do it from the app layer if you have to.
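A toy example of the “do it at write time” idea, using sqlite just to show the shape of it (my sketch, not his code):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, state TEXT)")
db.execute("CREATE TABLE users_by_state (state TEXT PRIMARY KEY, n INTEGER)")

def signup(user_id, state):
    # keep the rollup current at signup time instead of GROUP BY-ing the whole table per report
    db.execute("INSERT INTO users VALUES (?, ?)", (user_id, state))
    updated = db.execute("UPDATE users_by_state SET n = n + 1 WHERE state = ?", (state,))
    if updated.rowcount == 0:
        db.execute("INSERT INTO users_by_state VALUES (?, 1)", (state,))

signup(1, "TX"); signup(2, "TX"); signup(3, "CA")
print(db.execute("SELECT * FROM users_by_state ORDER BY state").fetchall())  # [('CA', 1), ('TX', 2)]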

Technique – Choosing Technologies

Understand how you’ll be writing and retrieving data – and how everyone else in the business will be too!  (Reports, BI, etc.)  You have to be technology agnostic and find the best fit for all the needs – business requirements as well as consistency, availability, recoverability, performance, stability.  That’s a place where NoSQL falls down.

Technique – Database

Shard your database then shoot yourself.  Horizontal scaling isn’t always better.  It will make your life hell, so scale vertically first.  If you have to, do it, and try not to have regrets.

Do try “files,” NoSQL, cookies, and other non-ACID alternatives because they scale more easily.  Keep stuff out of the DB where you can.

When you do shard, partition to where you don’t need more than one shard per OLTP question.  Example – private messaging system.  You can partition by recipient and then you can see your messages easily.  But once someone looks for messages they sent, you’re borked.  But you can just keep two copies!  Twice the storage but problem solved.  Searching cross-user messages, however, borks you.
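Here’s the two-copies idea in sketch form (Python, with a made-up shard layout) – one write lands in the recipient’s shard, one in the sender’s, so both inbox and outbox queries stay single-shard:

import hashlib

N_SHARDS = 8
shards = [[] for _ in range(N_SHARDS)]  # stand-ins for 8 separate databases

def shard_for(user_id):
    return int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % N_SHARDS

def send_message(sender, recipient, body):
    msg = {"from": sender, "to": recipient, "body": body}
    shards[shard_for(recipient)].append(("inbox", recipient, msg))   # their copy
    shards[shard_for(sender)].append(("outbox", sender, msg))        # your copy

def inbox(user_id):
    return [m for kind, owner, m in shards[shard_for(user_id)] if kind == "inbox" and owner == user_id]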

Don’t use multimaster replication.  It sucks – it’s not ready for prime time.  Outside ACID there are key-value stores, document databases, etc.  Eventual consistency helps.  MongoDB, Cassandra, Voldemort, Redis,  CouchDB – you will have some data loss with all of them.

NoSQL isn’t a cure-all; they’re not PCI compliant for example.  Shiny is not necessarily good.  Break up the problem and implement the KISS principle.  Of course you can’t get to the finish line with pure relational for large problems either – you have to use a mix; there is NO one size fits all for data management.

Keep in mind your restore-time and restore-point needs as well as ACID requirements of your data set.

Technique – Service Decoupling

One of the most fundamental techniques to scaling.  The theory is: do it asynchronously.  Why do it now if you can postpone it?  Break down the user transaction and determine what parts can be asynchronous.  Queue the info required to complete the task and process it behind the scenes.

It is hard, though, and is more about service isolation than postponing work.  The more you break down the problem into small parts, the more you have in terms of problem simplification, fault isolation, simplified design, decoupling approach, strategy, and tactics, simpler capacity planning, and more accurate performance modeling.  (Like SOA, but you know, that really works.)

One of my new mantras while building our cloud systems is “Sharing is the devil,” which is another way of stating “decouple heavily.”

Message queueing is an important part of this – you can use ActiveMQ, OpenAMQ, RabbitMQ (winner!).  STOMP sucks but is a universal protocol most everyone uses to talk to message queues.
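For the flavor of it, here’s what queueing a deferred task looks like with RabbitMQ from Python (using the pika client; the queue name and payload are my own invention):

import json
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = conn.channel()
channel.queue_declare(queue="welcome_email", durable=True)

# queue the work during the signup request and return to the user immediately;
# a separate worker process consumes "welcome_email" behind the scenes
channel.basic_publish(
    exchange="",
    routing_key="welcome_email",
    body=json.dumps({"user_id": 42, "template": "welcome"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
conn.close()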

Don’t decouple something small and simple, though.

Design & Implementation Techniques

Architecture and implementation are intrinsically tied, you can’t wholly separate them.  You can’t label a box “Database” and then just choose Voldemort or something.

Value accuracy over precision.

Make sure the “gods aren’t angry.”  The dtrace guy was running mpstat one day, and the columns didn’t line up.  The gods intended them to, so that’s your new problem instead of the original one!  OK, that’s a confusing anecdote.  A better one is “your Web servers are only handling 25 requests per second.”  It should be obvious the gods are angry.   There has to be something fundamentally wrong with the universe to make that true. That’s not a provisioning problem, that’s an engineering problem.

Develop a model.  A complete model is nearly impossible, but a good queue theory model is easy to understand and provides good insight on dependencies.

Draw it out, rationalize it.  When a user comes in to the site, what all happens?  You end up doing a lot of I/O ops.  Given expected traffic, you should then know roughly what each tier will have to bear.
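Even a toy M/M/1 model per tier tells you a lot – here’s the kind of napkin math he means (my numbers, assuming an 8ms service time per request on one app server):

service_time = 0.008           # seconds per request on one server (assumed)
mu = 1 / service_time          # ~125 req/s capacity

for arrival_rate in (50, 100, 120):
    utilization = arrival_rate / mu
    time_in_system = 1 / (mu - arrival_rate)   # M/M/1: W = 1 / (mu - lambda)
    print(f"{arrival_rate} req/s -> {utilization:.0%} busy, {time_in_system*1000:.0f} ms per request")

The point being that response time blows up well before you hit 100% utilization, which is exactly the “what will each tier bear” insight you’re after.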

Complexity is a problem – decoupling helps with it.

In the end…

Don’t be an idiot.  A lot of scalability problems are from being stupid somewhere.  High performance systems don’t have to scale as much.  Here’s one example of idiocy in three acts.

Act 1 – Amusing Error By Marketing Boneheads – sending a huge mailing with an URL that redirects. You just doubled your load, good deal.

Act 2 – Faulty Capacity Planning – you have 100k users now.  You try to plan for 10 million.  Don’t bother, plan only to 10x up, because you just don’t understand the problems you’ll have at that scale – a small margin of error will get multiplied.

Someone into agile operations might point out here that this is a way of stating the agile principle of “iterative development.”

Act 3 – The Traffic Spike – I plan on having a spike that gives me 3000 more visitors/second to a page with various CSS/JS/images.  I do loads of math and think that’s 5 machines worth.  Oh whoops I forgot to do every part of the math – the redirect issue from the amusing error above!  Suddenly there’s a huge amount more traffic and my pipe is saturated (Remember the Internet really works on packets and not bytes…) .

This shows a lot of trust in engineering math…  But isn’t this why testing was invented?  Whenever anyone shows me math and hasn’t tested it I tend to assume they’re full of it.

Come see him at Surge 2010!  It’s a new scalability and performance conference in Baltimore in late Sep/early Oct.

A new conference, interesting!  Is that code for “server side performance,” where Velocity kinda focuses on client side/front end a lot?

Leave a comment

Filed under Conferences, DevOps

Upcoming Free Velocity WebOps Web Conference

O’Reilly’s Velocity conference is the only generalized Web ops and performance conference out there.  We really like it; you can go to various other conferences and have 10-20% of the content useful to you as a Web Admin, or you can go here and have most of it be relevant!

They’ve been doing some interim freebie Web conferences and there’s one coming up.  Check it out.  They’ll be talking about performance functionality in Google Webmaster Tools, mySQL, Show Slow, provisioning tools, and dynaTrace’s new AJAX performance analysis tool.

O’Reilly Velocity Online Conference: “Speed and Stability”
Thursday, March 17; 9:00am PST
Cost: Free

Leave a comment

Filed under Conferences, DevOps

Velocity 2009 – Best Tidbits

Besides all the sessions, which were pretty good, a lot of the good info you get from conferences is by networking with other folks there and talking to vendors.  Here are some of my top-value takeaways.

Aptimize is a New Zealand-based company that has developed software to automatically do the most high value front end optimizations (image spriting, CSS/JS combination and minification, etc.).  We predict it’ll be big.  On a site like ours, going back and doing all this across hundreds of apps will never happen – we can engineer new ones and important ones better, but something like this which can benefit apps by the handful is great.

I got some good info from the MySpace people.  We’ve been talking about whether to run our back end as Linux/Apache/Java or Windows/IIS/.NET for some of our newer stuff.  In the first workshop, I was impressed when the guy asked who all runs .NET and only one guy raised his hand.   MySpace is one of the big .NET sites, but when I talked with them about what they felt the advantage was, they looked at each other and said “Well…  It was the most expeditious choice at the time…”  That’s damning with faint praise, so I asked about what they saw the main disadvantage being, and they cited remote administration – even with the new PowerShell stuff it’s just still not as easy as remote admin/CM of Linux.  That’s top of my list too, but often Microsoft apologists will say “You just don’t understand because you don’t run it…”  But apparently running it doesn’t necessarily sell you either.

Our friends from Opnet were there.  It was probably a tough show for them, as many of these shops are of the “I never pay for software” camp.  However, you end up wasting far more in skilled personnel time if you don’t have the right tools for the job.  We use the heck out of their Panorama tool – it pulls metrics from all tiers of your system, including deep in the JVM, and does dynamic baselining, correlation and deviation.  If all your programmers are 3l33t maybe you don’t need it, but if you’re unsurprised when one of them says “Uhhh… What’s a thread leak?” then it’s money.

ControlTier is nice, they’re a commercial open source CM tool for app deploys – it works at a higher level than chef/puppet, more like capistrano.

EngineYard was a really nice cloud provisioning solution (sits on top of Amazon or whatever).  The reality of cloud computing as provided by the base IaaS vendors isn’t really the “machines dynamically spinning up and down and automatically scaling your app” they say it is without something like this (or lots of custom work).  Their solution is, sadly, Rails only right now.  But it is slick, very close to the blue-sky vision of what cloud computing can enable.

And also, I joined the EFF!  Cyber rights now!

You can see most of the official proceedings from the conference (for free!):

1 Comment

Filed under Conferences, DevOps

Velocity 2009 – Monday Night

After a hearty trip to Gordon Biersch, Peco went to the Ignite battery of five minute presentations, which he said was very good.  I went to two Birds of a Feather sessions, which were not.  The first was a general cloud computing discussion which covered well-trod ground.  The second was by a hapless Sun guy on Olio and Faban.  No, you don’t need to know about them.  It was kinda painful, but I want to commend that Asian guy from Google for diplomatically continuing to try to guide the discussion into something coherent without just rolling over the Sun guy.  Props!

And then – we were lame and just turned in.  I’m getting old, can’t party every night like I used to.  (I don’t know what Peco’s excuse is!)

Leave a comment

Filed under Conferences, DevOps

Velocity 2009 – Scalable Internet Architectures

OK, I’ll be honest.  I started out attending “Metrics that Matter – Approaches to Managing High Performance Web Sites” (presentation available!) by Ben Rushlo, Keynote proserv.  I bailed after a half hour to the other one, not because the info in that one was bad but because I knew what he was covering and wanted to get the less familiar information from the other workshop.  Here’s my brief notes from his session:

  • Online apps are complex systems
  • A siloed approach of deciding to improve midtier vs CDN vs front end engineering results in suboptimal experience to the end user – have to take holistic view.  I totally agree with this, in our own caching project we took special care to do an analysis project first where we evaluated impact and benefit of each of these items not only in isolation but together so we’d know where we should expend effort.
  • Use top level/end user metrics, not system metrics, to measure performance.
  • There are other metrics that correlate to your performance – “key indicators.”
  • It’s hard to take low level metrics and take them “up” into a meaningful picture of user experience.

He’s covering good stuff but it’s nothing I don’t know.  We see the differences and benefits in point in time tools, Passive RUM, tagging RUM, synthetic monitoring, end user/last mile synthetic monitoring…  If you don’t, read the presentation, it’s good.  As for me, it’s off to the scaling session.

I hopped into this session a half hour late.  It’s Scalable Internet Architectures (again, go get the presentation) by Theo Schlossnagle, CEO of OmniTI and author of the similarly named book.

I like his talk, it starts by getting to the heart of what Web Operations – what we call “Web Admin” hereabouts – is.  It kinda confuses architecture and operations initially but maybe that’s because I came in late.

He talks about knowledge, tools, experience, and discipline, and mentions that discipline is the most lacking element in the field. Like him, I’m a “real engineer” who went into IT so I agree vigorously.

What specifically should you do?

  • Use version control
  • Monitor
  • Serve static content using a CDN, and behind that a reverse proxy and behind that peer based HA.  Distribute DNS for global distribution.
  • Dynamic content – now it’s time for optimization.

Optimizing Dynamic Content

Don’t pay to generate the same content twice – use caching.  Generate content only when things change and break the system into components so you can cache appropriately.

Example: a PHP news site – articles are in Oracle, personalization on each page, top new forum posts in a sidebar.

  • Why abuse Oracle by hitting it on every page view?  Updates are controlled.  The page should pull user prefs from a cookie.  (P.S. rewrite your query strings.)
  • It’s still slow to pull from the db vs hardcoding the page.  All blog software does this, for example: check for a hardcoded php page – if it’s not there, run something that puts it there, still dynamically putting in user personalization from the cookie.  (In the preso he provides details on how to do this; there’s a rough sketch of the idea below.)
  • Do cache invalidation on content change; use a message queuing system like OpenAMQ for async writes.
  • Now Apache is the bottleneck – use APC (Alternative PHP Cache) or memcached.  He says no timeouts!  Or… be careful about them!  Or something.
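The generate-on-miss sketch, translated out of PHP into Python pseudocode (file path and timings are made up):

import os, time

CACHE_FILE = "/tmp/front_page.html"   # the "hardcoded" page
MAX_AGE = 30                          # seconds before we rebuild it

def front_page(render_from_db, personalize_from_cookie, cookie):
    try:
        fresh = (time.time() - os.path.getmtime(CACHE_FILE)) < MAX_AGE
    except OSError:
        fresh = False
    if not fresh:
        html = render_from_db()            # the expensive Oracle-backed render, done rarely
        with open(CACHE_FILE, "w") as f:
            f.write(html)
    with open(CACHE_FILE) as f:
        html = f.read()
    return personalize_from_cookie(html, cookie)   # cheap per-request personalization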

Scaling Databases

1. shard them
2. shoot yourself

Sharding, or breaking your data up by range across many databases, means you throw away relational constraints and that’s sad.  Get over it.

You may not need relations – use files fool!  Or other options like couchdb, etc.  Or hadoop, from the previous workshop!

Vertically scale first by:

  • not hitting the damn db!
  • run a good db.  postgres!  not mySQL boo-yah!

When you have to go horizontal, partition right – more than one shard shouldn’t answer an oltp question.   If that’s not possible, consider duplication.

IM example.  Store messages sharded by recipient.  But then the sender wants to see them too and that’s an expensive operation – so just store them twice!!!

But if it’s not that simple, partitioning can hose you.

Do math and simulate it before you do it fool!   Be an engineer!

Multi-master replication doesn’t work right.  But it’s getting closer.

Networking

The network’s part of it, can’t forget it.

Of course if you’re using Ruby on Rails the network will never make your app suck more.  Heh, the random drive-by disses rile the crowd up.

A single machine can push a gig.  More isn’t hard with aggregated ports.  Apache too, serving static files.  Load balancers too.  How to get to 10 or 20 Gbps though?  All the drivers and firmware suck.  Buy an expensive LB?

Use routing.  It supports naive LB’ing.  Or routing protocol on front end cache/LBs talking to your edge router.  Use hashed routes upstream.  User caches use same IP.  Fault tolerant, distributed load, free.

Use isolation for floods.  Set up a surge net.  Route out based on MAC.  Used vs DDoSes.

Service Decoupling

One of the most overlooked techniques for scalable systems.  Why do now what you can postpone till later?

Break the transaction into parts.  Queue the info.  Process the queues behind the scenes.  Messaging!  There are different options – AMQP, Spread, JMS.  The most common protocol is STOMP – it sucks, but it’s universal.

Combine a queue and a job dispatcher to make this happen.  Side note – Gearman, while cool, doesn’t do this – it dispatches work but it doesn’t decouple action from outcome – should be used to scale work that can’t be decoupled.  (Yes it does, says dude in crowd.)

Scalability Problems

It often boils down to “don’t be an idiot.”  His words not mine.  I like this guy. Performance is easier than scaling.  Extremely high perf systems tend to be easier to scale because they don’t have to scale as much.

e.g. An email marketing campaign with an URL not ending in a trailing slash.  Guess what, you just doubled your hits.  Use the damn trailing slash to avoid 302s.

How do you stop everyone from being an idiot though?  Every person who sends a mass email from your company?  That’s our problem  – with more than fifty programmers and business people generating apps and content for our Web site, there is always a weakest link.

Caching should be controlled, not prevented, in nearly any circumstance.

Understand the problem.  Going from 100k to 10MM users – don’t just bucketize in small chunks and assume it will scale.  Allow for a margin of error.  Designing for 100x or 1000x requires a profound understanding of the problem.

Example – I plan for a traffic spike of 3000 new visitors/sec.  My page is about 300k.  CPU bound.  8ms service time.  Calculate servers needed.  If I varnish the static assets, the calculation says I need 3-4 machines.  But do the math and it’s on the order of 8 Gbps of throughput.  No way.  At 1.5MM packets/sec – the firewall dies.  You have to keep the whole system in mind.
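My own back-of-envelope version of that math, just to show where it falls over:

page_bytes = 300 * 1024        # ~300 kB page
reqs_per_sec = 3000

gbps = page_bytes * 8 * reqs_per_sec / 1e9
print(f"{gbps:.1f} Gbps of egress")            # ~7.4 Gbps - way past a single GigE pipe

packets_per_page = page_bytes / 1460           # rough TCP payload per packet
pps = packets_per_page * reqs_per_sec
print(f"{pps/1e3:.0f}k data packets/sec")      # ~630k/sec before ACKs and the 302 redirects

The server CPU math said 3-4 boxes; the network math says the pipe and the firewall die first.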

So spread out static resources across multiple datacenters, agg’d pipes.
The rest is only 350 Mbps, 75k packets per second, doable – except the 302 adds 50% overage in packets per sec.

Last bonus thought – use zfs/dtrace for dbs, so run them on solaris!

1 Comment

Filed under Conferences, DevOps

Velocity 2009 – Hadoop Operations: Managing Big Data Clusters

Hadoop Operations: Managing Big Data Clusters (see link on that page for preso) was given by Jeff Hammerbacher of Cloudera.

Other good references –
book: “Hadoop: The Definitive Guide”
preso: hadoop cluster management from USENIX 2009

Hadoop is an Apache project inspired by Google’s infrastructure; it’s software for programming warehouse-scale computers.

It has recently been split into three main subprojects – HDFS, MapReduce, and Hadoop Common – and sports an ecosystem of various smaller subprojects (hive, etc.).

Usually a hadoop cluster is a mess of stock 1 RU servers with 4x1TB SATA disks in them.  “I like my servers like I like my women – cheap and dirty,” Jeff did not say.

HDFS:

  • Pools servers into a single hierarchical namespace
  • It’s designed for large files, written once/read many times
  • It does checksumming, replication, compression
  • Access is from Java, C, command line, etc.  Not usually mounted at the OS level.

MapReduce:

  • Is a fault tolerant data layer and API for parallel data processing
  • Has a key/value pair model
  • Access is via Java, C++, streaming (for scripts), SQL (Hive), etc
  • Pushes work out to the data

Subprojects:

  • Avro (serialization)
  • HBase (like Google BigTable)
  • Hive (SQL interface)
  • Pig (language for dataflow programming)
  • zookeeper (coordination for distrib. systems)

Facebook used scribe (log aggregation tool) to pull a big wad of info into hadoop, published it out to mysql for user dash, to oracle rac for internal…
Yahoo! uses it too.

Sample projects hadoop would be good for – log/message warehouse, database archival store, search team projects (autocomplete), targeted web crawls…
As boxes you can use unused desktops, retired db servers, amazon ec2…

Tools they use to make hadoop include subversion/jira/ant/ivy/junit/hudson/javadoc/forrest
It uses an Apache 2.0 license

Good configs for hadoop:

  • use 7200 rpm sata, ecc ram, 1U servers
  • use linux, ext3 or maybe xfs filesystem, with noatime
  • JBOD disk config, no raid
  • java6_14+

To manage it –

unix utes: sar, iostat, iftop, vmstat, nfsstat, strace, dmesg, friends

java utes: jps, jstack, jconsole
Get the rpm!  http://www.cloudera.com/hadoop

config: my.cloudera.com
modes – standalone, pseudo-distrib, distrib
“It’s nice to use dsh, cfengine/puppet/bcfg2/chef for config managment across a cluster; maybe use scribe for centralized logging”

I love hearing what tools people are using, that’s mainly how I find out about new ones!

Common hadoop problems:

  • “It’s almost always DNS” – use hostnames
  • open ports
  • distrib ssh keys (expect)
  • write permissions
  • make sure you’re using all the disks
  • don’t share NFS mounts for large clusters
  • set JAVA_HOME to new jvm (stick to sun’s)

HDFS In Depth

1.  NameNode (master)
VERSION file shows data structs, filesystem image (in memory) and edit log (persisted) – if they change, painful upgrade

2.  Secondary NameNode (aka checkpoint node) – checkpoints the FS image and then truncates edit log, usually run on a sep node
New backup node in .21 removes need for NFS mount write for HA

3.  DataNode (workers)
stores data in local fs
stored data into blk_<id> files, round robins through dirs
heartbeat to namenode
raw socket to serve to client

4.  Client (Java HDFS lib)
other stuff (libhdfs) more unstable

hdfs operator utilities

  • safe mode – when it starts up
  • fsck – hadoop version
  • dfsadmin
  • block scanner – runs every 3 wks, has web interface
  • balancer – examines ratio of used to total capacity across the cluster
  • har (like tar) archive – bunch up smaller files
  • distcp – parallel copy utility (uses mapreduce) for big loads
  • quotas

has users, groups, permissions – including x, though there’s no execution; x is just used for dirs
hadoop has some access trust issues – used through gateway cluster or in trusted env
audit logs – turn on in log4j.properties

has loads of Web UIs – on namenode go to /metrics, /logLevel, /stacks
non-hdfs access – HDFS proxy to http, or thriftfs
has trash (.Trash in home dir) – turn it on

includes benchmarks – testdfsio, nnbench

Common HDFS problems

  • disk capacity, esp due to log file sizes – crank up reserved space
  • slow-but-not-dead disks, and flapping NICs dropping to a slower mode
  • checkpointing and backing up metadata – monitor that it happens hourly
  • losing write pipeline for long lived writes – redo every hour is recommended
  • upgrades
  • many small files

MapReduce

use Fair Share or Capacity scheduler
distributed cache
jobcontrol for ordering

Monitoring – They use ganglia, jconsole, nagios and canary jobs for functionality

Question – how much admin resource would you need for hadoop?  Answer – Facebook ops team had 20% of 2 guys hadooping, estimate you can use 1 person/100 nodes

He also notes that this preso and maybe more are on slideshare under “jhammerb.”

I thought this presentation was very complete and bad ass, and I may have some use cases that hadoop would be good for coming up!

Leave a comment

Filed under Conferences, DevOps

Velocity 2009 – Introduction to Managed Infrastructure with Puppet

Introduction to Managed Infrastructure with Puppet
by Luke Kanies, Reductive Labs

You can get the work files from git://github.com/reductivelabs/velocity_puppet_workshop_2009.git, and the presentation’s available here.

I saw Luke’s Puppet talk last year at Velocity 2008, but am more ready to start uptaking some conf management back home.  Our UNIX admins use cfengine, and puppet is supposed to be a better-newer cfengine.  Now there’s also an (allegedly) better-newer one called chef I read about lately.  So this should be interesting in helping to orient me to the space.  At lunch, we sat with Luke and found out that Reductive just got their second round funding and were quite happy, though got nervous and prickly when there was too much discussion of whether they were all buying Teslas now.  Congrats Reductive!

Now, to work along, you git the bundle and use it with puppet.  Luke assumes we all have laptops, all have git installed on our laptops, and know how to sync his bundle of goodness down.  And have puppet or can quickly install it.  Bah.  I reckon I’ll just follow along.

You can get puppet support via IRC, or the puppet-users google group.

First we exercise “ralsh”, the resource abstraction layer shell, which can interact with resources like packages, hosts, and users.  Check em, add em, modify em.

You define abstraction packages.  Like “ssh means ssh on debian, openssh on solaris…”  It requires less redundancy of config than cfengine.

“puppet”  consists of several executables – puppet, ralsh, puppetd, puppetmasterd, and puppetca.

As an aside, cft is a neat config file snapshot thing in red hat.

Anyway, you should use puppet rather than ralsh directly; the syntax is similar.  Here’s an example invocation:

puppet -e 'file { "/tmp/eh": ensure => present }'

There’s a file backup, or “bucket”, functionality when you change/delete files.

You make a repository and can either distribute it or run it all from a server.

There is reporting.

There’s a gepetto addon that helps you build a central repo.

A repo has (or should have) modules, which are basically functional groupings.  Modules have “code.”  The code can be a class definition.  init.pp is the top/special one.   There’s a modulepath setting for puppet.  Load the file, include the class, it runs all the stuff in the class.

It has “nodes” but he scoffs at them.  Put them in manifests/site.pp.  default, or hostname specific (can inherit default).   But you should use a different application, not puppet, to do this.

You have to be able to completely and correctly describe a task for puppet to do it.  This is a feature not a bug.

Puppet uses a client-server pull architecture.  You start a puppetmasterd on a server.  Stick with the SSL defaults, because that stuff is complicated and will hose you eventually.  Then start a puppetd on a client and it’ll pull changes from the server.

This is disjointed.  Sorry about that.  The session is really just reading the slide equivalent of man pages while flipping back and forth to a command prompt to run basic examples.  I don’t feel like this session gave enough of an intro to puppet, it was just “launch into the man pages and then run individual commands, many of which he tells you to never do.”  I don’t feel like I’m a lot more informed on puppet than when I started, which makes me sad.  I’m not sure what the target audience for this is.  If it’s people totally new to puppet, like me, it starts in the weeds too much.  If it’s for someone who has used puppet, it didn’t seem to have many pro tips or design considerations, it was basic command execution.  Anyway, he ran out of time and flipped through the last ten slides in as many seconds.  I’m out!

Leave a comment

Filed under DevOps

Velocity 2009 – Death of a Web Server

The first workshop on Monday morning was called Death of a Web Server: A Crisis in Caching.  The presentation itself is downloadable from that link, so follow along!  I took a lot of notes though because much of this was coding and testing, not pure presentation.  (As with all these session writeups, the presenter or other attendees are welcome to chime in and correct me!)  I will italicize my thoughts to differentiate them from the presenter’s.

It was given by Richard Campbell from Strangeloop Networks, which makes a hardware device that sits in front of and accelerates .NET sites.

Richard started by outing himself as a Microsoft guy.   He asks, “Who’s developing on the Microsoft stack?”  Only one hand goes up out of the hundreds of people in the room.  “Well, this whole demo is in MS, so strap in.”  Grumbling begins to either side of me.  I think that in the end, the talk has takeaway points useful to anyone, not just .NET folks, but it is a little off-putting to many.

“Scaling is about operations and development working hand in hand.”   We’ll hear this same refrain later from other folks, especially Facebook and Flickr.  If only developers weren’t all dirty hippies… 🙂

He has a hardware setup with a batch of cute lil’ AOpen boxes.  He has a four server farm in a rolly suitcase.  He starts up a load test machine, a web server, and a database; all IIS7, Visual Studio 2008.

We start with a MS reference app, a car classifieds site.  When you jack up the data set to about 10k rows – the developer says “it works fine on my machine.”  However, once you deploy it, not so much.

He makes a load test using MS Visual Studio 2008.  Really?  Yep – you can record and playback.  That’s a nice “for free” feature.  And it’s pretty nice, not super basic; it can simulate browsers and connection speeds.  He likes to run two kinds of load tests, and neither should be short.

  • Step load for 3-4 hrs to test to failure
  • Soak test for 24 hrs to hunt for memory leaks

What does IIS have for built-in instrumentation?  Perfmon.  We also get the full perfmon experience, where every time he restarts the test he has to remove and readd some metrics to get them to collect.  What metrics are the most important?

  • Requests/sec (ASP.NET applications) – your main metric of how much you’re serving
  • Requests queued (ASP.NET) – goes up when out of threads or garbage collecting
  • %processor time – to keep an eye on
  • #bytes in all heaps (.NET CLR memory) – also to keep an eye on

So we see pages served going down to 12/sec at 200 users in the step load, but the web server’s fine – the bottleneck is the db.  But “fix the db” is often not feasible.  We run ANTS to find the slow queries, and narrow it to one stored proc.  But we assume we can’t do anything about it.  So let’s look at caching.

You can cache in your code – he shows us, using _cachelockObject/HttpContext.Current.Cache.Get, a built in .NET cache class.

Say you have a 5s initial load but then caching makes subsequent hits fast.  But multiple first hits contend with each other, so you have to add cache locking.  There are subtle ways to do that right vs wrong.  A common best-practice pattern he shows is check, lock, check.
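The pattern he shows is .NET, but it’s the same everywhere; here’s the check, lock, check shape sketched in Python:

import threading

_cache = {}
_lock = threading.Lock()

def get_cached(key, build):
    value = _cache.get(key)        # 1. check without the lock (fast path for a warm cache)
    if value is not None:
        return value
    with _lock:                    # 2. lock so only one request does the expensive build
        value = _cache.get(key)    # 3. check again - someone may have built it while we waited
        if value is None:
            value = build()
            _cache[key] = value
    return value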

We run the load test again.  “If you do not see benefit of a change you make, TAKE THE CODE BACK OUT,” he notes.  Also, the harder part is the next steps, deciding how long to cache for, when to clear it.  And that’s hard and error-prone; content change based, time based…

Now we are able to get the app up to 700 users, 300 req/sec, and the web server CPU is almost pegged but not quite (prolly out of load test capacity).  Half second page response time.  Nice!  But it turns out that users don’t use this the way the load test does and they still say it’s slow.  What’s wrong?  We built code to the test.  Users are doing various things, not the one single (and easily cacheable) operation our test does.

You can take logs and run them through webtrace to generate sessions/scenarios.  But there’s not quite enough info in the logs to reproduce the hits.  You have to craft the requests more after that.

Now we make a load test with variety of different data (data driven load test w/parameter variation), running the same kinds of searches customers are.  Whoops, suddenly the web server cpu is low and we see steady queued requests.  200 req/sec.  Give it some time – caches build up for 45 mins, heap memory grows till it gets garbage collected.

As a side note, he says “We love Dell 1950s, and one of those should do 50-100 req per sec.”

How much memory “should” an app server consume for .NET?  Well, out of the gate, 4 GB RAM really = 3.3, then Windows and IIS want some…  In the end you’re left with less than 1 GB of usable heap on a 32-bit box.  Once you get to a certain level (about 800 MB), garbage collection panics.  You can set stuff to disposable in a crisis but that still generates problems when your cache suddenly flushes.

  • 64 bit OS w/4 GB yields 1.3 GB usable heap
  • 64 bit OS w/8 GB, app in 32-bit mode yields 4 GB usable heap (best case)

So now what?  Instrumentation; we need more visibility. He adds a Dictionary object to log how many times a given cache object gets used.  Just increment a counter on the key.  You can then log it, make a Web page to dump the dict on demand, etc.  These all affect performance however.
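In Python terms that instrumentation is just a counter wrapped around the cache lookups (again, my sketch of the idea, not his code):

from collections import Counter

cache_hits = Counter()

def instrumented_get(cache, key):
    cache_hits[key] += 1           # count every lookup per key
    return cache.get(key)

def top_keys(n=20):
    # dump this on an admin page or to a log to see what's actually being reused
    return cache_hits.most_common(n)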

They had a problem with an app w/intermittent deadlocks, and turned on profiling – then there were no deadlocks because of observer effect.  “Don’t turn it off!”  They altered the order of some things to change timing.

We run the instrumented version, and check stats to ensure that there’s no major change from the instrumentation itself.  Looking at the cache page – the app is caching a lot of content that’s not getting reused ever.  There are enough unique searches that they’re messing with the cache.  Looking into the logs and content items to determine why this is, there’s an advanced search that sets different price ranges etc.  You can do logic to try to exclude “uncachable” items from the cache.  This removes memory waste but doesn’t make the app any faster.

We try a new cache approach.  .NET caching has various options – duration and priority.  Short duration caching can be a good approach.  You get the majority of the benefit – even 30s of caching for something getting hit several times a second is nice.  So we switch from 90 minute to 30 second cache expiry to get better (more controlled) memory consumption.  This is with a “flat” time window – now, how about a sliding window that resets each time the content is hit?  Well, you get longer caching but then you get the “content changed” invalidation issue.

He asks a Microsoft code-stunned room about what stacks they do use instead of .NET, if there’s similar stuff there…  Speaking for ourselves, I know our programmers have custom implemented a cache like this in Java, and we also are looking at “front side” proxy caching.

Anyway, we still have our performance problem in the sample app.  Adding another Web server won’t help, as the bottleneck is still the db.  Often our fixes create new other problems (like caching vs memory).  And here we end – a little anticlimactically.

Class questions/comments:
What about multiserver caching?  So far this is read-only, and not synced across servers.  The default .NET cache is not all that smart.  MS is working on a new library called, ironically, “velocity” that looks a lot like memcached and will do cross-server caching.

What about read/write caching?  You can do asynchronous cache swapping for some things but it’s memory intensive.  Read-write caches are rarer- Oracle/Tangosol Coherence and Terracotta are the big boys there.

Root speed – At some point you also have to address the core query; if it takes 10 seconds, even caching can’t save you.  Prepopulating the cache can help but you have to remember invalidations, cache clearing events, etc.

Four step APM process:

  1. Diagnosis is most challenging part of performance optimization
  2. Use facts – instrument your application to know exactly what’s up
  3. Theorize probable cause then prove it
  4. Consider a variety of solutions

Peco has a bigger twelve-step more detailed APM process he should post about here sometime.

Another side note, sticky sessions suck…  Try not to use them ever.

What tools do people use?

  • Hand written log replayers
  • Spirent avalanche
  • wcat (MS tool, free)

I note that we use LoadRunner and a custom log replayer.  Sounds like everyone has to make custom log replayers, which is stupid, we’ve been telling every one of our suppliers in at all related fields to build one.  One guy records with a proxy then replays with ec2 instances and a tool called “siege” (by Joe Dog).  There’s more discussion on this point – everyone agrees we need someone to make this damn product.

“What about Ajax?”  Well, MS has a “fake” ajax that really does it all server side.  It makes for horrid performance.  Don’t use that.  Real ajax keeps the user entertained but the server does more work overall.

An ending quip repeating an earlier point – you should not be proud of 5 req/sec – 50-100 should be possible with a dynamic application.

And that’s the workshop.  A little microsofty but had some decent takeaways I thought.

Leave a comment

Filed under DevOps, Uncategorized

The Velocity 2009 Conference Experience

Velocity 2009 is well underway and going great!  Here’s my blow by blow of how it went down.

Peco, my erstwhile Bulgarian comrade, and I came in to San Jose  from Austin on Sunday.  We got situated at the fairly swank hotel, the Fairmont, and wandered out to find food.  There was some festival going on so the area was really hopping.  After a bit of wandering, we had a reasonably tasty dinner at Original Joe’s.  Then we walked around the cool pedestrian part of downtown San Jose and ended up watching “Terminator:  Salvation” at a neighborhood movie theater.

We went down at 8  AM the next morning for registration.  We saw good ol’ Steve Souders, and hooked up with a big crew from BazaarVoice, a local Austin startup that’s doing well.  (P.S. I don’t know who that hot brunette is in the lead image on their home page, but I can clearly tell that she wants me!)

This first day is an optional “workshop” day with a number of in depth 90 minute sessions.  There were two tracks, operations and performance.   Mostly I covered ops and Peco covered performance.  Next time – the first session!

Leave a comment

Filed under DevOps, Uncategorized