Well, I’m finally home with a spare minute to write. The two guys who went to the conference with me (Peco and Robert) and I got a lot out of it. I apologize for the terseness of the conference writeups – they were notes taken on a precariously balanced laptop, under bad network and power conditions, while I was also trying to get around and participate meaningfully in a very fast-paced event. I’ve gone back and tried to soften them a little, but there’s no rest for the wicked. You can access many of the slides for the sessions here.
The conference was quite a success. Everyone we spoke to was enthusiastic about the people and the information there. O’Reilly is happy because attendance was above their expectations, and it looks like it’s been expanded to three days next year, which is good – it was *so* session-packed and fast-paced that I didn’t get to talk to all the suppliers I wanted to in the dealer room, and at times it felt like the Bataan death march. The first day we barely had time to grab a fast-food dinner, and we often found ourselves hungry and hurrying. We enjoyed talking with the people there, but it seemed less conversational than other conventions – maybe because of the pace, maybe because half the attendees were from the area and needed to scamper off to work or home, and so weren’t into small talk.
We’ve reached the last couple sessions at Velocity 2008. Read me! Love me!
We hear about capacity planning from John Allspaw of Flickr. He says: no benchmarks! Use real production data. (How? “We had to develop a program called WebReplay to do this because no one had anything. We’re open sourcing it soon, stay tuned.”)
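WebReplay itself isn’t public yet, but the core idea – drive your tests with real production traffic rather than synthetic benchmarks – can be sketched in a few lines. This is a toy illustration, not Flickr’s tool: it pulls request paths out of an Apache combined-format access log and re-issues them as GETs against a test host (the staging URL is invented).

```python
# Toy sketch of "replay real production traffic": parse GET/HEAD request
# paths out of an Apache combined access log and re-issue them against a
# test host.  (This is NOT WebReplay -- just the idea in miniature.)
import re
import urllib.request

# Matches the request portion of a combined-format log line, e.g.
# "GET /index.html HTTP/1.1" -- capturing the path.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')

def replay(log_path, target="http://staging.example.com"):
    """Yield (url, result) for each GET/HEAD found in the log."""
    with open(log_path) as log:
        for line in log:
            m = LOG_LINE.search(line)
            if not m:
                continue  # skip POSTs and malformed lines
            url = target + m.group(1)
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    yield url, resp.status
            except OSError as err:
                yield url, err
```

A real replayer would also preserve inter-request timing and concurrency, which is where most of the hard work lives.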
Use “safety factors” (borrowed from traditional engineering) – a.k.a. a reserve, overhead, etc.
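The safety-factor idea reduces to simple arithmetic: size for measured peak load times a reserve multiplier, not for the bare peak. A minimal worked example (all numbers invented for illustration):

```python
# Provisioning with an engineering-style safety factor.
# All numbers here are made up for illustration.
import math

PEAK_RPS = 1200        # busiest requests/sec seen in real production traffic
PER_SERVER_RPS = 250   # measured capacity of one server under real traffic
SAFETY_FACTOR = 2.0    # the reserve: traffic spikes, host failures, growth

servers_needed = math.ceil(PEAK_RPS * SAFETY_FACTOR / PER_SERVER_RPS)
print(servers_needed)  # 10 servers, not the bare 5 the peak alone implies
```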
They use Squid a bunch. At NI we’ve been looking at Oracle’s WebCache – mainly because it supports ESI (Edge Side Includes), and we’re thinking that may be a good way to go. There’s a half-assed ESI plugin for Squid, but we hear it doesn’t work; apparently Zope paid for ESI support in Squid 3.0, but best as we can tell there’s been no traction on that in four years. Still, I’d be happy not to spend the money.
After a tasty pseudo-Asian hotel lunch (though about anything would be tasty by now!), we move into the final stretch of afternoon sessions for Velocity. Everyone seems in a good mood after the interesting demos in the morning and the general success of the conference.
First, it’s the eagerly awaited “Even Faster Web Sites.” Steve Souders, previously Yahoo!’s performance guru and now Google’s, has another set of recommendations for Web performance. His previous book, with its 14 rules, and YSlow, the Firebug plugin that supported it, are among the things that got us hooked deeply on the Web performance space.
First, he reviews why front-end performance is so important. In the steady state, 80–90% of the load time the user sees on an average page comes after the server has spit out the HTML – “network time.” Optimizing your code speed is therefore a smaller area of improvement than optimizing the front end. And the front end can be improved, often in simple ways.
Man, there’s a wide variance in how people’s pages perform with a primed cache – from no benefit (most of the Alexa top 10) to incredible benefit (Google and MS Live Search results pages). Anyway, Steve developed his original 14 best practices for optimizing front-end performance and then built YSlow to measure them.
Welcome to the second (and final) day of the new Velocity Web performance and operations conference! I’m here to bring you the finest in big-hotel-ballroom-fueled info and drama from the day.
In the meantime, Peco had met our old friend Alistair Croll, once of Coradiant and now freelance, blogging at “Bitcurrent.” Oh, and at the vendor expo yesterday we saw something exciting – an open source control-and-deployment app from a company called ControlTier. We have one in house, largely written by Peco, called “Monolith” – more for control (self-healing) and app deploys, which is why we don’t use cfengine or Puppet; those have very different use cases. Peco’s initial take is that ControlTier has all the features he’s implemented for Monolith plus all the ones on his list to implement, so we’re very intrigued.
We kick off with a video of base jumpers, just to get the adrenaline going. Then, a “quirkily humorous” video about Faceball.
Steve and Jesse kick us off again today and announce that the conference has more than 600 attendees, way above predictions! Sweet. And props to the program team: Artur Bergman (Wikia), Cal Henderson (Yahoo!), Jon Jenkins (Amazon), and Eric Schurman (Microsoft). Velocity 2009 is on! This makes us happy; we believe this niche – web admin, web systems, web operations, whatever you call it – is getting quite large and needs/deserves some targeted attention.
OK, now we’re to the final stretch of presentations for Day One.
“Cadillac or Nascar: A Non-Religious Investigation of Modern Web Technologies,” by Akara and Shanti from Sun.
Web20kit is a new reference architecture from Sun to evaluate modern Web technologies. It’s implemented in PHP, JavaEE, and Ruby. It’ll be open sourced in the fall.
It uses a web/app server – Apache, GlassFish, or Mongrel, depending on the language – with a cache (memcached), a database (MySQL), an object store (NFS/MogileFS), a load driver, and a geocoder service. The sample app is a social event calendar with a good bit of AJAX on the front end.
I apologize for any lack of coherence in this writeup, but I was at the back of the hall, the mike wasn’t turned up enough, and there were accents to drill through.
In the afternoon, we move into full session mode. There are two tracks, and I can only cover one, but that’s what I have Peco and Robert around for! Well, that and to have someone to outdrink. (Ooo, burn!) They’ll be posting their writeups at some point as well – you can go to the Velocity schedule page to see the other sessions and to the presentations page to get slides where they exist.
First afternoon session: my panel! I am on the “Measuring Performance” panel with Steve Souders, Ryan Breen of Gomez, Bill Scott of Netflix, and Scott Ruthfield of whitepages.com (a fellow Rice U/Lovetteer!). It went well. We talked about end user performance monitoring, the other kinds of tools you can use and their drawbacks, and “newfangled” monitoring of performance with AJAX, SOA, RIAs, etc. No questions from the audience; not sure if they liked it or not. But a number of people said “good work” afterwards, so I’ll declare victory. 🙂
“Actionable Logging for Smoother Operation and Faster Recovery,” by Mandi Walls of AOL. It’s a quick 30-minute session. Logging should be actionable – concise, expressing symptoms. Anything logged should be something fixable. Good logging should buy you less downtime via shorter time to resolution. Logging takes resources, so make it worth it.
Filter your logs down to be concise and actionable. Production logging has different goals from dev/QA logging: you’re looking for problem diagnosis and recovery first, then statistics and monitoring – insight into what the app’s doing.
You need a standard log file location. On our UNIX servers, the UNIX team gives us /opt/apps as the place where we can put stuff, and gets cranky about any files outside of it. We make everyone log to one place – /opt/apps/logs/<appname> – for this reason. It makes it easy to manage disk space, rotate logs, run “find”s, etc.
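A minimal sketch of what “concise, actionable, one standard place” can look like in code. In production the log directory would be /opt/apps/logs/&lt;appname&gt; as above; this sketch writes to a temp dir so it runs anywhere, and the app name “storefront” is invented:

```python
# Sketch of actionable production logging: one logger per app, one file
# per app in a standard location, a concise parseable format, and
# messages that state a fixable symptom.
import logging
import tempfile
from pathlib import Path

def get_app_logger(appname, logdir):
    """Build a file-backed logger for one app."""
    logger = logging.getLogger(appname)
    logger.setLevel(logging.WARNING)  # production: symptoms only, no debug chatter
    handler = logging.FileHandler(Path(logdir) / f"{appname}.log")
    # timestamp, severity, which app, what happened -- easy to grep and parse
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger

logdir = tempfile.mkdtemp()  # stand-in for /opt/apps/logs/storefront
log = get_app_logger("storefront", logdir)
# A symptom someone can act on, with enough detail to diagnose:
log.error("db pool exhausted (32/32 in use); new checkouts failing")
```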
Just two more keynotes till lunch, but these are longer ones (the previous speakers had 15 minutes apiece; these get 45). I’ll try to take good notes; every conference says they’ll make all the slides available afterwards, but at best they usually manage a 50% success rate.
First, Luiz Barroso of Google speaks on energy-efficient operations. Servers account for only about 1% of total electricity consumption, but that doubled between 2000 and 2005. Measuring computing energy efficiency is harder than measuring a refrigerator’s. In physics terms, efficiency is work done divided by energy used. For IT it can be broken down into computing efficiency (work done/chip energy), server efficiency (chip energy/server energy), and server room efficiency (server energy/server room energy). Surveys show an average PUE (the reciprocal of server room efficiency) of 1.83, and power supplies uselessly dissipate 25% of the power going to servers – more in PCs. Worse, servers have poor computing energy efficiency in their most common utilization range.
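The point of the three-term decomposition is that the terms multiply, so losses compound. A small worked example: the PUE of 1.83 and the 25% power-supply loss are from the talk, but the computing-efficiency figure is a number I made up purely to complete the arithmetic.

```python
# Overall efficiency is the product of the three terms from the talk.
# PUE 1.83 and the 25% supply loss are from the talk; the 0.40
# computing-efficiency figure is invented for illustration.
pue = 1.83
server_room_eff = 1 / pue   # server energy / server room energy, ~0.55
server_eff = 1 - 0.25       # chip energy / server energy (25% lost in the PSU)
computing_eff = 0.40        # work done / chip energy (made-up number)

overall = computing_eff * server_eff * server_room_eff
print(round(overall, 2))    # 0.16 -- roughly 1/6 of facility power does useful work
```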
How do we address this? First, the power provisioning problem in the data center. Energy isn’t the largest cost – building the center itself runs $10–$22 per watt, while ten years of power costs about $9/watt – but efficiency saves on both. According to the Uptime Institute, the average cost breakdown is datacenter 28%, electricity 22%, hardware 50%. (Software dwarfs all of this in many shops, I’ll note.)
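“Saves on both” because costs scale with provisioned watts, so every watt of efficiency cuts both the build line and the power line. A quick sketch with the per-watt figures from the talk (I picked a build cost near the middle of the $10–$22 range; the 1 MW facility and 20% saving are invented):

```python
# Why efficiency saves on both capital and power: both costs scale with
# provisioned watts.  Per-watt figures from the talk; build cost taken
# arbitrarily near the middle of the $10-$22 range.
build_per_watt = 16.0   # $/watt to build the datacenter
power_per_watt = 9.0    # $/watt of electricity over ten years

def ten_year_cost(watts):
    """Total ten-year cost (build + power) for a given provisioned load."""
    return watts * (build_per_watt + power_per_watt)

baseline = ten_year_cost(1_000_000)   # a hypothetical 1 MW facility
after = ten_year_cost(800_000)        # same work done at 20% fewer watts
print(baseline - after)               # $5,000,000 saved over ten years
```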