OK, now we’re to the final stretch of presentations for Day One.
“Cadillac or Nascar: A Non-Religious Investigation of Modern Web Technologies,” by Akara and Shanti from Sun.
Web20kit is a new reference architecture from Sun to evaluate modern Web technologies. It’s implemented in PHP, JavaEE, and Ruby. It’ll be open sourced in the fall.
It uses a web/app server – apache, glassfish, and mongrel – with a cache (memcached), a db (mySQL), an object store (NFS/MogileFS), a driver, and a geocoder. The sample app is a social event calendar with a good bit of AJAX frontend.
I apologize for any lack of coherence in this writeup, but I was at the back of the hall, the mic wasn’t turned up enough, and there were accents to drill through.
The Java EE implementation used servlets, JSPs, JPA, the Whalin memcached client, and localFS/NFS/distributed FS on GlassFish. PHP used UnixODBC/PDO, the PECL memcache client, and same storage on apache/lighttpd. The Rails framework isn’t using memcached or distributed FS yet. The AJAX used was JSON. (I have no idea what JAMM or AMMP are, I think he’s making them up.)
Admittedly, PHP is more of a language, RoR more of a framework, with Java in between, with increasing amounts of object-relational mapping provided, with the obvious pros & cons of any abstraction layer.
Damn, it’s hot in here. And packed. And I can’t understand half of what this guy is saying.
The lady comes up for the testing results. Ah, louder and more intelligible. For PHP, throughput vs. users grows linearly, but network usage outstrips CPU usage; the network is the bottleneck. For Java, scaling is linear with a single process. The Java Persistence API eased development with its O/R mapping and built-in caching. Rails testing is underway, but preliminary results are that Thin is better than Mongrel, JRuby is better than Ruby, and on Solaris the Ruby in Cool Stack 1.3 (Sun-compiled open source) gives a 40% improvement.
There are some memcached results in pretty graphs but I’m not clear what they mean. In fact, larger memcacheds were faster. There are performance issues in the client libraries – memcached server is good but clients don’t scale. For Java, Whalin and Spy both suck but Spy might be better. In PHP, the PECL client is most common but it’s unstable so people roll their own.
mySQL scaling is good and 5.1 is way ass better than 5.0. As in 75% CPU usage reduction.
Apache/PHP Tuning Tips – tune TCP time_wait, don’t load Apache modules you don’t need, tune ListenBacklog (8092), ServerLimit (2048), MaxClients (2048); for PHP, turn off safe mode and increase realpath_cache_size if you have lots of files.
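Those Apache numbers would land in httpd.conf roughly like this – a sketch using the talk’s suggested values, which you’d obviously tune for your own load:

```apache
# httpd.conf fragment – values straight from the talk, not universal truths
ListenBacklog 8092
ServerLimit   2048
MaxClients    2048
# And comment out modules you don't actually use, e.g.:
# LoadModule status_module modules/mod_status.so
```

The PHP side would be the matching php.ini lines, something like `safe_mode = Off` and a bumped `realpath_cache_size` (e.g. `256k`) if you have lots of files.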
Glassfish tuning – heap up to 3 GB for 32-bit, use the parallel GC, increase HTTP threads to 128, use EclipseLink rather than TopLink as your JPA provider, and run your web container in production mode (you have to redeploy when you change this).
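As JVM options (in GlassFish’s domain.xml or wherever you set them), the heap and GC advice would look roughly like this – a sketch of standard HotSpot flags with the talk’s numbers, not their exact config:

```
-Xmx3g               # ~3 GB heap, about the practical ceiling for a 32-bit JVM
-XX:+UseParallelGC   # throughput-oriented parallel collector
```

The HTTP thread count (128) is set separately in the GlassFish HTTP listener / thread-pool config.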
Memcached tuning – ensure network processing is distributed across CPUs. Bind memcached to CPUs not processing interrupts. Run memcached 1.2.5 with 4 threads and use 64-bit mode for large cache sizes (preferable to running more memcached processes). Needs horizontal scaling.
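A launch line for that advice might look like the following – a sketch, assuming a 64-bit memcached 1.2.5 build; the cache size (4 GB here) is my illustrative number, not one from the talk:

```shell
# One big 64-bit memcached with 4 worker threads rather than many small processes
# -t = worker threads, -m = cache size in MB, -d = daemonize, -u = run-as user
memcached -d -t 4 -m 4096 -u memcache
```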
mySQL Tuning – tune your queries. Use joins over subqueries (unlike Oracle), use LIMITs. InnoDB – avoid too-frequent log flushes (innodb_flush_log_at_trx_commit = 2). Use separate read/write DBs to avoid thrashing your query cache.
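The InnoDB flush setting goes in my.cnf; a minimal sketch of what that tip means in config form:

```ini
[mysqld]
# Flush the InnoDB log to disk roughly once a second instead of at every commit.
# Commits get much cheaper; the trade-off is you can lose up to ~1s of
# transactions on a crash.
innodb_flush_log_at_trx_commit = 2
```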
Conclusion – network is starting to be a bottleneck at 1 Gb. Use link aggregation to get around till 10 Gb is there.
Also note Faban, an open source benchmark development kit.
More in the Sun blogs.
I’m a little dubious about this whole thing – not sure what the goal is. A lot of performance has to do with how you code things, so I’m not sure you can do a fair compare between languages/frameworks by just coding a similar app in all of them… I think it’s more useful to compare sub-parts (so not Rails vs PHP, but Whalin vs Spy.)
Improving Netflix Performance, with Bill Scott. (Yes, from Netflix.) Rather than do a “big bang” rewrite, they put a good measurement framework in place and did incremental improvements. There’s a bunch of specific points you can measure, from the unload() of the previous page, to other events, to when something appears… They put these together into a bunch of measurement intervals. The slides are very interesting here.
They also correct for server vs client clocks. They segment by various metrics – browser, bandwidth, etc. They made a firebug panel to show these values.
Then they did analysis. They made some changes that “should” have been good, like changing images to css sprites, and that degraded performance due to old event handlers – they moved those and then it was good. The lesson is test and verify.
Gzip was a win, 13-25% user perf improvement and halved outbound network traffic. Scared the crap out of their network team as a result, heh.
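For reference, turning gzip on in Apache is a couple of lines of mod_deflate config – a generic sketch, not Netflix’s actual setup, which wasn’t shown:

```apache
# Compress the text-ish content types; images and video are already compressed
LoadModule deflate_module modules/mod_deflate.so
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
```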
They refactored the Netflix queue with mixed results. He has an interesting graph on browser speed – Safari fastest, IE slowest.
In conclusion – use YSlow optimizations but test them!
Next, it’s back to back new browser wars. Mike Connor from Mozilla about Firefox 3 and Christian Stockwell from Microsoft on IE8!
Firefox 3! They put in a lot of performance enhancements (and “human performance” enhancements). Goals of 3: safer, faster, better. It enforces plugin security more. Faster JS execution. The Awesomebar, aka the new location bar with typeahead, learning, and search. A new download manager. Poor guy’s nervous. OK, I’m bored.
IE8! Navigating to the top 100 sites in IE8 shows that most of the work is done in layout and rendering (70%) – less so in marshalling, DOM, and JScript, and very little in CSS and HTML. So they couldn’t just make “the JScript engine” or “the HTML render” faster. So they did work on the JScript engine, but also unblocked script downloads, increased the connection limit, reduced marshalling costs, and decreased memory usage. Tried to fix “known bad” issues like 1×1 transparent pngs and hover effects. Also, they have dev tools included in Beta 1 (unclear what this means). “Performance Analyzer Tools” as part of the SDK will give you the time-spent breakdown on your own site!
OK, at this point I am giving up on the small, hot room with bad sound and going back to the big ballroom. Which is fine, because it’s time for the Performance Metrics Panel!
John Rauser is moderating, with Peter Sevcik (NetForecast), Eric Goldsmith (AOL), Eric Schurman (Microsoft), and Vik Chaudhary (Keynote).
What metrics are best for end user experience? Use percentiles. The distribution of performance times is not a normal distribution – it has a long tail. Use the median rather than the mean in all cases. You need to see and capture the tail. How do you settle on a specific metric? Some are using a metric that’s a “munged together” combo of the percentiles, because you don’t want to miss effects – like doing something that benefits users at the 99th percentile but hoses those at the 20th.
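The mean-vs-median point is easy to see in ten lines of Python. The samples here are hypothetical page-load times (seconds) with the kind of long tail the panel is talking about:

```python
# Sketch: why median/percentiles beat the mean on long-tailed load times.
# Hypothetical samples – note the two slow outliers in the tail.
samples = sorted([0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 2.0, 9.0, 30.0])

def percentile(data, p):
    """Nearest-rank percentile of an already-sorted list."""
    k = max(0, min(len(data) - 1, round(p / 100 * len(data)) - 1))
    return data[k]

mean = sum(samples) / len(samples)
median = percentile(samples, 50)
p99 = percentile(samples, 99)
print(mean, median, p99)  # mean 4.88 is dragged way up by the tail; median is 1.2
```

Report only the mean and your “typical” user looks four times slower than they are; report only the median and you never see the 30-second disasters – hence the push to capture several percentiles.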
Some slides on Apdex, the application performance index (see apdex.org). You bucketize user experiences: those within a target time T are satisfied, those up to 4T are tolerating, and beyond that they’re frustrated. Apdex = (satisfied + (tolerating/2)) / total samples. But is this too simplistic? It’s certainly easy to calculate. But it isn’t very sensitive to changes, due to the bucketizing. Natty shirt guy (unclear which guy he is) prefers the munged percentiles to something as simplistic as Apdex. Oh, it’s Eric from Microsloth.
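The Apdex formula is simple enough to sketch directly; the sample times below are made up for illustration:

```python
# Apdex as described: <= T counts as satisfied, <= 4T as tolerating,
# anything slower contributes nothing to the score.
def apdex(samples, t):
    satisfied = sum(1 for s in samples if s <= t)
    tolerating = sum(1 for s in samples if t < s <= 4 * t)
    return (satisfied + tolerating / 2) / len(samples)

# Hypothetical: with T = 2s, that's 6 satisfied, 2 tolerating, 2 beyond 4T
times = [0.5, 1.0, 1.5, 1.8, 2.0, 2.0, 3.0, 7.9, 9.0, 20.0]
print(apdex(times, 2.0))  # (6 + 2/2) / 10 = 0.7
```

Note how the bucketizing loses information: a 2.1s response and a 7.9s response both score identically as “tolerating,” which is exactly the insensitivity complaint above.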
My personal feel is that one number’s not enough. Even our current NI SLA “2 seconds, global” is getting too simplistic. Apdex seems like more of a management “number to keep them quiet,” and that’s how they’re describing it too. Although Apdex guy makes a good point that the number’s portable across companies so you can use it to have discussions outside with advertisers etc. – but that begs the question of “T tampering.” Hell, SPEC benchmarks are BS and they’re a lot more rigorous.
There’s an entertaining discussion of the old “8 second rule” (useful standard! hogwash!) and the newer “2 second rule” (useful standard! hogwash!). There’s comments that “but lots of people don’t hit 2s, and Amazon didn’t used to hit 8s, so it must not be true” but I’m not sure that’s relevant.
Anyway, being a EE and having a token amount of statistics and visualization experience this whole discussion makes me sad. Peco leans over to me and says “If Tufte were here he’d slap these guys.” (Referring to Edward Tufte, author of The Visual Display of Quantitative Information – and he would.) Man, we need just one decent statistician and visualization guy to come to the Web performance world and set everybody right. Whenever I see something other than a simplistic line chart in the Web world I get a chubby. (Opnet Panorama, with its deviations and histograms, is about as good as it gets for ITers.)
The Keynote guy says there’s an iGoogle index widget that’ll show your numbers. He’s showing a nice Google Maps mashup too… I’m not sure where you go to see this.
Question: What’s the relationship between performance and availability? Should you commingle them? Well, everyone has some kind of timeout… Turning poor performance into an availability hit. Yeah, we had a problem with that once, Keynote changed their cutoff and we were in a frenzy of trying to figure out why the hell our perf numbers changed. There’s discussion of “well does it really matter, do ops people really care…” Hell yeah we do. Every tool we have has a timeout that hits availability. Whether you combo the metrics or not, the two are related – it’s dangerous to hide the relation.
Question (Peco!): What about errors? Well, 404s don’t count towards performance, 500s do. Not really satisfying… Shouldn’t things like 404 be somehow factored in?
I hear there’s an open bar. I’m off like a prom dress!
After some booze and a trip to the In-N-Out burger (My first! Double double, animal style!) was the traditional O’Reilly Ignite superfast presentation session.
- Animoto scaling from 8 to 3500 servers in 3.5 days using RightScale, an Amazon EC2 provisioning manager.
- Porn scaling! Gamelink does adult video hosting. Streaming servers are hard to cache. They host it locally; doing it on EC2/S3 etc. ends up costing a lot for network transfer. They move a TB per week – how do you get it there? Metro Ethernet in their case. And for storage – SANs are too expensive, etc. And CDNs shy away from adult content. They use Windows Media but are moving to Flash, though that’s copyable. The network is all gigabit backplane. Video streams reset when a stream fails. See also www.retina.net/tech.
- Freebase, a free creative commons database. Infochimps.org, Project Gutenberg, etc. Open data! Public.resource.org.
- Buy SAN, make profit. whitepages.com does 15MM searches/day. Nasty data, but read only, and little caching. Database sprawl threatened. Wanted to move to a SAN but heard expensive, tricky. iSCSI? They used EqualLogic iSCSI. Cheap and fast. 55% TCO win. Snapshots, replication.
- John Bryce of Mosso on rewriting the plane in flight – they moved from a management web app customers use to create stuff in a cloud to a distributed provisioning system. Staff was increasing faster than functionality. They planned for new provisioning, then a new panel, then new features. Fix things as they come up (don’t carry high-interest technical debt). Estimating a complete overhaul is hard. Refactor in each release. Release in parallel/beta sites. Fight for your users.
- merb. Rails is easy but sucks scalability-wise. Merb is easy and better for enterprise. Modular, easily tested, stable interface. Very similar to rails…
- Startup Metrics for Pirates! Focus on a small set of good conversion metrics. Big ass preso – check it out at the Slideshare link provided. Web 2.0 model – 1. Drive traffic. 3. Profit.
- Jos from RIPE NCC, a regional internet registry – the ARIN equivalent for Europe/Russia. IPv4 D-Day is in a couple years. IPv6 is not coming in time. But it needs to. Get your IPv6 shit together. There’s no first-mover incentive, so we need to create demand. Like free porn over IPv6. IPv6experiment.com!
That was interesting, and I got to hobnob with my favorite CEO cutie from slideshare.com. Now, to the Bat-Bed!