March 16th, 2007

Orcas Dogfood Upgrade – CPU Utilization

Brian Harry
Corporate Vice President

I think we’ve got enough data now that we can put a stake in the ground about where we stand on CPU utilization improvements.  We’ve still got a bit more tuning and improvement to do, but the numbers are probably within 10% of where they will end up.

We’ve made less progress investigating the regressions this week than I expected; there were too many other things going on.  Given that, I expect it will be another couple of weeks before we put it to bed.  That said, we did identify a significant issue in one of the usage patterns of QueryItems.  Although it was not a regression to start with, I expect it to go green once we apply the patch.  We have also fixed GetBuildUri.  It didn’t show up in the last post because there were no occurrences in the sample that I used to generate it, but previous samplings showed a significant regression.  Some progress, but not as much as I’d hoped.

On to the CPU utilization…

Because no two time periods are quite the same, any comparison is a little like apples to oranges.  The technique I have used is to average the CPU utilization from the week before the dogfood upgrade and from this week.  I then took this week’s CPU utilization and “normalized” it.  That means dividing it by the average number of requests per hour this week and multiplying by the average number of requests per hour in the earlier week.  This is the best attempt I can think of to make oranges look like apples.

So looking at this for the data tier (which as you will recall has always been our bottleneck), we get:

Effective CPU utilization this week: 20.85% * 134,454 / 180,020 = 15.57%

The previous week’s average CPU utilization was 28.82%.  So comparing them:

15.57% / 28.82% = 0.5404

In other words, overall Orcas uses about 46% fewer CPU cycles on the data tier to do the same amount of work as TFS 2005.  We’re pretty psyched about that.
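For anyone who wants to repeat the normalization on their own numbers, here is a minimal sketch in Python. The function and variable names are illustrative assumptions, not part of any TFS tooling; the inputs are the rounded data-tier figures quoted above, so the ratio comes out at roughly 0.54 and may differ in the last digit from a calculation on unrounded data.

```python
def normalized_cpu(cpu_pct, requests_per_hour, baseline_requests_per_hour):
    """Scale this week's average CPU % to the baseline week's request rate.

    Divides by this week's requests per hour and multiplies by the
    baseline week's requests per hour, as described in the post.
    """
    return cpu_pct * baseline_requests_per_hour / requests_per_hour


def relative_cpu(normalized_pct, baseline_pct):
    """Ratio of normalized CPU to baseline CPU (below 1.0 means less CPU)."""
    return normalized_pct / baseline_pct


# Data-tier figures from the post (hypothetical variable names).
dt_effective = normalized_cpu(
    cpu_pct=20.85,                       # this week's average CPU %
    requests_per_hour=180_020,           # this week's average requests/hour
    baseline_requests_per_hour=134_454,  # earlier week's average requests/hour
)
dt_ratio = relative_cpu(dt_effective, baseline_pct=28.82)

print(f"Effective DT CPU: {dt_effective:.2f}%")
print(f"Ratio to baseline: {dt_ratio:.2f}")
print(f"Reduction: {(1 - dt_ratio) * 100:.0f}%")
```

The same two functions apply to any tier as long as both weeks are measured the same way; the only assumption baked in is that CPU cost scales roughly linearly with request volume.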
Doing the same analysis for the application tier yields an effective CPU utilization of 14.85% compared to 24.90%, meaning the application tier uses about 40% less CPU to do the same amount of work.

You’ll remember that in our configuration (and in our general recommendation) the application tier has half the number of cores that our data tier has (4 for the AT and 8 for the DT).  And still the AT CPU utilization is less than the DT CPU utilization.  I had been a bit worried that all of the improvements in DT efficiency would mean we needed to change our guidance and start recommending balanced AT/DT pairs, but given what I see now, we are good to stick with our current guidance.

We are expecting to get the I/O analysis tonight so I’ll write about that as soon as I can.  It may very well be the middle of next week before I get to it because I’m traveling to San Francisco to give a talk at SD West on Monday.

Thanks,

Brian
