February 27th, 2015

Moving TFS to cloud cadence and Visual Studio Online

Buck Hodges
Director of Engineering

We get quite a few questions from customers on how we made the transition to shipping both an on-premises product and a cloud service. We moved from shipping every 2-3 years to shipping Visual Studio Online every three weeks and TFS every 3-4 months. You’ve probably seen the great set of vignettes at Scaling Agile across the Enterprise. It was also recently covered in a report from McKinsey and in an interview on Forbes.com. What’s missing is a deeper description of what changes we’ve made to how we work.

A couple of years ago, we wrote a document on how the team changed to meet the new demands of building a service. For anyone who wants to go deeper, it gives a lot more information on what we did and how we did it. I’ve cleaned up the document by converting our internal terminology, but it’s essentially unchanged from when it was written. Here is the summary from the document, and you will find the entire document attached to this blog post as a PDF.

The adoption of Scrum for TFS 2012 was driven by our desire to deliver experiences incrementally, incorporate customer feedback on completed work before starting new experiences, and work like our customers in order to build a great experience for teams using Scrum. We used team training, wikis, and a couple of pilot teams to start the adoption process.

We organized our work in four pillars (cloud, raving fans, agile, and feedback), each with a prioritized backlog of experiences. Teams progress through the backlog in priority order, working on a small number of experiences at any point in time. When starting an experience, teams break down the experience into user stories and meet with leadership for an experience review. Each three-week sprint starts with a kickoff email from each team, describing what will be built. At the end of the sprint, each team sends a completion email describing what was completed and produces a demo video of what they built. We hold feature team chats after every other sprint to understand each team’s challenges and plans, identify gaps, and make time for an interactive discussion. On a larger scale, we do ALM pillar reviews to ensure cohesive end-to-end scenarios.

With the first deployment of tfspreview.com in April 2011, we began our journey to cloud cadence. After starting with major and minor releases, we quickly realized that shipping frequently would reduce risk and increase agility. Our high-level planning for the service follows an 18-month roadmap and a six-month fall/spring release plan in alignment with Azure. To control disclosure, we use feature flags to determine which customers can access new features.
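To make the feature flag idea concrete, here is a minimal sketch in Python. It is not how the service implements flags; the class names, the flag name, and the account names are all hypothetical and chosen only for illustration. It shows the general pattern: a feature ships turned off, is enabled for selected accounts first, and is later enabled for everyone.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """A named flag that is off by default (hypothetical structure)."""

    name: str
    enabled_for_all: bool = False
    enabled_accounts: set[str] = field(default_factory=set)

    def is_enabled(self, account: str) -> bool:
        # A feature is visible if it is on globally or on for this account.
        return self.enabled_for_all or account in self.enabled_accounts


class FlagRegistry:
    """In-memory registry; a real service would persist this state."""

    def __init__(self) -> None:
        self._flags: dict[str, FeatureFlag] = {}

    def define(self, name: str) -> FeatureFlag:
        # Defining a flag twice returns the existing flag unchanged.
        return self._flags.setdefault(name, FeatureFlag(name))

    def enable_for(self, name: str, account: str) -> None:
        # Expose the feature to one account without affecting others.
        self._flags[name].enabled_accounts.add(account)

    def enable_for_all(self, name: str) -> None:
        # Full disclosure: every account now sees the feature.
        self._flags[name].enabled_for_all = True

    def is_enabled(self, name: str, account: str) -> bool:
        # Unknown flags are treated as off, so undisclosed code stays hidden.
        flag = self._flags.get(name)
        return flag is not None and flag.is_enabled(account)
```

With this sketch, code for a new feature can be deployed in a sprint update while a check such as `registry.is_enabled("new_backlog_view", account)` keeps it hidden from everyone except the accounts that have been opted in.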

Our engineering process emphasizes frequent checkins with as much of the team as possible working in one branch and using feature branches for disruptive work. We optimize for high-quality checkins with a gated checkin build-only system and a rolling self-test system that includes upgrade tests. During verification week, we deploy the sprint update to a “golden instance” that is similar to production. Finally, we ensure continuous investment in engineering initiatives through an engineering backlog.

We’ve made a number of changes in the last couple of years, and I’ll write about those in upcoming posts. The biggest changes have been in our engineering system and in our organizational structure. The TFS/VSO team now works in Visual Studio Online itself. When we moved into VSO, we also moved from TF version control to Git. We previously moved to Scrum in part so that we could build and use the experiences that Scrum teams need (and now Kanban teams as well). In the same way, we wanted to ensure we build a great experience for Git and also benefit from the workflows it enables, including being able to easily branch, do work, and commit the change back into master. TFS 2015 CTP1 and Team Explorer in VS 2015 CTP6 are the first releases we’ve made to the on-premises products from this new engineering system.

We’ve also changed the organizational structure by combining development and testing into a single engineering discipline. We made the change four months ago, and we are still learning.

Where we are now is also not where we want to be. For example, our architecture clearly shows our on-premises software origins, and we still have work to do including splitting out version control, build, work item tracking, and test case management into separate, independent services. At the same time, we need to collapse them back into a product that’s easy to run on-premises (service hooks, for example, is a separate service in VSO that is going to ship in TFS 2015). It’s an evolution – making changes to the product while continuing to ship valuable features for both the cloud and on-premises customers, all from a common code base owned by a single team (rather than separate cloud and on-premises teams).

I’m sure this post will create many questions, which will build a backlog of posts for me.

[Update March 12, 2015] Brian has written a new post about the future of Team Foundation Version Control.

Follow me on Twitter at twitter.com/tfsbuck

Moving TFS to Cloud Cadence.pdf
