{"id":9811,"date":"2007-02-04T09:01:51","date_gmt":"2007-02-04T09:01:51","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/bharry\/2007\/02\/04\/managing-quality-part-2-automated-testing\/"},"modified":"2018-08-14T00:34:19","modified_gmt":"2018-08-14T00:34:19","slug":"managing-quality-part-2-automated-testing","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/bharry\/managing-quality-part-2-automated-testing\/","title":{"rendered":"Managing Quality (part 2) &#8211; Automated Testing"},"content":{"rendered":"<p>As I described in my last article, our first layer of quality assessment is build &#8220;scouting&#8221;, designed to do a shallow but broad pass across the product to determine if it&#8217;s worthwhile to proceed with deeper testing.<\/p>\n<p>Our next layer is more thorough automated testing runs.&nbsp; Many would call this &#8220;functional&#8221; testing but words in the testing space get so overloaded.&nbsp; It includes API level tests (which some may call Unit tests), UI automation tests and some&nbsp;broader &#8220;system integration tests&#8221;.<\/p>\n<p>We divide our automation tests at this level into two categories &#8211; Nightly Automation Runs (NARs) and Full Automation Runs (FARs).<\/p>\n<p>As their name suggests, NARs are designed to be run every day, on every build that is not SelfToast (see part 1 for a description of this term).&nbsp; Today we have on the order of 3,000 NAR tests.&nbsp; In our test planning exercise (when we design the test cases we want to run and choose what will be automated) we prioritize our test cases (1, 2, 3).&nbsp; The NARs are generally selected from the Pri 1 test cases and chosen so that they can be run and the results analyzed within a few hours (important if you are going to do this every day).<\/p>\n<p>FARs are the sum of all automated tests that we have.&nbsp; For TFS, this amounts to something on the order of 20,000 tests today.&nbsp; A FAR run takes about a week to run and 
analyze the results.&nbsp; As a result, we don&#8217;t start doing them until later in the product cycle and we run them less frequently &#8211; the closer to the end we get, the more frequently we run them.&nbsp; Right now, I think we are running them every 2 or 3 weeks.<\/p>\n<p>For completeness, beyond FARs, we have what we call a Full Test Pass (FTP), which is a period of time in which we run multiple NAR and FAR runs on a cross section of our test matrix (the subject of a future Managing Quality post) and run our manual test cases.&nbsp; Last I checked, we had about 10,000 manual test cases on top of the 20,000 FAR cases.&nbsp; A Full Test Pass takes somewhere from 2 to 4 weeks.<\/p>\n<p>So, with that background, on to the reports&#8230;&nbsp; Here&#8217;s a recent NAR trend report:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/image%7B0%7D%5B9%5D.png\"><img decoding=\"async\" style=\"border-right: 0px;border-top: 0px;border-left: 0px;border-bottom: 0px\" height=\"402\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/image%7B0%7D_thumb%5B7%5D.png\" width=\"720\" border=\"0\"><\/a> <\/p>\n<p>As you can see, we break down each result by the cause of the failure.&nbsp; These warrant a little discussion:<\/p>\n<ul>\n<li><strong>Initial Pass Rate<\/strong> &#8211; The test passed the first time it was run.<\/li>\n<li><strong>Final Pass Rate<\/strong> &#8211; When tests fail, we &#8220;analyze&#8221; the run and, for some of them, we are able to tweak something about the test and re-run it.&nbsp; Those that pass the second time are marked as &#8220;Final Pass Rate&#8221;.&nbsp; Over time this should go to zero as all tests should pass on first run but when the code is churning, it&#8217;s not uncommon to need to tweak tests to keep them up to date.<\/li>\n<li><strong>Product issue<\/strong> &#8211; This is what you&#8217;d expect &#8211; it&#8217;s a failure in the product and 
results in a &#8220;product bug report&#8221;.<\/li>\n<li><strong>Test Issue<\/strong> &#8211; Believe it or not, test code can have bugs too.&nbsp; These are failures in tests that can&#8217;t be fixed by tweaks and require significant work in the test itself.&nbsp; They can result from changes in the product or just improperly written test code.<\/li>\n<li><strong>Other Issue<\/strong> &#8211; Anything else.&nbsp; These might be test infrastructure issues, lab network issues, etc.<\/li>\n<\/ul>\n<p>Sidebar &#8211; I think I&#8217;ve described this before but since I&#8217;m showing a bunch of build numbers, let me tell you about them again.&nbsp; 20108.00 is a build number.&nbsp; The format is YMMDD.NN.&nbsp; Years is somewhat arbitrary but increases each year that the project is underway &#8211; we are using 2 for Orcas because this is the second calendar year (we started on it in &#8217;06).&nbsp; 0108 is January 8th.&nbsp; NN represents the number of rebuilds of this &#8220;build&#8221;.&nbsp; Mostly this is used when we branch a build for a release and we freeze the main part of the build number and only increment NN.&nbsp; During the main phase of development, it&#8217;s pretty much always 00.<\/p>\n<p>Here&#8217;s a FAR report.&nbsp; It looks pretty much the same:<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/clip_image001%5B2%5D%5B7%5D.jpg\"><img decoding=\"async\" style=\"border-right: 0px;border-top: 0px;border-left: 0px;border-bottom: 0px\" height=\"357\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/clip_image001%5B2%5D_thumb%5B2%5D.jpg\" width=\"640\" border=\"0\"><\/a><\/p>\n<p>This report is only showing 2 FAR runs because we&#8217;ve really just started getting going with FAR runs.&nbsp; We are dusting them off and getting them running again on the Orcas code base.&nbsp; You&#8217;ll notice the gap between these two runs is a little over 2 
weeks.<\/p>\n<p>&nbsp;<\/p>\n<p>We also produce more detailed test result reports to drill into specific feature areas like this:<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/image%7B0%7D%5B14%5D.png\"><img decoding=\"async\" style=\"border-right: 0px;border-top: 0px;border-left: 0px;border-bottom: 0px\" height=\"429\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/8\/2019\/02\/image%7B0%7D_thumb%5B8%5D.png\" width=\"640\" border=\"0\"><\/a> <\/p>\n<p>The wide variation in the number of scenarios tends to come from how granularly different feature teams break up their tests.&nbsp; As you&#8217;ll see in a future post, the code coverage metrics for all feature areas are about the same.<\/p>\n<p>Looking at all of this data from a project management perspective&#8230;&nbsp; In addition to the build quality data in part 1, these form a very important part of telling how the quality of the product is progressing.&nbsp; In my experience, these numbers rise and fall throughout the product cycle and tend to hit their low point right around our &#8220;code complete&#8221; deadline &#8211; that&#8217;s when developers are feeling pressure to get their functionality done and quality tends to suffer a bit.&nbsp; Then we move into a stabilization phase and the numbers trend up.&nbsp; They need to be in the mid-to-high 90s to have a Beta quality release.&nbsp; We generally never achieve more than about a 99% pass rate because there are inevitably some tests which fail for very minor reasons and we decide they do not materially impact the quality of the product.<\/p>\n<p>This product cycle we&#8217;ve made some pretty significant changes to our macro level project management.&nbsp; After I get through this quality series, perhaps I&#8217;ll talk a bit about that and the impact it&#8217;s had on how we manage quality.<\/p>\n<p>Hopefully this continuing thread is useful to you.&nbsp; Let me 
know if not, or if you would like me to focus on certain aspects.<\/p>\n<p>Brian<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As I described in my last article, our first layer of quality assessment is build &#8220;scouting&#8221;, designed to do a shallow but broad pass across the product to determine if it&#8217;s worthwhile to proceed with deeper testing. Our next layer is more thorough automated testing runs.&nbsp; Many would call this &#8220;functional&#8221; testing but words in [&hellip;]<\/p>\n","protected":false},"author":244,"featured_media":14617,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1],"tags":[5],"class_list":["post-9811","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized","tag-tfs"],"acf":[],"blog_post_summary":"<p>As I described in my last article, our first layer of quality assessment is build &#8220;scouting&#8221;, designed to do a shallow but broad pass across the product to determine if it&#8217;s worthwhile to proceed with deeper testing. 
Our next layer is more thorough automated testing runs.&nbsp; Many would call this &#8220;functional&#8221; testing but words in [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts\/9811","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/users\/244"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/comments?post=9811"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/posts\/9811\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/media\/14617"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/media?parent=9811"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/categories?post=9811"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/bharry\/wp-json\/wp\/v2\/tags?post=9811"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}