{"id":26445,"date":"2018-07-20T00:53:00","date_gmt":"2018-07-20T00:53:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/premier_developer\/?p=26445"},"modified":"2019-02-14T20:17:57","modified_gmt":"2019-02-15T03:17:57","slug":"%e2%80%8b%e2%80%8bdevops-stories-interview-with-john-daniel-trask-of-raygun%e2%80%8b","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/premier-developer\/%e2%80%8b%e2%80%8bdevops-stories-interview-with-john-daniel-trask-of-raygun%e2%80%8b\/","title":{"rendered":"\u200b\u200bDevOps Stories \u2013 Interview with John-Daniel Trask of Raygun\u200b"},"content":{"rendered":"<p>App Dev Manager <a href=\"https:\/\/www.linkedin.com\/in\/rogueagile\/\">Dave Harrison<\/a> talks with <a href=\"https:\/\/www.linkedin.com\/in\/jotrask\/\">John-Daniel Trask<\/a>, co-founder and CEO of Raygun, about the adoption of DevOps.<\/p>\n<hr \/>\n<p>The following content is shared from an interview with <a href=\"https:\/\/www.linkedin.com\/in\/jotrask\/\">John-Daniel Trask<\/a>, co-founder and CEO of Raygun, a New Zealand-based company that specializes in error, crash, and performance monitoring. 
John-Daniel (or JD) went from repairing laptops out of college, to working as a developer, to finally starting several very successful businesses, including what became Mindscape and its flagship monitoring product, Raygun.<\/p>\n<p>We covered a lot of ground here, and we think you\u2019ll love the following thoughts:<\/p>\n<ul>\n<li>Is a DevOps team really such a bad thing?<\/li>\n<li>Why forcing your devs to go to an event booth might be a very good thing<\/li>\n<li>When is a \u201crequirement\u201d not really a requirement?<\/li>\n<li>Starting from scratch, with nothing &#8211; where would you start?<\/li>\n<li>What\u2019s the golden ticket to get funding and support for your requests and projects?<\/li>\n<\/ul>\n<p>And last but not least \u2013 \u201cit\u2019s not the big that eat the small, it\u2019s the fast that eat the slow!\u201d<\/p>\n<p>Note &#8211; these and other interviews and case studies will form the backbone of the upcoming book \u201cAchieving DevOps\u201d from Apress, due out in late 2018. Please <a href=\"https:\/\/www.linkedin.com\/in\/rogueagile\/\">contact me<\/a> if you\u2019d like an advance copy!<\/p>\n<p><strong><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/johndanieltrask1.jpg\"><img decoding=\"async\" style=\"margin: 0px 10px 0px 0px;float: left\" title=\"johndanieltrask\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/johndanieltrask_thumb1.jpg\" alt=\"johndanieltrask\" width=\"271\" height=\"271\" align=\"left\" border=\"0\" \/><\/a>Is DevOps culture first?<\/strong> Well, I definitely run into a lot of zealots who swing to one side or the other. Some people pound the table and say that DevOps has nothing to do with tools, that it\u2019s all culture and fluffy stuff. These are usually the same people who think a DevOps team is an absolute abomination. 
Others say it\u2019s all about automation and tooling.<\/p>\n<p>Personally, I&#8217;m not black and white on it. I don&#8217;t think you can go and buy DevOps in a box; I also don&#8217;t think that \u201cas long as we share the same psychology, we&#8217;ve solved DevOps.\u201d Let\u2019s take the whole idea of a DevOps team being an antipattern, for example. For us it\u2019s not that simple \u2013 it\u2019s very easy, at a 16-person startup, to say that a DevOps team is a horrible idea. Well, of COURSE you\u2019d think that; for you, cross-team communication is as easy as turning around in your chair! But let\u2019s take a larger enterprise, 50,000 people or so, with hundreds of engineering teams. You can\u2019t just hand down \u201cwe\u2019re doing DevOps\u201d as an edict and it\u2019s solved. In that case, I have seen a DevOps team be very successful as a template, something that helps spread the good word by example.<\/p>\n<p>What\u2019s a common blind spot you see with many programmers? It\u2019s really quite shocking how little empathy most software engineers have for their actual end users. You would think the stereotypical heads-down programmer would be a dinosaur, last of a dying breed, but it\u2019s still a very entrenched mindset. I sometimes joke that for most software engineers, you can measure their entire world as being the distance from the back of their head to the front of their monitor. There\u2019s a lack of awareness and even care about things like software breaking for your users, or a slow loading feed. No, what we care about is \u2013 how beautiful is this service that I\u2019ve written, look how cool this algorithm is that I wrote.<\/p>\n<p>We sometimes forget that it all comes down to human beings. If you don\u2019t think about that first and foremost, you\u2019re really starting off on the wrong foot.<\/p>\n<p>One of the things I like about Amazon is the mechanisms they have to put their people closer to the customer experience. 
We try to drive that at Raygun too. We often have to drag developers to events where we have a booth. Well, once they\u2019re there, the most amazing thing happens \u2013 we have a handful of customers come by and they start sharing about how amazing they think the product is. So you start to see them puff out their chests a little \u2013 life is good! And the customers start sharing a few things they\u2019d like to see \u2013 and you see the engineers start nodding their heads and thinking a little. We find those engineers come back with a completely different way of solving problems, where they\u2019re thinking holistically about the product, about the long term impact of the changes they\u2019re making. Unfortunately, the default behavior is still to avoid that kind of engagement, it\u2019s still way out of our comfort zone.<\/p>\n<p>Using Personas to Weed Out Red Herrings: I don\u2019t know if we talk enough in our industry about weeding out bad feedback. We often get requests from our customers to do things like dropping a data grid with RegEx on a page. That\u2019s the kind of request that comes from the nerdiest of the nerds \u2013 and if we were to take that seriously, think of the opportunity cost and what it would do to our own UX!<\/p>\n<p>We weed out outlier requests like this by using personas. For our application, we think in terms of either a CEO, a tech lead, or an operator. Each has their own persona and backstory, and we\u2019ve thought out their story end to end and how they want to work with our software. So for the CXO level, the VP\u2019s, the directors \u2013 these are people who understand their whole business hinges on the quality of their software. They need to keep this top of mind at the very top levels of decision making. So for this person, there are graphs and charts showing this strategic level fault and UX information, all ready to drop into reports to the executive board. 
Then there\u2019s the mid tier \u2013 these are your tech leads, the Director of Engineering \u2013 they need to know both high level strategic 30K foot information, and a summary of key issues. The cutting edge though is that third tier, your developer or operator. This person needs to have information when something goes bump in the night. So for them, you have stack traces, profiling raw data, user request waterfalls. Without that information, troubleshooting becomes totally a stab in the dark.<\/p>\n<p>Lots of companies use personas, I know. They\u2019re really critical to filter out noise and focus on a clear story that will thrill your true user base.<\/p>\n<p>How can error and crash reporting make for a better performing business? And yet, most of the DevOps literature and thinking I see focuses entirely on build pipelines, platform automation, the deployment story, and that\u2019s the end of it. Monitoring and checking your application\u2019s real-world performance and correcting faults usually just gets a token mention, very late in the game. But after you deploy, the story is just beginning!<\/p>\n<p>I hate to say this \u2013 but I think we\u2019re still way behind the times when it comes to having true empathy with our end users. It\u2019s surprising how entrenched that mindset of monitoring being an afterthought or a bolt-on can be. Sometimes we\u2019ll meet with customers and they\u2019ll say that they just aren\u2019t using any kind of monitoring, that it\u2019s not useful for them. And we show them that they\u2019re having almost 200,000 errors a day \u2013 impacting 25,000 users each day with a bad experience. It\u2019s always a much, much larger number than they were expecting \u2013 by a factor of 10 sometimes. Yet somehow, they\u2019ve decided that this isn\u2019t something they should care about. A lot of these companies have great ideas that their customers love \u2013 but because the app crashes nonstop, or is flaky, it strangles them. 
You almost get the thinking that a lot of people would really rather not know how many problems there really are with what they\u2019re building.<\/p>\n<p>Yet time and again, we see companies that really care about their customers excel. Let\u2019s say I take you back in time to 2008, and I give you $10,000 to invest in any company you want. Are you going to put that into Microsoft, Apple, Google, or Dominos Pizza? Well guess what \u2013 Dominos has kicked the crap out of all those big tech companies with their market cap growth rate. The answer is in their DNA \u2013 they devote all their attention into ensuring their customers have a great experience. Their online ordering experience is second to none. And that all comes from them being customer obsessive, paying attention to finding where that experience is subpar and fixing it. It\u2019s never a coincidence that customer centric companies consistently outperform and dominate.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/dominos.png\"><img decoding=\"async\" title=\"dominos\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/dominos_thumb.png\" alt=\"dominos\" width=\"644\" height=\"380\" border=\"0\" \/><\/a><\/p>\n<p>Source: <a href=\"https:\/\/www.theatlas.com\/charts\/S18QCJyhe\">https:\/\/www.theatlas.com\/charts\/S18QCJyhe<\/a><\/p>\n<p>What\u2019s forced us as an industry to change and driven a better user experience is Google, believe it or not. They started publishing a lot of research and data around application errors and performance, and prioritizing well performing sites. This democratized things that data scientists were just starting to figure out themselves. And it seemed like overnight, a lot of people cared very much that their website not be dog slow \u2013 because otherwise, it wouldn\u2019t be on the first page results of a web search, and their sales would tank. 
But folks often didn\u2019t care about performance or the end user experience \u2013 until Google forced us to.<\/p>\n<p>What would you say to the company that is starting from ground zero when it comes to DevOps? I\u2019m picturing here a shop where they take ZIP files and remote desktop onto VMs and copy-paste their deployments. If that\u2019s the case \u2013 I like to talk about the small things you could put into place that would dramatically improve the quality of life on the team. These are big impact, low cost type improvements. So where would I start?<\/p>\n<ul>\n<li>First would come automating the deployments. Just in reliability alone, that\u2019s a huge win. Suddenly I have real peace of mind. I can roll out releases and roll them back with a single button push, and it\u2019s totally repeatable as a process. If I\u2019m an on-call engineer, being able to roll out a patch through a deployment process that runs automatically at 3 a.m. is a world of difference from manually pulling assets.<\/li>\n<li>The second thing I would do is set up something like StatsD. You don\u2019t need to allocate a person to spend several days \u2013 it\u2019s a Friday afternoon kind of thing. When you start tracking something \u2013 anything! &#8211; and put it up on the wall, that\u2019s when people start to get religion. We saw this ourselves with our product \u2013 once we put up some monitors showing some of the things coming from StatsD, like the number of times users were logging in and login failures, it was like watching an ultrasound monitor of your child. People started gathering around, big smiles on their faces \u2013 things were happening, and they felt this connection between what they were doing and their baby, out there in the big bad old world. Right away some of that empathy gap started to close.<\/li>\n<li>Third would come crash reporting. 
There\u2019s just no excuse not to put this into place \u2013 it takes like ten minutes, and it cuts out all that waste and thrash in troubleshooting and fuels an improvement culture.<\/li>\n<\/ul>\n<p>How do we communicate in the language of business? What I wish more engineering teams understood is how to communicate in the language of business. I\u2019m not asking developers to get an MBA in their off hours \u2013 but please TRY to frame things in terms of dollars, economic impact, or cost to the customer. Instead we say, this shiny new thing looks like it could be helpful.<\/p>\n<p>There\u2019s a reason why we often have to beg to get our priorities on the table from the business. We haven\u2019t earned the trust yet to get \u201ca seat at the table\u201d, plain and simple. We tend to be very maxed out, overwhelmed, and we\u2019re pretty cavalier with our estimates around development. This reflects technology \u2013 which is fast moving, there\u2019s so much to learn, and it\u2019s not in a stable state. But when engineers hem and haw about their estimates, or argue for prioritizing pet projects that are solely tech-driven, it makes us look unreliable as a partner. And we haven\u2019t learned yet to use facts and tie our decisions into saving money or getting an advantage in the market.<\/p>\n<p>Always keep this in mind \u2013 any business person can make the leap to dollars. But if you\u2019re making an argument and you are talking about code \u2013 that\u2019s a bridge too far. It\u2019s too much to expect them to make that jump from code to customer to dollars. So if you tell me you need React 16, that won\u2019t sell. But if you say 10% of your customers will have a better experience because of this new feature \u2013 any business person can look at that and make the connection, that could be 5,000 customers that are now going to have a better customer experience. 
You don\u2019t have to be Bill Gates to figure out that\u2019s a good move!<\/p>\n<p>Let\u2019s get down to brass tacks \u2013 how do I make this monitoring data actionable? We wouldn\u2019t think about putting planes in the air without a black box \u2013 some way of finding out after something goes wrong what happened, and why. That\u2019s what crash monitoring is, and it\u2019s incredibly actionable. You know the health of your deployment cycle, you can respond faster when changes are introduced that degrade that customer experience.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/errors.png\"><img decoding=\"async\" style=\"margin: 0px 10px 0px 0px;float: left\" title=\"errors\" src=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/31\/2019\/04\/errors_thumb.png\" alt=\"errors\" width=\"244\" height=\"141\" align=\"left\" border=\"0\" \/><\/a>Let\u2019s say you are seeing 100,000 errors a month. Once you group them by root cause, that overwhelming blizzard of problems gets cut down to size which is more common than you\u2019d think. You may have 1,000 distinct errors, but only 10 actual, honest-to-goodness bugs. Then you break it down by user, and that\u2019s when things really settle out. You might find that one user is using a crappy browser that\u2019s blocking half your scripts \u2013 that isn\u2019t an issue really. But then there\u2019s that one error that\u2019s happened only 500 times \u2013 but it\u2019s hitting 250 of your customers. That\u2019s a different story! So you\u2019re shifting your conversation already from how many errors you\u2019re seeing to the actual number of customers you\u2019re impacting \u2013 that\u2019s a more critical number, and one that everyone from your CEO down understands. And it\u2019s actionable. 
You can \u2013 and you should \u2013 take those top 2 or 3 bugs and drop them right into your dev queue for the next sprint.<\/p>\n<p>This isn\u2019t rocket science, and it isn\u2019t hard. Reducing technical debt and improving speed is just a matter of listening to what your own application is telling you. By nibbling away at the stuff that impacts your customers the most, you end up with a hyper reliable system and a fantastic experience, the kind that can change the entire game. One company we worked with started to just take the top bug or two off their list every sprint and it was dramatic \u2013 in 8 weeks, they reduced the number of impacted customers by 96%.<\/p>\n<p>Think about that \u2013 a 96% reduction in two months. Real user monitoring, APM, error and crash reporting \u2013 this stuff isn\u2019t rocket science. But think about how powerful a motivator those kinds of gains are for behavioral change in your company. Data like that is the golden ticket you need to get support from the very top levels of your company.<\/p>\n<p>One of my mentors was Rod Drury, who founded Xero right here in Wellington, New Zealand. He says all the time: \u201cIt\u2019s not the big that eat the small, it&#8217;s the fast that eat the slow\u201d. That\u2019s what DevOps is about &#8211; making your engineering team as reliably fast as possible. To get fast, you have to have a viable monitoring system that you pay close attention to. Monitoring is as close as you can get in this field to scratching your own itch.<\/p>\n<p>What about building versus buying a monitoring system? I\u2019ll admit that I\u2019m biased on the subject, running a SaaS-based monitoring business. But I do find it head-scratching when I talk to people who are trying to build their own. I ask them, \u201chow many people are you putting on this?\u201d And they tell me \u2013 oh, 4 people, say a six-month project. 
And then I say, \u201cwhat are their names?\u201d They look at me funny, and ask why \u2013 I tell them, \u201cI\u2019ve had 40 people working on this for 5 years \u2013 now I can fire them and hire your people!\u201d Back in 2005, it made total sense to roll your own, since so much of the stuff we use nowadays didn\u2019t exist. But those times have changed. Even self-hosting has its issues. Let\u2019s say you decide to go down the ELK stack route. Well, that means running a fairly large elastic instance, which is not a set-and-forget type system. It\u2019s a pain in the ass to manage, and it\u2019s not a trivial effort.<\/p>\n<p>To me it\u2019s also answering the wrong question. There\u2019s one question that should be the foundation for any decision an engineering team makes \u2013 does this create value for our customer? Is our customer magically better off because we made the decision to build our own? I think \u2013 for most companies \u2013 building a robust monitoring system probably has little or nothing to do with answering that question. It ends up being a distraction, and they spend far more to get less viable information.<\/p>\n<p>Etsy says \u201cif it moves, track it.\u201d Do you agree \u2013 should customers track everything? I\u2019m pragmatic on this \u2013 if you\u2019re small, tracking everything makes sense. Where it goes wrong is where the sheer amount of data clogs our decision making.<\/p>\n<p>So then you start to think about sampling data. However, what I often see is someone sitting in a chair, looking off into the distance and saying \u2013 \u201cyeah, I think about 10% of the data would give us enough\u201d. Rarely do we see people breaking out Excel and talking about what would be statistically significant &#8211; people tend to make gut calls. If you\u2019re tracking everything you possibly could with real user monitoring for example, it can be a real thicket \u2013 a nightmare, there\u2019s so many metric streams. 
You trip over your own shoelaces when something goes wrong \u2013 there\u2019s just so much detail, you can\u2019t find that needle in the haystack quickly. So you need both aggregate and raw data \u2013 to see high level aggregates and spot trends, but then be able to drill in and find out why something happened at the subatomic particle level. We still see too many tools out there that offer that great strategic view and it\u2019s a dead end \u2013 you know something happened, but you can\u2019t find out exactly what\u2019s wrong.<\/p>\n<p>Any closing thoughts? I never get tired of trying to tie everything back to the customer, to the end user experience. It\u2019s so imperative to everything you&#8217;re doing. There is literally no software written today for any reason other than providing value to humans. Even machine-to-machine IoT systems are still supporting a human being.<\/p>\n<p>Human beings are the center of the universe. But you wouldn\u2019t know that by the way we\u2019re treated by most of the software we write. Great engineers and great executives grasp that. They know that to humans, the interface is the system \u2013 everything else simply does not matter in the end. 
So they never let anything get in the way of improving that human, end user experience.<\/p>\n<p>References:<\/p>\n<ul>\n<li><a href=\"https:\/\/raygun.com\/\">https:\/\/raygun.com\/<\/a> &#8211; official Raygun site<\/li>\n<li><a href=\"https:\/\/hanselminutes.com\/421\/managing-errors-across-platforms-with-raygunio\">https:\/\/hanselminutes.com\/421\/managing-errors-across-platforms-with-raygunio<\/a> &#8211;\u00a0 May 22, 2014 podcast interview with Scott Hanselman and John-Daniel Trask on Raygun<\/li>\n<li><a href=\"https:\/\/channel9.msdn.com\/Events\/dotnetConf\/2015\/Handling-billions-of-exceptions-with-NET--Raygunio\">https:\/\/channel9.msdn.com\/Events\/dotnetConf\/2015\/Handling-billions-of-exceptions-with-NET&#8211;Raygunio<\/a> &#8211; March 5, 2015 Channel9 Interview with John-Daniel Trask on how Raygun handles billions of exceptions<\/li>\n<li><a href=\"https:\/\/channel9.msdn.com\/Events\/TechEd\/NewZealand\/2013\/DEV302\">https:\/\/channel9.msdn.com\/Events\/TechEd\/NewZealand\/2013\/DEV302<\/a> &#8211; TechEd New Zealand 2013, \u201cDevOps at LightSpeed, lessons we learned from building a Raygun\u201d, 9\/6\/2013, by Jeremy Boyd, John-Daniel Trask<\/li>\n<li>Dominos Pizza story and Raygun &#8211; <a href=\"https:\/\/qz.com\/938620\/dominos-dpz-stock-has-outperformed-google-goog-facebook-fb-apple-aapl-and-amazon-amzn-this-decade\/\">https:\/\/qz.com\/938620\/dominos-dpz-stock-has-outperformed-google-goog-facebook-fb-apple-aapl-and-amazon-amzn-this-decade\/<\/a><\/li>\n<\/ul>\n<hr \/>\n<p><a href=\"https:\/\/blogs.msdn.com\/b\/premier_developer\/archive\/2014\/09\/15\/welcome.aspx\"><strong>Premier Support for Developers<\/strong><\/a> provides strategic technology guidance, critical support coverage, and a range of essential services to help teams optimize development lifecycles and improve software quality. 
Contact your Application Development Manager (ADM) or <a href=\"https:\/\/blogs.msdn.microsoft.com\/premier_developer\/contact-us\/\">email us<\/a> to learn more about what we can do for you.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>App Dev Manager Dave Harrison talks with John-Daniel Trask, co-founder and CEO of Raygun, about the adoption of DevOps.  &#8220;Human beings are the center of the universe. But you wouldn\u2019t know that by the way we\u2019re treated by most of the software we write. Great engineers and great executives grasp that.  They know that to humans, the interface is the system \u2013 everything else simply does not matter in the end. So they never let anything get in the way of improving that human, end user experience.&#8221;<\/p>\n","protected":false},"author":582,"featured_media":27431,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[22],"tags":[21,3],"class_list":["post-26445","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devops","tag-devops","tag-team"],"acf":[],"blog_post_summary":"<p>App Dev Manager Dave Harrison talks with John-Daniel Trask, co-founder and CEO of Raygun, about the adoption of DevOps.  &#8220;Human beings are the center of the universe. But you wouldn\u2019t know that by the way we\u2019re treated by most of the software we write. Great engineers and great executives grasp that.  They know that to humans, the interface is the system \u2013 everything else simply does not matter in the end. 
So they never let anything get in the way of improving that human, end user experience.&#8221;<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/26445","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/users\/582"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/comments?post=26445"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/26445\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media\/27431"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media?parent=26445"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/categories?post=26445"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/tags?post=26445"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}